Can't get pytesseract and cv2 to accurately OCR a screenshot from a phone (Python 3.8)

I am writing a function that reads a screenshot taken from Wordle and produces a list of the possible words. I am struggling to understand why I cannot get accurate OCR results using cv2 and pytesseract. My first attempt (no pre-processing):

[Image: example Wordle screenshot]

import cv2
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

img = cv2.imread(r'C:\Users\Admin\OneDrive\Pictures\attemp1.jpeg')
text = pytesseract.image_to_string(img)
print(text)

Output:

11:29 a 4G

=@® #£Wordle dh %

DanwE
Ptalijn|r

aA @nytimes.com @ i

Cc oo gf og Oo

In my second attempt I wanted to see if I could use some pre-processing to at least get some of the letters correct (shamelessly stolen from another post):

import cv2
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread(r'C:\Users\Admin\OneDrive\Pictures\attemp1.jpeg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Morph open to remove noise and invert image
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)
invert = 255 - opening

# Perform text extraction
data = pytesseract.image_to_string(invert, lang='eng', config='--psm 3')
print(data)
cv2.imshow('invert', invert)
cv2.waitKey()

Output:

11:29 o 46@

=@® Wordle &%

a Oi

° 7 oi , ie ‘ ° ° Bi
B oi ° ‘ ° oi .
ENTER Z x c Vv

AA w nytimes.com @

< ho mom ©

I feel like the OCR is trying to turn everything into a word instead of treating each character as its own 'word'; however, I was unable to find any config options that allow for this.
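One direction that may be worth trying, as a minimal sketch rather than a confirmed fix: crop each tile and OCR it on its own with Tesseract's page segmentation mode 10, which treats the image as a single character. The names `grid_boxes`, `ocr_tile` and `SINGLE_CHAR_CONFIG` are hypothetical, the grid rectangle has to be measured from the actual screenshot, and the character whitelist may be ignored by some Tesseract versions.

```python
import string

# --psm 10 asks Tesseract to treat the image as a single character.
# The whitelist restricts results to capital letters (version dependent).
SINGLE_CHAR_CONFIG = "--psm 10 -c tessedit_char_whitelist=" + string.ascii_uppercase


def grid_boxes(left, top, right, bottom, rows=6, cols=5):
    """Split a rectangle into rows x cols cells, row by row, as (x0, y0, x1, y1)."""
    cell_w = (right - left) / cols
    cell_h = (bottom - top) / rows
    boxes = []
    for r in range(rows):
        for c in range(cols):
            boxes.append((
                round(left + c * cell_w), round(top + r * cell_h),
                round(left + (c + 1) * cell_w), round(top + (r + 1) * cell_h),
            ))
    return boxes


def ocr_tile(image, box, margin=0.15):
    """OCR one tile of a BGR image; returns '' when nothing is recognised."""
    # Imported lazily so grid_boxes can be used without OpenCV installed.
    import cv2
    import pytesseract

    x0, y0, x1, y1 = box
    # Trim the tile border so it is not read as part of the letter.
    dx, dy = int((x1 - x0) * margin), int((y1 - y0) * margin)
    tile = image[y0 + dy:y1 - dy, x0 + dx:x1 - dx]
    gray = cv2.cvtColor(tile, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # The background covers most of the tile; if it came out dark, invert so
    # the letter ends up dark on a light background.
    if binary.mean() < 127:
        binary = cv2.bitwise_not(binary)
    return pytesseract.image_to_string(binary, config=SINGLE_CHAR_CONFIG).strip()


# Usage, assuming the letter grid spans a measured rectangle (placeholder values):
# img = cv2.imread(path_to_screenshot)
# letters = [ocr_tile(img, box) for box in grid_boxes(60, 300, 660, 1020)]
```

Cropping first also keeps the status bar, keyboard and browser chrome out of the OCR input entirely, which is where most of the noise in the outputs above appears to come from.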

In a perfect world I would be able to return a nested list with character, position and colour, though I am just struggling with accuracy at this point. This is my first 'computer vision' task, so I apologise if I have overlooked something simple. Thank you.
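To make the desired output concrete, here is a rough sketch of how the nested list could be assembled once each tile has a letter and a sampled colour. Everything here is an assumption for illustration: the names `REFERENCE_BGR`, `classify_colour`, `tile_colour` and `build_rows` are hypothetical, and the reference colours are guesses for the light theme that should be replaced with values sampled from the real screenshot.

```python
# Reference tile colours in OpenCV's BGR order. Assumed values for the light
# theme; sample the actual screenshot and adjust them.
REFERENCE_BGR = {
    "green": (100, 170, 106),
    "yellow": (88, 180, 201),
    "grey": (126, 124, 120),
    "empty": (255, 255, 255),
}


def classify_colour(bgr):
    """Return the name of the reference colour closest to a (b, g, r) triple."""
    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(bgr, REFERENCE_BGR[name]))
    return min(REFERENCE_BGR, key=distance)


def tile_colour(image, box):
    """Classify a tile from the mean colour of a patch near its top-left corner."""
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    # Letters sit in the centre of a tile, so a corner patch is mostly background.
    patch = image[y0 + h // 10:y0 + h // 4, x0 + w // 10:x0 + w // 4]
    b, g, r = patch.reshape(-1, 3).mean(axis=0)
    return classify_colour((b, g, r))


def build_rows(letters, colours, cols=5):
    """Group flat per-tile results into rows of [letter, column, colour]."""
    rows = []
    for start in range(0, len(letters), cols):
        end = min(start + cols, len(letters))
        rows.append([[letters[i], i - start, colours[i]] for i in range(start, end)])
    return rows
```

`tile_colour` expects the NumPy array returned by `cv2.imread` and a tile box given as `(x0, y0, x1, y1)`; `classify_colour` and `build_rows` are plain Python and can be checked without an image.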



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
