Click on Pytesseract string based on list

Basically, I already have all the string extraction working, but now I'm stuck on how to find a specific string in the image based on a list and click on its coordinates.

My output goes like this:

TORONTO
MONTREAL
CALGARY
VANCOUVER
OTTAWA
QUEBEC

My list goes like this:

QUEBEC
MONTREAL
OTTAWA
TORONTO
CALGARY
VANCOUVER

I would click on QUEBEC if it is present. If not, I'd click on MONTREAL if it is present, and so on down the list.
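To make the intended priority logic concrete, here is a minimal sketch of the selection step alone, with no OCR or clicking involved. The function name `first_match` is hypothetical, and it assumes each city name appears on its own line of the OCR output, as in the sample above.

```python
def first_match(priority, ocr_text):
    """Return the first entry of `priority` found as a line of `ocr_text`, or None."""
    # Normalize OCR lines: strip whitespace, ignore blanks, compare case-insensitively.
    found = {line.strip().upper() for line in ocr_text.splitlines() if line.strip()}
    for name in priority:
        if name.upper() in found:
            return name
    return None


# The list order defines the priority.
priority = ["QUEBEC", "MONTREAL", "OTTAWA", "TORONTO", "CALGARY", "VANCOUVER"]

# Sample OCR output copied from the question.
ocr_text = "TORONTO\nMONTREAL\nCALGARY\nVANCOUVER\nOTTAWA\nQUEBEC\n"

print(first_match(priority, ocr_text))
```

With the sample output this selects QUEBEC. If QUEBEC were missing from the OCR text, it would fall through to MONTREAL, and so on.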

I've searched far and wide on Google, and I can't find anything helpful on working with strings extracted by pytesseract or on using pytesseract's coordinates.

Here is my code:

import pytesseract
import cv2
from PIL import ImageGrab
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

f = open("output.txt", "w")

bbox = (240, 390, 710, 1250)
img = ImageGrab.grab(bbox)
img.save("Capture.jpg")
img.close()

# Grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread('Capture.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Morph open to remove noise
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)

# Perform text extraction
data = pytesseract.image_to_string(opening, lang='eng', config='--psm 6')
print(data)

print(pytesseract.image_to_boxes(opening), file=f)

f.close()

#cv2.imshow('blur', blur)
#cv2.imshow('thresh', thresh)
cv2.imshow('opening', opening)
cv2.waitKey()

Any help is greatly appreciated!
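One possible direction, offered as a sketch rather than a definitive answer: `pytesseract.image_to_data` returns word-level boxes (unlike `image_to_boxes`, which reports per-character boxes), so each recognized word can be mapped to a screen position. The helper names `find_click_point` and `click_first_match` are hypothetical, and `pyautogui` is an assumption, since the original code does not use any clicking library. Box coordinates are relative to the captured image, so the top-left corner of `bbox` is added back to get screen coordinates.

```python
def find_click_point(data, priority, offset=(0, 0)):
    """Return (name, x, y) for the highest-priority word found, else None.

    `data` is a dict shaped like the result of
    pytesseract.image_to_data(..., output_type=Output.DICT): parallel lists
    under 'text', 'left', 'top', 'width' and 'height'.
    `offset` is the top-left corner of the captured region on screen.
    """
    centers = {}
    for i, word in enumerate(data["text"]):
        word = word.strip().upper()
        if word and word not in centers:
            # Center of the word box, shifted into screen coordinates.
            x = offset[0] + data["left"][i] + data["width"][i] // 2
            y = offset[1] + data["top"][i] + data["height"][i] // 2
            centers[word] = (x, y)
    for name in priority:
        if name.upper() in centers:
            return (name, *centers[name.upper()])
    return None


def click_first_match(image, priority, bbox):
    """Hypothetical glue code: OCR `image`, then click the best match."""
    # Imported lazily so find_click_point can be used without these packages.
    import pyautogui  # assumption: not part of the original code
    import pytesseract
    from pytesseract import Output

    data = pytesseract.image_to_data(
        image, lang="eng", config="--psm 6", output_type=Output.DICT
    )
    hit = find_click_point(data, priority, offset=(bbox[0], bbox[1]))
    if hit is not None:
        _, x, y = hit
        pyautogui.click(x, y)
    return hit
```

With the code from the question, this could be called as `click_first_match(opening, priority, bbox)`, where `priority` is the list of city names in the desired order. Keeping the lookup in `find_click_point` separate from the OCR and the click makes it possible to check the coordinate math without a screen.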



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
