Tesseract OCR doesn't recognize some symbols, such as the circumflex (^)
I've been trying to use Tesseract to recognize text that contains the circumflex (^), in other words the power symbol. Tesseract has never recognized it in any of my documents. I tried adding the Greek language in case the symbol is supported there, but that didn't work. I've also gone through the official issues posted on GitHub and found nothing.
Is there any workaround? Any help is greatly appreciated!
Solution 1:[1]
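There is a 'language pack' for equations, and the symbol may be included there. The file is named 'equ.traineddata'. I got it from here: https://github.com/tesseract-ocr/tessdata. Put the file in your tessdata directory and add 'equ' to the languages you pass with -l, for example 'eng+equ'.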
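As a rough sketch of how the equation pack could be combined with English when calling the tesseract command-line tool from Python: the helper names `build_tesseract_command` and `recognize` are hypothetical, and so are the image path and tessdata directory in the usage note. It assumes the `tesseract` executable is on your PATH and that 'equ.traineddata' has already been downloaded. Whether the ^ is actually recognized this way is untested here.

```python
import subprocess
from typing import List, Optional


def build_tesseract_command(
    image_path: str,
    languages: List[str],
    tessdata_dir: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a tesseract CLI call (hypothetical helper)."""
    # Using "stdout" as the output base makes tesseract print the text
    # instead of writing it to a .txt file.
    # Several traineddata files are combined with "+", e.g. "eng+equ".
    command = ["tesseract", image_path, "stdout", "-l", "+".join(languages)]
    if tessdata_dir is not None:
        # Directory that contains eng.traineddata and equ.traineddata.
        command += ["--tessdata-dir", tessdata_dir]
    return command


def recognize(
    image_path: str,
    languages: List[str],
    tessdata_dir: Optional[str] = None,
) -> str:
    """Run tesseract and return the recognized text (hypothetical helper)."""
    result = subprocess.run(
        build_tesseract_command(image_path, languages, tessdata_dir),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout
```

Calling `recognize("formula.png", ["eng", "equ"], "/path/to/tessdata")` would then run the equivalent of `tesseract formula.png stdout -l eng+equ --tessdata-dir /path/to/tessdata`. If the ^ is still missing from the output, the pack most likely does not cover it for your font or image quality.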
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Matevz |
