Tesseract OCR Can't create .traineddata
The Problem:
I followed the step-by-step tutorial provided here to train Tesseract OCR on a new font, but in steps 5 and 6 not all of the required files are created.
What I did:
My image file is: en.va.exp0.tif
Step 1: Creating the .box file + correcting wrongly identified characters
tesseract en.va.exp0.jpg en.va.exp0 batch.nochop makebox
Step 2: Creating .tr file
tesseract en.va.exp0.tif en.va.exp0 box.train
Step 3: Extracting the charset from the box files
unicharset_extractor en.va.exp0.box
Step 4: Create font_properties file
echo "va 0 0 1 0 0" > font_properties
Step 5: Training the data
mftraining -F font_properties -U unicharset -O en.unicharset en.va.exp0.tr
Step 6: Training the data (character normalization)
cntraining en.va.exp0.tr
As far as I know, step 5 should create four files: shapetable, inttemp, pffmtable, and normproto. However, only the shapetable file is created. Because of that, step 6 doesn't work either (I think it simply does nothing).
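To make it easier to see which of the four files named above exist after running steps 5 and 6, a small shell check along these lines could be used. This is only a sketch: `check_training_outputs` is a hypothetical helper written for this question, not part of Tesseract, and it assumes the training tools write their output into the directory you pass in.

```shell
#!/bin/sh
# Hypothetical helper (not a Tesseract tool): reports which of the four
# files named in the question are present in the given directory.
check_training_outputs() {
    dir="${1:-.}"   # directory to inspect, defaults to the current one
    status=0
    for f in shapetable inttemp pffmtable normproto; do
        if [ -e "$dir/$f" ]; then
            echo "$f: present"
        else
            echo "$f: missing"
            status=1
        fi
    done
    # Non-zero exit status if at least one file is missing
    return "$status"
}
```

It can be called from the training directory as `check_training_outputs . || echo "some outputs are missing"`.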
Materials:
explorer-screenshot-before.jpg
If more explanation or material is needed, I'll add it. Thanks in advance.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
