Legacy OCR Language Packs

The language packs listed below are already trained. Download the .traineddata file you need and copy it to the WordCaptureX installation path.

If you need training for a specific font, contact us for details.
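The download-and-copy step described above could be scripted roughly as follows. This is only a minimal sketch: the function name `install_language_pack` and the example paths are hypothetical, and the actual WordCaptureX installation path depends on your machine.

```python
import shutil
from pathlib import Path


def install_language_pack(traineddata, install_dir):
    """Copy a downloaded .traineddata file into the given installation folder.

    Both arguments are paths chosen by the caller; nothing here is specific
    to WordCaptureX beyond the .traineddata extension named on this page.
    """
    traineddata = Path(traineddata)
    install_dir = Path(install_dir)

    # Refuse anything that is not a language pack file.
    if traineddata.suffix != ".traineddata":
        raise ValueError(f"not a .traineddata file: {traineddata.name}")
    if not traineddata.is_file():
        raise FileNotFoundError(traineddata)
    if not install_dir.is_dir():
        raise NotADirectoryError(install_dir)

    # Keep the original file name so the pack is found under the same name.
    target = install_dir / traineddata.name
    shutil.copy2(traineddata, target)
    return target


# Example call (hypothetical paths, adjust to your own system):
# install_language_pack(r"C:\Downloads\English1.traineddata",
#                       r"C:\Program Files\WordCaptureX")
```

Copying into a folder under Program Files usually requires administrator rights on Windows, so the script may need to be run from an elevated prompt.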


Custom Trained Tesseract files:

Useful for scraping the most common web and Windows fonts.

English1
Verdana (Regular, Bold), Georgia (Regular, Bold), Times New Roman (Regular, Bold), Trebuchet MS (Regular, Bold), Courier New (Regular, Bold), Tahoma (Regular, Bold), MS Shell Dlg (Regular, Bold)

English2
Verdana (Regular, Bold, Italic), Georgia (Regular, Bold, Italic), Times New Roman (Regular, Bold, Italic), Trebuchet MS (Regular, Bold, Italic), Courier New (Regular, Bold, Italic), Tahoma (Regular, Bold, Italic), MS Shell Dlg (Regular, Bold, Italic)

English3
Verdana (Regular, Bold), Georgia (Regular, Bold), Times New Roman (Regular, Bold), Trebuchet MS (Regular, Bold), Courier New (Regular, Bold), Tahoma (Regular, Bold), MS Shell Dlg (Regular, Bold), Calibri (Regular, Bold) - other font sizing technology

English4
Verdana (Regular, Bold, Italic), Georgia (Regular, Bold, Italic), Times New Roman (Regular, Bold, Italic), Trebuchet MS (Regular, Bold, Italic), Courier New (Regular, Bold, Italic), Tahoma (Regular, Bold, Italic), MS Shell Dlg (Regular, Bold, Italic) - other font sizing technology

English5
Verdana (Regular, Bold, Italic), Georgia (Regular, Bold, Italic), Times New Roman (Regular, Bold, Italic), Trebuchet MS (Regular, Bold, Italic), Courier New (Regular, Bold, Italic), Tahoma (Regular, Bold, Italic), MS Shell Dlg (Regular, Bold, Italic), Arial (Regular, Bold, Italic), Courier (Regular, Bold, Italic), Calibri (Regular, Bold, Italic)

Language Independent Training
Verdana (Regular, Bold, Italic), Georgia (Regular, Bold, Italic), Times New Roman (Regular, Bold, Italic), Trebuchet MS (Regular, Bold, Italic), Courier New (Regular, Bold, Italic), Tahoma (Regular, Bold, Italic), MS Shell Dlg (Regular, Bold, Italic), Arial (Regular, Bold, Italic), Courier (Regular, Bold, Italic), Komika Parch (Regular, Bold, Italic), Myriad Pro (Regular, Bold, Italic), Komika Text (Regular, Bold, Italic), Calibri (Regular, Bold, Italic)


Standard Tesseract Training files:

Chinese simplified
Chinese traditional
Danish
Dutch
English
Finnish
French
German
Greek
Hungarian
Indonesian
Italian
Japanese
Latvian
Norwegian
Polish
Portuguese
Romanian
Russian
Serbian
Slovak
Slovenian
Spanish
Swedish
Turkish
Ukrainian
Vietnamese

