PDFix OCR Tesseract enumeration types.

Enumerations
enum	OcrTesseractPageSegType { kOcrSegOSDOnly, kOcrSegAutoOSD, kOcrSegAutoOnly, kOcrSegAuto, kOcrSegSingleColumn, kOcrSegSingleBlockVertText, kOcrSegSingleBlock, kOcrSegSingleLine, kOcrSegSingleWord, kOcrSegCircleWord, kOcrSegSingleChar, kOcrSegSparseText, kOcrSegSparseTextOSD, kOcrSegRawLine }
	OcrTesseractPageSegType.

enum	OcrTesseractEngineType { kOcrTesseractOnly, kOcrTesseractLSTMOnlly, kOcrTesseractLSTMCombined, kOcrTesseractDefault }
	OcrTesseractEngineType.

Detailed Description

PDFix OCR Tesseract enumeration types.

Enumeration Type Documentation

OcrTesseractEngineType.

When Tesseract is initialized, we can choose to instantiate, load, and run only the legacy Tesseract engine, only the LSTM recognizer, or both together with the combiner.

Enumerator
kOcrTesseractOnly	Run Tesseract only; fastest, but deprecated.
kOcrTesseractLSTMOnlly	Run just the LSTM line recognizer.
kOcrTesseractLSTMCombined	Run the LSTM recognizer, but allow fallback to Tesseract when things get difficult.
kOcrTesseractDefault	Specify this mode to indicate that any of the above modes should be automatically inferred from the variables in the language-specific config.
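
The enumerator descriptions above can be turned into small helper checks. The following is a minimal sketch, not part of the PDFix API: the enum is redeclared locally so the example compiles on its own (real code would include the PDFix OCR Tesseract header instead), and the helper names `EngineTypeName` and `MayUseLegacyEngine` are hypothetical.

```cpp
#include <cassert>
#include <string>

// Local mirror of the documented enum, for illustration only.
// Real code would use the declaration from the PDFix OCR Tesseract header.
enum OcrTesseractEngineType {
  kOcrTesseractOnly,
  kOcrTesseractLSTMOnlly,
  kOcrTesseractLSTMCombined,
  kOcrTesseractDefault
};

// Hypothetical helper: returns the enumerator name, e.g. for logging.
std::string EngineTypeName(OcrTesseractEngineType type) {
  switch (type) {
    case kOcrTesseractOnly:         return "kOcrTesseractOnly";
    case kOcrTesseractLSTMOnlly:    return "kOcrTesseractLSTMOnlly";
    case kOcrTesseractLSTMCombined: return "kOcrTesseractLSTMCombined";
    case kOcrTesseractDefault:      return "kOcrTesseractDefault";
  }
  assert(false && "unhandled OcrTesseractEngineType");
  return "unknown";
}

// Hypothetical helper: true when the mode may end up running the legacy
// Tesseract engine. kOcrTesseractDefault is resolved from the
// language-specific config, so it may select any of the other modes.
bool MayUseLegacyEngine(OcrTesseractEngineType type) {
  switch (type) {
    case kOcrTesseractOnly:         return true;   // legacy engine only
    case kOcrTesseractLSTMOnlly:    return false;  // LSTM line recognizer only
    case kOcrTesseractLSTMCombined: return true;   // LSTM with legacy fallback
    case kOcrTesseractDefault:      return true;   // inferred, so possibly legacy
  }
  assert(false && "unhandled OcrTesseractEngineType");
  return false;
}
```

A caller might use `MayUseLegacyEngine` to decide whether language data for the legacy engine needs to be available before OCR starts; whether that check is needed depends on the language data in use.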

OcrTesseractPageSegType.

Specifies various OCR construction flags.

Possible modes for page layout analysis.

Enumerator
kOcrSegOSDOnly	Orientation and script detection only.
kOcrSegAutoOSD	Automatic page segmentation with orientation and script detection (OSD).
kOcrSegAutoOnly	Automatic page segmentation, but no OSD or OCR.
kOcrSegAuto	Fully automatic page segmentation, but no OSD.
kOcrSegSingleColumn	Assume a single column of text of variable sizes.
kOcrSegSingleBlockVertText	Assume a single uniform block of vertically aligned text.
kOcrSegSingleBlock	Assume a single uniform block of text.
kOcrSegSingleLine	Treat the image as a single text line.
kOcrSegSingleWord	Treat the image as a single word.
kOcrSegCircleWord	Treat the image as a single word in a circle.
kOcrSegSingleChar	Treat the image as a single character.
kOcrSegSparseText	Find as much text as possible in no particular order.
kOcrSegSparseTextOSD	Sparse text with orientation and script detection (OSD).
kOcrSegRawLine	Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
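
As a rough illustration of how the modes above might be chosen, the sketch below maps a coarse description of the input image to a page segmentation mode. It is an assumption-laden example rather than PDFix API: the enum is redeclared locally so the code compiles on its own, and `InputKind`, `SuggestSegType`, `PerformsOsd`, and `ProducesText` are hypothetical names introduced here.

```cpp
#include <cassert>

// Local mirror of the documented enum, for illustration only.
// Real code would use the declaration from the PDFix OCR Tesseract header.
enum OcrTesseractPageSegType {
  kOcrSegOSDOnly, kOcrSegAutoOSD, kOcrSegAutoOnly, kOcrSegAuto,
  kOcrSegSingleColumn, kOcrSegSingleBlockVertText, kOcrSegSingleBlock,
  kOcrSegSingleLine, kOcrSegSingleWord, kOcrSegCircleWord,
  kOcrSegSingleChar, kOcrSegSparseText, kOcrSegSparseTextOSD, kOcrSegRawLine
};

// Hypothetical classification of what the input image contains.
enum class InputKind { kFullPage, kTextLine, kWord, kCharacter, kScatteredText };

// Hypothetical helper: suggests a mode from the input kind and from whether
// orientation and script detection (OSD) is wanted.
OcrTesseractPageSegType SuggestSegType(InputKind kind, bool detect_orientation) {
  switch (kind) {
    case InputKind::kFullPage:
      return detect_orientation ? kOcrSegAutoOSD : kOcrSegAuto;
    case InputKind::kTextLine:
      return kOcrSegSingleLine;
    case InputKind::kWord:
      return kOcrSegSingleWord;
    case InputKind::kCharacter:
      return kOcrSegSingleChar;
    case InputKind::kScatteredText:
      return detect_orientation ? kOcrSegSparseTextOSD : kOcrSegSparseText;
  }
  assert(false && "unhandled InputKind");
  return kOcrSegAuto;
}

// True for the modes whose description mentions OSD being performed.
bool PerformsOsd(OcrTesseractPageSegType mode) {
  return mode == kOcrSegOSDOnly || mode == kOcrSegAutoOSD ||
         mode == kOcrSegSparseTextOSD;
}

// False for the two modes documented as not running OCR.
bool ProducesText(OcrTesseractPageSegType mode) {
  return mode != kOcrSegOSDOnly && mode != kOcrSegAutoOnly;
}
```

For example, `SuggestSegType(InputKind::kFullPage, true)` yields `kOcrSegAutoOSD`, while a cropped image of one text line maps to `kOcrSegSingleLine`. The mapping is deliberately coarse; modes such as `kOcrSegSingleColumn` or `kOcrSegCircleWord` would need more specific knowledge of the layout.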