PDFix OCR Tesseract enumeration types.

| Type | Name and enumerators |
|---|---|
| enum | OcrTesseractPageSegType { kOcrSegOSDOnly, kOcrSegAutoOSD, kOcrSegAutoOnly, kOcrSegAuto, kOcrSegSingleColumn, kOcrSegSingleBlockVertText, kOcrSegSingleBlock, kOcrSegSingleLine, kOcrSegSingleWord, kOcrSegCircleWord, kOcrSegSingleChar, kOcrSegSparseText, kOcrSegSparseTextOSD, kOcrSegRawLine } |
| enum | OcrTesseractEngineType { kOcrTesseractOnly, kOcrTesseractLSTMOnlly, kOcrTesseractLSTMCombined, kOcrTesseractDefault } |

◆ OcrTesseractEngineType
OcrTesseractEngineType.
When Tesseract is initialized, you can choose to load and run only the legacy Tesseract engine, only the LSTM recognizer, or both combined.
| Enumerator | Description |
|---|---|
| kOcrTesseractOnly | Run Tesseract only - fastest; deprecated. |
| kOcrTesseractLSTMOnlly | Run just the LSTM line recognizer. |
| kOcrTesseractLSTMCombined | Run the LSTM recognizer, but allow fallback to Tesseract when things get difficult. |
| kOcrTesseractDefault | Specify this mode to indicate that any of the above modes should be automatically inferred from the variables in the language-specific config. |
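
As a rough illustration of how these engine modes relate to upstream Tesseract, the sketch below maps each enumerator to the numeric value accepted by the Tesseract command-line option `--oem` (0 = legacy only, 1 = LSTM only, 2 = legacy + LSTM, 3 = default). The enum is redeclared locally only so that the snippet compiles without the SDK headers. `ToTesseractOemValue` is a hypothetical helper, not part of PDFix, and the one-to-one correspondence is an assumption based on the enumerator descriptions above.

```cpp
// Local mirror of the documented enum so this sketch compiles on its own.
// In real code, include the PDFix OCR Tesseract header instead.
enum OcrTesseractEngineType {
  kOcrTesseractOnly,
  kOcrTesseractLSTMOnlly,  // spelling kept exactly as listed in this reference
  kOcrTesseractLSTMCombined,
  kOcrTesseractDefault
};

// Hypothetical helper (not a PDFix function): returns the value that the
// Tesseract CLI option --oem uses for the equivalent engine mode.
// ASSUMPTION: each enumerator corresponds to exactly one --oem mode.
constexpr int ToTesseractOemValue(OcrTesseractEngineType type) {
  switch (type) {
    case kOcrTesseractOnly:         return 0;  // legacy engine only
    case kOcrTesseractLSTMOnlly:    return 1;  // LSTM recognizer only
    case kOcrTesseractLSTMCombined: return 2;  // LSTM with legacy fallback
    case kOcrTesseractDefault:      return 3;  // inferred from configuration
  }
  return 3;  // unknown value: fall back to the default mode
}
```

A caller could use `ToTesseractOemValue(kOcrTesseractLSTMOnlly)` when comparing PDFix results against a run of the standalone `tesseract` tool with the same engine mode. An explicit `switch` is used so the mapping does not depend on the numeric values of the SDK enumerators.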
◆ OcrTesseractPageSegType
OcrTesseractPageSegType.
Possible modes for page layout analysis.
| Enumerator | Description |
|---|---|
| kOcrSegOSDOnly | Orientation and script detection (OSD) only. |
| kOcrSegAutoOSD | Automatic page segmentation with orientation and script detection (OSD). |
| kOcrSegAutoOnly | Automatic page segmentation, but no OSD or OCR. |
| kOcrSegAuto | Fully automatic page segmentation, but no OSD. |
| kOcrSegSingleColumn | Assume a single column of text of variable sizes. |
| kOcrSegSingleBlockVertText | Assume a single uniform block of vertically aligned text. |
| kOcrSegSingleBlock | Assume a single uniform block of text. |
| kOcrSegSingleLine | Treat the image as a single text line. |
| kOcrSegSingleWord | Treat the image as a single word. |
| kOcrSegCircleWord | Treat the image as a single word in a circle. |
| kOcrSegSingleChar | Treat the image as a single character. |
| kOcrSegSparseText | Find as much text as possible in no particular order. |
| kOcrSegSparseTextOSD | Sparse text with orientation and script detection. |
| kOcrSegRawLine | Treat the image as a single text line, bypassing hacks that are Tesseract-specific. |
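
To make the choice between these modes more concrete, here is a minimal sketch that picks a page segmentation mode from what the caller already knows about the input image, and reports whether a mode involves orientation and script detection. The enum is redeclared locally so the snippet compiles without the SDK headers. `InputKind`, `ChoosePageSegType`, and `UsesOsd` are hypothetical names introduced for illustration and are not part of PDFix; the mapping simply follows the enumerator descriptions above.

```cpp
// Local mirror of the documented enum so this sketch compiles on its own.
// In real code, include the PDFix OCR Tesseract header instead.
enum OcrTesseractPageSegType {
  kOcrSegOSDOnly,
  kOcrSegAutoOSD,
  kOcrSegAutoOnly,
  kOcrSegAuto,
  kOcrSegSingleColumn,
  kOcrSegSingleBlockVertText,
  kOcrSegSingleBlock,
  kOcrSegSingleLine,
  kOcrSegSingleWord,
  kOcrSegCircleWord,
  kOcrSegSingleChar,
  kOcrSegSparseText,
  kOcrSegSparseTextOSD,
  kOcrSegRawLine
};

// Hypothetical description of the input image (not a PDFix type).
enum class InputKind {
  kFullPage,
  kTextBlock,
  kTextLine,
  kWord,
  kCharacter,
  kScatteredText
};

// Hypothetical helper: chooses the segmentation mode whose documented
// description best matches the kind of input.
constexpr OcrTesseractPageSegType ChoosePageSegType(InputKind kind) {
  switch (kind) {
    case InputKind::kFullPage:      return kOcrSegAuto;
    case InputKind::kTextBlock:     return kOcrSegSingleBlock;
    case InputKind::kTextLine:      return kOcrSegSingleLine;
    case InputKind::kWord:          return kOcrSegSingleWord;
    case InputKind::kCharacter:     return kOcrSegSingleChar;
    case InputKind::kScatteredText: return kOcrSegSparseText;
  }
  return kOcrSegAuto;  // unknown input kind: use fully automatic segmentation
}

// Hypothetical helper: true for the modes documented as performing
// orientation and script detection (OSD).
constexpr bool UsesOsd(OcrTesseractPageSegType mode) {
  return mode == kOcrSegOSDOnly || mode == kOcrSegAutoOSD ||
         mode == kOcrSegSparseTextOSD;
}
```

For example, an image cropped to a single form field would map to `ChoosePageSegType(InputKind::kTextLine)`, that is `kOcrSegSingleLine`, while a scanned page of unknown orientation might call for `kOcrSegAutoOSD` instead of the default returned here.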