PDFix SDK  6.20.0
OcrTesseract Enumerations

Enumerations

enum  OcrTesseractPageSegType {
  kOcrSegOSDOnly , kOcrSegAutoOSD , kOcrSegAutoOnly , kOcrSegAuto ,
  kOcrSegSingleColumn , kOcrSegSingleBlockVertText , kOcrSegSingleBlock , kOcrSegSingleLine ,
  kOcrSegSingleWord , kOcrSegCircleWord , kOcrSegSingleChar , kOcrSegSparseText ,
  kOcrSegSparseTextOSD , kOcrSegRawLine
}
 OcrTesseractPageSegType.
 
enum  OcrTesseractEngineType { kOcrTesseractOnly , kOcrTesseractLSTMOnlly , kOcrTesseractLSTMCombined , kOcrTesseractDefault }
 OcrTesseractEngineType.
 

Detailed Description

PDFix OCR Tesseract enumeration types.

Enumeration Type Documentation

◆ OcrTesseractEngineType

OcrTesseractEngineType.

When Tesseract is initialized, the caller can choose to instantiate, load, and run only the legacy Tesseract engine, only the LSTM recognizer, or both together with the combiner.

Enumerator
kOcrTesseractOnly 

Run the legacy Tesseract engine only. Fastest, but deprecated.

kOcrTesseractLSTMOnlly 

Run just the LSTM line recognizer.

kOcrTesseractLSTMCombined 

Run the LSTM recognizer, but allow fallback to Tesseract when things get difficult.

kOcrTesseractDefault 

Specify this mode to indicate that one of the above modes should be inferred automatically from the variables in the language-specific configuration.
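As an illustration of how the four engine modes relate to each other, the following minimal sketch maps them to the numeric values that Tesseract itself uses for its `--oem` option (0 to 3). The enumeration is redeclared locally so the snippet compiles without the PDFix headers. `ToTesseractOem` and `DescribeEngine` are hypothetical helpers, not part of the PDFix SDK, and the correspondence to Tesseract's engine modes is an assumption based on the enumerator names and descriptions above.

```cpp
#include <cassert>
#include <string>

// Local mirror of the enumeration documented above so that the sketch is
// self-contained. Drop it when including the real PDFix header. The numeric
// values here are compiler defaults and are not relied on.
enum OcrTesseractEngineType {
  kOcrTesseractOnly,
  kOcrTesseractLSTMOnlly,  // spelled as in the SDK reference
  kOcrTesseractLSTMCombined,
  kOcrTesseractDefault
};

// Hypothetical helper: the value Tesseract's own --oem option uses for the
// corresponding engine mode (assumed correspondence, see the text above).
int ToTesseractOem(OcrTesseractEngineType type) {
  switch (type) {
    case kOcrTesseractOnly:         return 0;  // legacy engine only
    case kOcrTesseractLSTMOnlly:    return 1;  // LSTM only
    case kOcrTesseractLSTMCombined: return 2;  // LSTM with legacy fallback
    case kOcrTesseractDefault:      return 3;  // inferred from configuration
  }
  assert(false && "unexpected OcrTesseractEngineType value");
  return 3;  // fall back to the default mode in release builds
}

// Hypothetical helper: a short human-readable label for logging.
std::string DescribeEngine(OcrTesseractEngineType type) {
  switch (type) {
    case kOcrTesseractOnly:         return "legacy Tesseract engine only";
    case kOcrTesseractLSTMOnlly:    return "LSTM line recognizer only";
    case kOcrTesseractLSTMCombined: return "LSTM with legacy fallback";
    case kOcrTesseractDefault:      return "inferred from language config";
  }
  return "unknown";
}
```

For instance, `ToTesseractOem(kOcrTesseractLSTMOnlly)` evaluates to 1, the value that selects the LSTM-only engine via `--oem 1` on the Tesseract command line.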

◆ OcrTesseractPageSegType

OcrTesseractPageSegType.

Possible modes for page layout analysis.

Enumerator
kOcrSegOSDOnly 

Orientation and script detection only.

kOcrSegAutoOSD 

Automatic page segmentation with orientation and script detection (OSD).

kOcrSegAutoOnly 

Automatic page segmentation, but no OSD or OCR.

kOcrSegAuto 

Fully automatic page segmentation, but no OSD.

kOcrSegSingleColumn 

Assume a single column of text of variable sizes.

kOcrSegSingleBlockVertText 

Assume a single uniform block of vertically aligned text.

kOcrSegSingleBlock 

Assume a single uniform block of text.

kOcrSegSingleLine 

Treat the image as a single text line.

kOcrSegSingleWord 

Treat the image as a single word.

kOcrSegCircleWord 

Treat the image as a single word in a circle.

kOcrSegSingleChar 

Treat the image as a single character.

kOcrSegSparseText 

Find as much text as possible in no particular order.

kOcrSegSparseTextOSD 

Sparse text with orientation and script detection (OSD).

kOcrSegRawLine 

Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
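The segmentation modes listed above can be grouped by what they actually do. The sketch below, written under the same assumptions as any standalone example (a local copy of the enumeration, no PDFix headers), maps each mode to the number Tesseract uses for its `--psm` option (0 to 13) and classifies the modes by whether they run OSD and whether they produce recognized text. `ToTesseractPsm`, `RunsOsd`, and `PerformsOcr` are hypothetical helpers, not SDK functions, and the mapping to Tesseract's values is an assumption based on the enumerator names and descriptions.

```cpp
#include <cassert>

// Local mirror of the enumeration documented above so that the sketch is
// self-contained. Drop it when including the real PDFix header. The numeric
// values here are compiler defaults and are not relied on.
enum OcrTesseractPageSegType {
  kOcrSegOSDOnly, kOcrSegAutoOSD, kOcrSegAutoOnly, kOcrSegAuto,
  kOcrSegSingleColumn, kOcrSegSingleBlockVertText, kOcrSegSingleBlock,
  kOcrSegSingleLine, kOcrSegSingleWord, kOcrSegCircleWord,
  kOcrSegSingleChar, kOcrSegSparseText, kOcrSegSparseTextOSD, kOcrSegRawLine
};

// Hypothetical helper: the value Tesseract's own --psm option uses for the
// corresponding page segmentation mode (assumed correspondence).
int ToTesseractPsm(OcrTesseractPageSegType mode) {
  switch (mode) {
    case kOcrSegOSDOnly:             return 0;
    case kOcrSegAutoOSD:             return 1;
    case kOcrSegAutoOnly:            return 2;
    case kOcrSegAuto:                return 3;
    case kOcrSegSingleColumn:        return 4;
    case kOcrSegSingleBlockVertText: return 5;
    case kOcrSegSingleBlock:         return 6;
    case kOcrSegSingleLine:          return 7;
    case kOcrSegSingleWord:          return 8;
    case kOcrSegCircleWord:          return 9;
    case kOcrSegSingleChar:          return 10;
    case kOcrSegSparseText:          return 11;
    case kOcrSegSparseTextOSD:       return 12;
    case kOcrSegRawLine:             return 13;
  }
  assert(false && "unexpected OcrTesseractPageSegType value");
  return 3;  // fall back to fully automatic segmentation in release builds
}

// True for the modes whose description includes orientation and script
// detection (OSD).
bool RunsOsd(OcrTesseractPageSegType mode) {
  return mode == kOcrSegOSDOnly || mode == kOcrSegAutoOSD ||
         mode == kOcrSegSparseTextOSD;
}

// False for the two modes that, per their descriptions, stop before text
// recognition: OSD only, and segmentation without OSD or OCR.
bool PerformsOcr(OcrTesseractPageSegType mode) {
  return mode != kOcrSegOSDOnly && mode != kOcrSegAutoOnly;
}
```

For instance, `ToTesseractPsm(kOcrSegSingleLine)` evaluates to 7, and `PerformsOcr(kOcrSegAutoOnly)` is false because that mode only segments the page.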