OCR & text recognition: Pixelnetica™ Document Scanning SDK for Apple iOS

Optical character recognition (OCR) extracts the text from a processed page. The recognised text — with its layout and per-element positions — is what later lets you export a searchable PDF.

Recognise text with the engine

A PxTextReader is created with the path to the OCR language data and the language(s) to use, then run against a PxPicture. The result lands on the picture’s scanResult:

import DocScanningSDK

let reader = PxTextReader(traineddataDirectory, languages: "eng")
reader.scanText(picture)

if let result = picture.scanResult, result.status == PxScanStatus_Recognized {
    let text = result.text                 // the recognised text
    let languages = result.languages       // languages actually used
}
  • Language data. traineddataDirectory is a folder of OCR language files. The bundled offline data and on-demand language packs are covered in OCR languages; pass multiple languages joined with + (for example "eng+deu").
  • Result detail. PxTextResult exposes the full text plus PxTextAttribute elements (blocks, lines, words, symbols) you can walk for positions and confidence.
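The `+`-joined language string from the first bullet can be assembled from a list of codes rather than written by hand. The following is a minimal sketch, not SDK code: `ocrLanguageString` is a hypothetical helper, and it assumes the codes are the same identifiers used for the language data files (such as "eng" and "deu").

```swift
import Foundation

/// Hypothetical helper (not part of the SDK): joins language codes with "+",
/// trimming whitespace and dropping empty or duplicate entries while keeping order.
func ocrLanguageString(_ codes: [String]) -> String {
    var seen = Set<String>()
    let cleaned = codes
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty && seen.insert($0).inserted }
    return cleaned.joined(separator: "+")
}

// Build the string once, then pass it as the `languages:` argument shown above.
let languages = ocrLanguageString(["eng", " deu", "eng"])   // "eng+deu"
```

Keeping the list as an array until the last moment makes it easier to drive the selection from a language picker.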

Report progress and allow cancel

Long scans should be cancellable and show progress. Assign a progressCallback (a PxTextReaderProgressCallback) before scanning — it receives per-page progress and can cancel an in-flight scan.
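One way to wire cancellation is a small shared flag that the UI sets and the callback reads. The sketch below is an illustration under stated assumptions: `CancelFlag`, `ProgressHandler` and `makeProgressHandler` are hypothetical names, and the callback shape (a fraction in, "keep going" out) is assumed, so check the actual PxTextReaderProgressCallback declaration before adapting it.

```swift
import Foundation

/// Hypothetical thread-safe cancel flag; not an SDK type.
final class CancelFlag {
    private let lock = NSLock()
    private var cancelled = false

    /// Called from the UI, for example when the user taps "Cancel".
    func cancel() {
        lock.lock(); defer { lock.unlock() }
        cancelled = true
    }

    var isCancelled: Bool {
        lock.lock(); defer { lock.unlock() }
        return cancelled
    }
}

/// Assumed callback shape: receives progress in 0...1 and returns true to continue.
/// The real PxTextReaderProgressCallback signature may differ.
typealias ProgressHandler = (Double) -> Bool

/// Builds a handler that reports a whole-number percentage and stops once the flag is set.
func makeProgressHandler(flag: CancelFlag,
                         report: @escaping (Int) -> Void) -> ProgressHandler {
    return { fraction in
        let clamped = min(max(fraction, 0), 1)          // guard against out-of-range values
        report(Int((clamped * 100).rounded()))
        return !flag.isCancelled
    }
}
```

The flag is locked because the scan typically runs off the main thread while the cancel request comes from the UI.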

Ready-to-use: the OCR editor

PxUiOcrEditorScreen (PxUiOcrEditorScreenView in SwiftUI) presents the recognised text over the page for the user to review and correct. It takes a PxUiOcrEditorScreenConfiguration and a PxUiOcrEditorSession (built from the page image and its PxTextResult), and calls back with the edited text. Combine it with the language picker.

Tips

  • Check result.status: PxScanStatus_Recognized means success; PxScanStatus_NotFound or PxScanStatus_Cancelled need handling.
  • OCR is a licensed feature; confirm the OCR feature is enabled on your license.
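The status check in the first tip can be made exhaustive by mapping each outcome to its own branch. The sketch below uses a stand-in enum, `ScanOutcome`, purely for illustration; it is not an SDK type, and in real code you would compare result.status against the PxScanStatus_ constants named above. The messages are placeholders.

```swift
/// Stand-in for the three scan statuses named above; for illustration only.
enum ScanOutcome {
    case recognized(text: String)
    case notFound
    case cancelled
}

/// Maps every outcome to a user-facing message so no case is silently ignored.
func userMessage(for outcome: ScanOutcome) -> String {
    switch outcome {
    case .recognized(let text):
        return text.isEmpty ? "Recognition succeeded but returned no text." : text
    case .notFound:
        return "No text was found on this page."
    case .cancelled:
        return "Recognition was cancelled."
    }
}
```

A switch without a default branch makes the compiler flag any status that is added later and left unhandled.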
