Processing pipeline
Pixelnetica™ Document Scanning SDK for Apple iOS

Whichever level you work at (the ready-to-use screens or the core engine), a document moves through the same stages. Understanding them makes the rest of the documentation easier to navigate, because each task guide covers one stage.

 acquire ─▶ detect ─▶ crop / correct ─▶ enhance ─▶ recognise text ─▶ export
 (camera     (find       (perspective    (colour     (OCR, optional)   (PDF / TIFF /
  or import)  corners)    correction)     profile)                      JPEG / PNG)

The stages

| Stage | What happens | Core engine | Ready-to-use screen |
|---|---|---|---|
| Acquire | Capture with the camera, or import an existing photo | PxFrameObserver (live frames) / PxPicture (import) | PxUiCameraScreen |
| Detect | Find the document’s four corners (the cutout) | PxCutout | (part of the camera screen) |
| Crop / correct | Use the cutout to crop and flatten perspective into a rectangular page | PxRefineFeatures, PxPicture.refine | PxUiPageCropScreen |
| Enhance | Apply a scan-style colour profile (B&W / greyscale / colour) | PxRefineFeatures (use…forPic:) + PxColorProfile | (part of the crop editor) |
| Recognise text | Optional OCR; produce text with positions | PxTextReader, PxTextResult | PxUiOcrEditorScreen |
| Export | Write the page as PDF, TIFF, JPEG, or PNG, including a searchable PDF with an invisible OCR text layer | PxImageWriter (+ PxPaper) | (no dedicated screen) |

How the two levels map onto it

  • The ready-to-use screens package several stages into one presentation: the camera screen handles acquire + detect; the crop editor handles crop + enhance; the OCR editor handles recognise-text review.
  • The core engine exposes each stage as an API you call in sequence, so a fully custom flow assembles the same stages itself.
  • Export always goes through the engine. There is no ready-to-use export screen, so you write the final file with PxImageWriter once the user is happy with the page.
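
The second bullet, calling each stage in sequence, can be sketched without any SDK calls. The snippet below is a minimal illustration under stated assumptions: `PageDraft`, `StageStep`, `makeCustomFlow`, and `runCustomFlow` are hypothetical names, not part of the SDK, and each closure only records the stage name where a real app would call the corresponding engine type from the table (for example PxCutout for detection or PxImageWriter for export).

```swift
// Hypothetical stand-in for a page travelling through the pipeline.
// A real app would carry SDK objects here; this sketch only records
// which stages have run.
struct PageDraft {
    var completedStages: [String] = []
}

// One pipeline stage: a name plus the work it performs on the page.
struct StageStep {
    let name: String
    let run: (inout PageDraft) throws -> Void
}

// Builds the stage list in pipeline order. OCR is optional, so it is
// included only when requested.
func makeCustomFlow(includeOCR: Bool) -> [StageStep] {
    var names = ["acquire", "detect", "crop / correct", "enhance"]
    if includeOCR { names.append("recognise text") }
    names.append("export")
    return names.map { name in
        StageStep(name: name) { page in
            // Placeholder: a real flow would call the matching engine API here.
            page.completedStages.append(name)
        }
    }
}

// Runs every stage in order and stops at the first error.
func runCustomFlow(_ steps: [StageStep]) throws -> PageDraft {
    var page = PageDraft()
    for step in steps {
        try step.run(&page)
    }
    return page
}

let page = try runCustomFlow(makeCustomFlow(includeOCR: false))
print(page.completedStages.joined(separator: " -> "))
// prints "acquire -> detect -> crop / correct -> enhance -> export"
```

Keeping the stages as an ordered list makes the optional OCR step a matter of including or leaving out one entry, while the remaining order stays fixed.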

A typical app uses the camera screen to acquire and detect, the crop editor to let the user fine-tune, optionally the OCR editor, and then calls the engine to export.
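
That split between screens and engine can be written down as a small lookup. This is a sketch rather than SDK code: the `Stage` and `Handler` types and the `typicalPlan` function are hypothetical, and the SDK type names appear only as strings taken from the table above.

```swift
// Pipeline stages in order; a hypothetical model type, not part of the SDK.
enum Stage: String, CaseIterable {
    case acquire, detect, crop, enhance, recogniseText, export
}

// Who handles a stage in a typical app: a ready-to-use screen or the engine.
enum Handler: Equatable {
    case screen(String)
    case engine(String)
}

// Returns the handler for each stage in a typical app. The OCR editor is
// optional, so the recognise-text stage is omitted when `useOCR` is false.
func typicalPlan(useOCR: Bool) -> [(stage: Stage, handler: Handler)] {
    var plan: [(stage: Stage, handler: Handler)] = []
    for stage in Stage.allCases {
        switch stage {
        case .acquire, .detect:
            plan.append((stage: stage, handler: .screen("PxUiCameraScreen")))
        case .crop, .enhance:
            plan.append((stage: stage, handler: .screen("PxUiPageCropScreen")))
        case .recogniseText:
            if useOCR {
                plan.append((stage: stage, handler: .screen("PxUiOcrEditorScreen")))
            }
        case .export:
            // There is no ready-to-use export screen, so this is engine work.
            plan.append((stage: stage, handler: .engine("PxImageWriter")))
        }
    }
    return plan
}

for entry in typicalPlan(useOCR: true) {
    print("\(entry.stage.rawValue): \(entry.handler)")
}
```

The exhaustive `switch` means that adding a stage to the hypothetical `Stage` enum forces a decision about who handles it.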
