Introduction
Pixelnetica™ Document Scanning SDK for Apple iOS
The Pixelnetica Document Scanning SDK (DSSDK) turns photos of paper documents into clean, deskewed pages that look like they came from a flatbed scanner — and, optionally, into searchable PDFs.
Everything runs on the device: capture, detection, image processing, and text recognition need no network connection. The only download is the on-demand fetch of additional OCR language packs.
If you are building an app that scans receipts, contracts, IDs, whiteboards, or any paper document with the camera, DSSDK gives you the whole pipeline rather than a single piece of it.
What you can build
- Capture with live guidance — a camera screen that detects the document’s edges in real time and can shoot automatically when the framing is good.
- Detect and crop — find the document’s four corners and correct perspective, so a photo taken at an angle becomes a flat, rectangular page.
- Enhance — apply scan-style colour profiles (black-and-white, greyscale, colour) that sharpen text and flatten uneven lighting.
- Recognise text (OCR) — extract text from a page and export a searchable PDF with an invisible text layer.
- Export — write the result as PDF, TIFF, JPEG, or PNG, including multi-page PDFs.
Two products: ready-to-use UI, or the core engine
DSSDK ships as two Swift Package products. Most apps start with the first:
- DocScanningSDK-UI: batteries-included screens you can present in a few lines (a camera scanner, a crop/borders editor, an OCR results editor, and a language picker). The SDK owns the UI; you handle the result.
- DocScanningSDK: the core engine (image processing, detection, OCR, export) with no UI. Use it directly when you want to build a fully custom scanning experience.
A good rule of thumb: start with the ready-to-use screens for a working scanner fast, and drop down to the core engine only where you need custom UI. See Core engine vs ready-to-use UI.
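As a rough illustration of choosing between the two products, the sketch below shows how a Swift package target might depend on the ready-to-use UI product. The product names `DocScanningSDK-UI` and `DocScanningSDK`, the 3.0 version, and the iOS 16.3 minimum come from this page. The package URL, the package identity, and the `MyScannerApp` target name are placeholders assumed for the example, so take the real values from the Installation page.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "MyScannerApp",                      // hypothetical app/package name
    platforms: [.iOS("16.3")],                 // minimum iOS version stated in Requirements
    dependencies: [
        // ASSUMPTION: placeholder URL; replace it with the URL from the Installation page.
        .package(url: "https://example.com/your-org/DocScanningSDK.git", from: "3.0.0"),
    ],
    targets: [
        .target(
            name: "MyScannerApp",
            dependencies: [
                // Ready-to-use screens. The package identity "DocScanningSDK" is
                // assumed here from the last path component of the placeholder URL.
                .product(name: "DocScanningSDK-UI", package: "DocScanningSDK"),
                // For a fully custom UI, depend on the core engine instead:
                // .product(name: "DocScanningSDK", package: "DocScanningSDK"),
            ]
        ),
    ]
)
```

In an Xcode app project the same choice is usually made in the package-products dialog rather than in a manifest. The manifest form is shown only because it makes the two product names explicit.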
Why DSSDK
- One-line screens. A complete capture → crop → OCR flow you present like any other view controller or SwiftUI view.
- Offline OCR out of the box. Orientation and script detection work on first launch with no download; additional language packs are fetched on demand.
- Real searchable PDFs. Export a PDF with an embedded, invisible OCR text layer — not just an image in a PDF wrapper.
- Modern integration. Distributed exclusively via Swift Package Manager as binary xcframeworks.
Requirements
- iOS 16.3 or later.
- Swift or Objective-C. The core API is Objective-C; the ready-to-use UI is Swift.
What’s new in version 3.0
Version 3.0 is a full rewrite of the SDK. The highlights:
- OCR in 100+ languages (including right-to-left scripts), running entirely on the device, with searchable-PDF output and offline page-orientation detection.
- New PDF engine with advanced image compression — up to 90% smaller colour and greyscale pages, and lossless black-and-white — producing layered, searchable (“sandwiched”) PDFs.
- Batteries-included UI screens in DocScanningSDK-UI: a smart document camera, a page/crop editor, an OCR results editor, and a language picker, each adoptable in a single line.
- Swift Package Manager distribution as two binary frameworks, on a new Objective-C API surface (minimum iOS 16.3).
For the complete release notes and earlier releases, see the Version history.
Next steps
- Quick Start — a working scanner in about five minutes.
- Installation & SwiftPM integration — add the package and keep it up to date.
- Licensing, trial & evaluation — apply a license and understand evaluation mode.