Workflow Guide Pixelnetica™ Document Scanning SDK for Android
This guide describes the standard workflow for processing images with the Pixelnetica Document Scanning SDK (DSSDK) and for integrating it into your application. Refer to the bundled sample application source code for a practical implementation.
Step 1: Open an Image and Detect Document Bounds
Prerequisites: An imageUri obtained from the Image Picker, Gallery, or other local storage sources.
Important: Do not use an internet URL!
// Create and configure ScanPicture
val picture = ScanPicture(context, imageUri)
picture.shadows = true // Enable shadows if needed
// Detect document corners
val cutout = picture.detectCutout()
if (!cutout.isDefined) {
    // In cases where document borders cannot be determined,
    // consider displaying a warning to the user.
}
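The warning above about internet URLs can be enforced with a small guard before ScanPicture is created. The following is a minimal sketch, not part of DSSDK: the helper name isLocalImageUri and the set of accepted schemes are assumptions made for illustration, and it parses the URI with java.net.URI so that it runs outside Android. In an app, the scheme of an android.net.Uri can be checked the same way.

```kotlin
import java.net.URI
import java.net.URISyntaxException

// Schemes assumed to refer to local data; adjust to what your app really accepts.
private val LOCAL_SCHEMES = setOf("content", "file", "android.resource")

// Hypothetical helper (not a DSSDK API): returns true only when the URI
// string parses and its scheme is one of the local schemes above.
fun isLocalImageUri(uriString: String): Boolean {
    val scheme = try {
        URI(uriString).scheme
    } catch (e: URISyntaxException) {
        null // Unparseable input is treated as non-local.
    }
    return scheme != null && scheme.lowercase() in LOCAL_SCHEMES
}
```

A caller could reject the image, or first download it to local storage, whenever this check returns false.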
Step 2: Automatically Detect Picture Orientation
Prerequisites: A language directory path (languagesDir) as described in the Setup OCR Languages section.
// Create a detector instance
val orientationDetector = ScanDetector(languagesDir)
// Determine the picture's orientation
picture.detectOrientation(orientationDetector)
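Because both orientation detection and OCR depend on languagesDir, it may help to verify the directory before constructing a detector or reader. The sketch below is a hypothetical pre-flight check, not a DSSDK API: the function name checkLanguagesDir and its messages are illustrative, and it only confirms that the path is a readable, non-empty directory. It makes no assumption about the names or format of the language files themselves.

```kotlin
import java.io.File

// Hypothetical pre-flight check (not part of DSSDK).
// Returns a human-readable problem description, or null when the
// directory exists, is readable, and contains at least one entry.
fun checkLanguagesDir(path: String): String? {
    val dir = File(path)
    return when {
        !dir.isDirectory -> "Not a directory: $path"
        !dir.canRead() -> "Directory is not readable: $path"
        dir.list().isNullOrEmpty() -> "Directory is empty: $path"
        else -> null
    }
}
```

A non-null result can be logged or shown to the user instead of letting detection fail later with a less specific error.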
Step 3: Process the Image
Run the refine pipeline to apply the desired color profile (bitonal binarization in this example), crop the image to the detected cutout, and rotate it to the correct display orientation.
picture.refine(
    listOf(
        RefineFeature.Profile(RefineFeature.Profile.Type.Bitonal), // Perform black-and-white (bitonal) processing
        RefineMode.Rectify.WithCutout(cutout), // Apply the detected cutout
        RefineMode.Display() // Rotate the picture to the correct visual orientation
    )
)
// Obtain the processed image
val bitmap: Bitmap = picture.createBitmap()
Step 4: Recognize Text in the Image
Prerequisites: A language directory path (languagesDir) and a list of languages (languageNames), as described in the Setup OCR Languages section.
// Create a reader instance
val scanReader = ScanReader(languagesDir, languageNames)
// Perform OCR
picture.read(scanReader)
// Retrieve the recognized text
val text: String = picture.scanText.toString()
Step 5: Save Results as a Searchable PDF
Prerequisites:
- picture from the Process the Image section.
- TrueType font files supporting the necessary languages.
Important: To create and save a layered (sandwiched) PDF file with embedded searchable text above image layers, a set of TrueType font files (.ttf) must be stored in an accessible directory.
DSSDK does not provide fonts. The demo application includes several royalty-free Google fonts. Additionally, any other fonts that support the necessary languages can be used.
- Define a list, e.g., fontList, containing font file names.
- Set the desired image compression using predefined values:
val imageCompression = ImageWriterPdf.ImageCompression.*
Alternatively, specify the image compression ratio manually:
val imageCompression = ImageWriterPdf.ImageCompression(60.0F)
DSSDK provides five compression presets for images in PDFs: Lossless, Low, Medium, High, and Extreme.
Note: Compression levels apply only to color and grayscale images. Black-and-white (bitonal) images always use highly efficient lossless compression.
// Obtain an ImageWriterPdf instance
ImageWriterPdf(fileName).use { writer ->
    writer.setFontFiles(fontList)
    writer.setImageCompression(imageCompression)
    writer.write(picture)
}
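The fontList passed to the writer above could be built by scanning the font directory rather than hard-coding names. The following is a minimal sketch under stated assumptions: listTrueTypeFonts is a hypothetical helper, not a DSSDK API, and it returns absolute paths. Whether setFontFiles expects bare file names or full paths should be checked against the SDK reference; replace absolutePath with name if bare names are required.

```kotlin
import java.io.File

// Hypothetical helper (not part of DSSDK): collects the .ttf files found
// directly inside fontsDir, matching the extension case-insensitively.
// Returns an empty list when the directory is missing or unreadable.
fun listTrueTypeFonts(fontsDir: File): List<String> =
    fontsDir
        .listFiles { file -> file.isFile && file.extension.equals("ttf", ignoreCase = true) }
        ?.map { it.absolutePath } // Assumption: full paths; use it.name for bare names.
        ?.sorted()
        ?: emptyList()
```

An empty result is worth treating as an error before writing, since the source states that font files are required for the searchable text layer.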