Workflow Guide: Pixelnetica™ Document Scanning SDK for Android

This guide describes the standard workflow for processing images with the Pixelnetica Document Scanning SDK (DSSDK) and shows how to integrate each step into your application. Refer to the bundled sample application source code for a practical implementation.

Step 1: Open an Image and Detect Document Bounds

Prerequisites: An imageUri obtained from the Image Picker, Gallery, or other local storage sources.
Important: The imageUri must refer to local content. Do not pass an internet URL (for example, one starting with http:// or https://).
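Because only local sources are supported, it can help to reject network URLs before constructing ScanPicture. The following is a minimal sketch, not part of DSSDK: the helper name isLocalImageUri is hypothetical, and it parses a URI string with java.net.URI so that it runs on a plain JVM. In an Android app the same check could be made on the scheme of the android.net.Uri directly. The accepted schemes ("content" and "file") are an assumption about what local pickers typically return.

```kotlin
import java.net.URI
import java.net.URISyntaxException

// Hypothetical helper (not part of DSSDK): returns true when the URI string
// uses a scheme that refers to local content rather than a network resource.
fun isLocalImageUri(uriString: String): Boolean {
    val scheme = try {
        URI(uriString).scheme
    } catch (e: URISyntaxException) {
        return false // Unparseable strings are treated as unusable
    }
    // Assumption: "content" and "file" are the local schemes of interest.
    // Strings without a scheme (such as bare file paths) are rejected as well.
    return scheme != null && scheme.lowercase() in setOf("content", "file")
}
```

If the check fails, the image can be downloaded to local storage first and the resulting local URI used instead.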

// Create and configure ScanPicture
val picture = ScanPicture(context, imageUri)
picture.shadows = true   // Enable shadows if needed

// Detect document corners
val cutout = picture.detectCutout()
if (!cutout.isDefined) {
  // In cases where document borders cannot be determined,
  // consider displaying a warning to the user.
}

Step 2: Automatically Detect Picture Orientation

Prerequisites: A language directory path (languagesDir) as described in the Setup OCR Languages section.
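A missing or empty language directory is an easy mistake to make, so a quick check before creating the detector may save debugging time. The sketch below is not part of DSSDK: the helper name isUsableLanguagesDir is hypothetical, and it assumes that languagesDir is a file system path given as a String and that the language data is stored as files directly inside that directory.

```kotlin
import java.io.File

// Hypothetical helper (not part of DSSDK): verifies that the language
// directory exists and contains at least one regular file.
fun isUsableLanguagesDir(languagesDir: String): Boolean {
    val dir = File(languagesDir)
    if (!dir.isDirectory) return false
    // listFiles() returns null on I/O errors, so treat that as "not usable"
    val entries = dir.listFiles() ?: return false
    return entries.any { it.isFile }
}
```

The check does not validate the contents of the files; it only confirms that something is present at the expected location.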

// Create a detector instance
val orientationDetector = ScanDetector(languagesDir)

// Determine the picture's orientation
picture.detectOrientation(orientationDetector)

Step 3: Process the Image

Execute the refine pipeline to binarize (apply the desired color profile), crop, and rotate the image to the correct display orientation.

picture.refine(
  listOf(
    RefineFeature.Profile(RefineFeature.Profile.Type.Bitonal), // Perform black-and-white (bitonal) processing
    RefineMode.Rectify.WithCutout(cutout),  // Apply the detected cutout
    RefineMode.Display() // Rotate the picture to the correct visual orientation
  )
)

// Obtain the processed image
val bitmap: Bitmap = picture.createBitmap()

Step 4: Recognize Text in the Image

Prerequisites: A language directory path (languagesDir) and a list of languages (languageNames), as described in the Setup OCR Languages section.

// Create a reader instance
val scanReader = ScanReader(languagesDir, languageNames)

// Perform OCR
picture.read(scanReader)

// Retrieve the recognized text
val text: String = picture.scanText.toString()
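Recognized text often contains trailing spaces and runs of empty lines. If the text is shown to the user or stored, a light cleanup pass may be useful. The sketch below is plain Kotlin and not part of DSSDK; the helper name tidyRecognizedText is hypothetical, and it assumes only that the OCR result is available as a String, as in the snippet above.

```kotlin
// Hypothetical helper (not part of DSSDK): trims trailing whitespace on each
// line and collapses runs of blank lines into a single blank line.
fun tidyRecognizedText(text: String): String {
    val result = StringBuilder()
    var previousBlank = true // Start as true so leading blank lines are dropped
    for (line in text.lines()) {
        val trimmed = line.trimEnd()
        if (trimmed.isEmpty()) {
            // Keep at most one blank line between paragraphs
            if (!previousBlank) result.append('\n')
            previousBlank = true
        } else {
            result.append(trimmed).append('\n')
            previousBlank = false
        }
    }
    return result.toString().trimEnd()
}
```

Leading indentation is preserved on purpose, since it can carry layout information from the scanned page.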

Step 5: Save Results as a Searchable PDF

Prerequisites: The picture object from Step 3 (Process the Image) and TrueType font files that support the necessary languages.

Important: To create and save a layered (sandwiched) PDF file with embedded searchable text above image layers, a set of TrueType font files (.ttf) must be stored in an accessible directory.
DSSDK does not provide fonts. The demo application includes several royalty-free Google fonts; any other fonts that support the necessary languages can be used instead.

Define a list, e.g., fontList, containing font file names.
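One way to build such a list is to scan the font directory for .ttf files. The sketch below is not part of DSSDK: the helper name collectFontFiles and the fontsDir parameter are hypothetical. It returns absolute paths, which is an assumption; if the SDK expects bare file names, the last line can map to the file name instead. The sample application shows the form the SDK actually expects.

```kotlin
import java.io.File

// Hypothetical helper (not part of DSSDK): collects the .ttf files found
// directly inside fontsDir, sorted by name for a stable order.
fun collectFontFiles(fontsDir: File): List<String> {
    val files = fontsDir.listFiles { file ->
        file.isFile && file.extension.equals("ttf", ignoreCase = true)
    } ?: return emptyList() // null means the directory is missing or unreadable
    // Assumption: absolute paths are returned; use it.name for bare file names
    return files.sortedBy { it.name }.map { it.absolutePath }
}
```

The result could then be assigned to fontList, for example: val fontList = collectFontFiles(File(fontsDirPath)), where fontsDirPath is a placeholder for the directory that holds the fonts.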

Set the desired image compression using one of the five predefined presets for images in PDFs: Lossless, Low, Medium, High, or Extreme.

// Medium is shown as an example; substitute any of the presets listed above.
// The constant spelling is assumed to match the preset name.
val imageCompression = ImageWriterPdf.ImageCompression.Medium

Alternatively, specify the image compression ratio manually:

val imageCompression = ImageWriterPdf.ImageCompression(60.0F)

Note: Compression levels apply only to color and grayscale images. Black-and-white (bitonal) images always use highly efficient lossless compression.

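// Create an ImageWriterPdf; use { } closes the writer when the block finishes
ImageWriterPdf(fileName).use { writer ->
  writer.setFontFiles(fontList)                 // Fonts for the searchable text layer
  writer.setImageCompression(imageCompression)  // Preset or manual ratio chosen above
  writer.write(picture)                         // Write the processed picture as a PDF page
}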
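The snippet above uses fileName without showing where it comes from. A minimal sketch of one way to produce it follows, assuming that ImageWriterPdf accepts a file system path as a String. The helper name buildPdfFileName, the outputDir parameter, and the "scan_" naming scheme are all hypothetical and not part of DSSDK.

```kotlin
import java.io.File
import java.io.IOException
import java.text.SimpleDateFormat
import java.util.Date
import java.util.Locale

// Hypothetical helper (not part of DSSDK): builds a timestamped PDF path
// inside outputDir, creating the directory if it does not exist yet.
fun buildPdfFileName(outputDir: File, now: Date = Date()): String {
    if (!outputDir.isDirectory && !outputDir.mkdirs()) {
        throw IOException("Cannot create output directory: $outputDir")
    }
    // Locale.US keeps the digits ASCII regardless of the device locale
    val stamp = SimpleDateFormat("yyyyMMdd_HHmmss", Locale.US).format(now)
    return File(outputDir, "scan_$stamp.pdf").path
}
```

On Android, outputDir would typically be a directory the app can write to, such as a subdirectory of the app's files directory.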