Course: OCRopus

Page Rotation Correction

Page Rotation ("Skew") Correction

  • almost all document pages have text in a preferred orientation
    • text lines parallel to edge of paper
    • text border parallel/perpendicular to text lines and edges of paper
  • pages are often rotated slightly ("skewed") when scanned
  • page rotation correction determines the rotation angle and rotates the image back
  • why?
    • we can express lines and bounding boxes in terms of axis-aligned rectangles
    • some algorithms depend on it (few in OCRopus do)

"Traditional" Methods for Page Rotation Correction

  • projection methods
    • perform 1D projections at different angles, select the "best" profile
    • detects inter-line spacing; less reliable for multi-column documents
  • morphological methods
    • perform morphological openings at different angles, pick the angle that leaves the most foreground pixels
    • detects inter-line spacing
  • connected component methods
    • generally speaking, look at the geometric relationships between nearby connected components
    • e.g., Docstrum, etc.
  • image and texture-based methods
    • generally speaking, look at the image as a whole
    • e.g., look for peaks in Fourier spectrum
  • cross-correlation between neighboring image strips

These methods are global and average results across a page.
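
As an illustration of the projection approach listed above, here is a minimal sketch in Python. It is not OCRopus code. It assumes the page is given as a list of (x, y) foreground pixel coordinates, and it assumes that the "best" profile is the one with the largest sum of squared bin counts (a common peakedness measure, chosen here for simplicity). The function names, the angle range, and the sign convention of the angle are all hypothetical.

```python
import math

def projection_score(pixels, angle, bin_size=1.0):
    """Peakedness of the 1D projection of `pixels` along direction `angle`.

    Assumption: peakedness is measured as the sum of squared bin counts.
    """
    c, s = math.cos(angle), math.sin(angle)
    bins = {}
    for x, y in pixels:
        # signed distance of the pixel from a line through the origin at `angle`
        key = math.floor((y * c - x * s) / bin_size)
        bins[key] = bins.get(key, 0) + 1
    return sum(n * n for n in bins.values())

def estimate_skew(pixels, max_angle_deg=5.0, step_deg=0.1):
    """Try angles in [-max_angle_deg, +max_angle_deg] and keep the best profile."""
    steps = int(round(max_angle_deg / step_deg))
    candidates = [math.radians(i * step_deg) for i in range(-steps, steps + 1)]
    return max(candidates, key=lambda a: projection_score(pixels, a))
```

Because every foreground pixel on the page votes into the same profile, images, marginal noise, and text from an opposite page all influence the result, which is the "global" behavior noted above.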

Geometric Methods

  • above methods fail in various ways
    • presence of images
    • presence of parts of text from opposite pages
    • methods do not model ascenders/descenders
  • addressing these issues
    • use rotation-independent text line finding
      • note that "traditional" text line finding methods require page rotation correction
    • use the slopes of the individual text lines to estimate the page rotation
    • combine the slope estimates into a global estimate
      • e.g., averaging
      • detect multiple inconsistent orientations
  • text line finding
    • RANSAC (used in UW3)
    • RAST (used in OCRopus)
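
The step of combining per-line slopes into a global estimate can be sketched as follows. This is an illustrative Python sketch, not the OCRopus implementation. It assumes each detected text line is available as a list of (x, y) baseline points, fits each line by least squares, and combines the angles with the median instead of a plain average (an assumption, made to limit the influence of stray lines). The names and the `max_spread_deg` threshold are hypothetical.

```python
import math
import statistics

def line_angle(points):
    """Least-squares slope of one text line's baseline points, as an angle in radians."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return math.atan2(sxy, sxx)

def page_rotation(lines, max_spread_deg=1.0):
    """Return (global angle estimate, whether all lines agree with it)."""
    angles = [line_angle(pts) for pts in lines]
    estimate = statistics.median(angles)
    # flag pages whose text lines have inconsistent orientations
    consistent = all(abs(math.degrees(a - estimate)) <= max_spread_deg
                     for a in angles)
    return estimate, consistent
```

A page for which `consistent` is false may contain several blocks with different orientations and probably should not be corrected with a single angle.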

RAST Text Line Finding


  • properties
    • globally optimal solutions
    • text lines are found one-by-one, independent of context
    • descenders and ascenders are modeled independently
  • page rotation estimation
    • search for the best scoring text line
    • use the orientation of this text line as the rotation angle for the entire page
  • performance
    • accuracy is limited only by the within-page variation of text line orientations
    • highly robust to context, noise
    • so good, we haven't bothered porting any of the other methods
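
To give a feel for how a globally optimal search of this kind can work, here is a heavily simplified branch-and-bound sketch in Python. It is not the OCRopus implementation: it fits a single straight line to 2D points (for example, bottom-center points of character bounding boxes), it does not model descenders or ascenders, and the quality function, parameter ranges, and tolerances are all assumptions chosen for illustration. The function name `rast_line` is hypothetical.

```python
import heapq
import math

def rast_line(points, eps=2.0, max_angle=math.radians(10.0),
              d_range=(-1000.0, 1000.0),
              tol_angle=math.radians(0.05), tol_d=0.25):
    """Best-first search over boxes of line parameters (theta, d).

    The distance of (x, y) to a line is |y*cos(theta) - x*sin(theta) - d|.
    Each point contributes max(0, 1 - (distance/eps)**2) to the score.
    Returns (theta, d, upper bound on the score inside the final box).
    """
    def upper_bound(box):
        t_lo, t_hi, d_lo, d_hi = box
        t_mid, t_half = (t_lo + t_hi) / 2, (t_hi - t_lo) / 2
        c, s = math.cos(t_mid), math.sin(t_mid)
        total = 0.0
        for x, y in points:
            v = y * c - x * s
            # v changes by at most |(x, y)| * t_half anywhere inside the box
            slack = math.hypot(x, y) * t_half
            # smallest distance any line in the box can have to this point
            dist = max(0.0, d_lo - (v + slack), (v - slack) - d_hi)
            if dist < eps:
                total += 1.0 - (dist / eps) ** 2
        return total

    root = (-max_angle, max_angle, d_range[0], d_range[1])
    heap = [(-upper_bound(root), root)]
    while heap:
        neg_bound, box = heapq.heappop(heap)
        t_lo, t_hi, d_lo, d_hi = box
        if t_hi - t_lo <= tol_angle and d_hi - d_lo <= tol_d:
            # no remaining box can score higher than this one
            return (t_lo + t_hi) / 2, (d_lo + d_hi) / 2, -neg_bound
        # split the dimension that is larger relative to its tolerance
        if (t_hi - t_lo) / tol_angle >= (d_hi - d_lo) / tol_d:
            t_mid = (t_lo + t_hi) / 2
            children = [(t_lo, t_mid, d_lo, d_hi), (t_mid, t_hi, d_lo, d_hi)]
        else:
            d_mid = (d_lo + d_hi) / 2
            children = [(t_lo, t_hi, d_lo, d_mid), (t_lo, t_hi, d_mid, d_hi)]
        for child in children:
            heapq.heappush(heap, (-upper_bound(child), child))
    return None
```

In this sketch the returned `theta` plays the role of the page rotation angle. Points far from the best line contribute nothing to the score, which is one way to see why such a search is robust to images and noise elsewhere on the page.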

Page Rotation Correction using RAST


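image = bytearray()     -- input page image
result = bytearray()    -- rotation-corrected output image

corrector = make_DeskewPageByRAST()

read_image_gray(image,arg[1])        -- arg[1]: input file name
corrector:cleanup(result,image)      -- estimate the page rotation and rotate the page back
write_image_binary(arg[2],result)    -- arg[2]: output file name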

Camera-Based Dewarping

A brief word about camera-based dewarping...
  • stereo-based methods
  • monocular, model-based methods
    • affine
    • curved
  • page boundary-based methods
    • affine
    • curved
  • structured light methods
  • shape-from-shading methods
  • shape-from-texture methods

Multiple Uses of RAST

  • input
    • grayscale page image
  • page rotation correction
    • perform binarization & connected component analysis
    • unconstrained RAST text line finding, allowing large rotation angles
    • grayscale rotation to obtain corrected page
  • layout analysis
    • perform binarization & connected component analysis
    • constrained RAST text line finding, allowing only small rotation angles
    • compute segmentation mask
  • text line recognition
    • use segmentation mask and grayscale input image
    • extract masked text line images and pass on to recognizer
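
The grayscale rotation step of page rotation correction can be sketched as below. This is an illustrative Python sketch, not the OCRopus routine. It assumes the image is a list of rows of gray values, rotates about the image center using inverse mapping with bilinear interpolation, and fills uncovered pixels with a white background. The function name, the background value, and the sign convention of `angle` are assumptions.

```python
import math

def rotate_gray(image, angle, background=255):
    """Rotate a grayscale image (list of rows) about its center by `angle` radians."""
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    c, s = math.cos(angle), math.sin(angle)
    out = [[background] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # inverse mapping: find the source position of output pixel (x, y)
            dx, dy = x - cx, y - cy
            sx = cx + c * dx + s * dy
            sy = cy - s * dx + c * dy
            if 0 <= sx <= w - 1 and 0 <= sy <= h - 1:
                x0, y0 = int(math.floor(sx)), int(math.floor(sy))
                x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
                fx, fy = sx - x0, sy - y0
                # bilinear interpolation between the four neighboring pixels
                top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
                bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
                out[y][x] = int(round(top * (1 - fy) + bottom * fy))
    return out
```

Interpolating on the grayscale image, instead of rotating an already binarized page, avoids adding jagged edges to the characters before the later stages see them.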



