Page Rotation ("Skew") Correction
- almost all document pages have text in a preferred orientation
  - text lines parallel to the edge of the paper
  - text border parallel/perpendicular to the text lines and the edges of the paper
- pages are often rotated slightly ("skewed") when scanned
- page rotation correction estimates the rotation angle and rotates the image back
- why?
  - we can express lines and bounding boxes in terms of axis-aligned rectangles
  - some algorithms depend on it (few in OCRopus do)
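To make the "axis-aligned rectangles" point concrete, here is a minimal sketch of how the axis-aligned bounding box of a single text line grows with skew. The function name and the example dimensions are illustrative assumptions, not part of OCRopus.

```python
import math

def aabb_of_rotated_box(width, height, skew_deg):
    """Width and height of the axis-aligned box enclosing a width x height
    rectangle that has been rotated by skew_deg degrees."""
    t = math.radians(skew_deg)
    c, s = abs(math.cos(t)), abs(math.sin(t))
    return width * c + height * s, width * s + height * c

# Hypothetical text line: 1500 px long, 40 px tall, scanned with 2 degrees of skew.
w, h = aabb_of_rotated_box(1500, 40, 2.0)
print(round(w, 1), round(h, 1))
```

Under these assumed dimensions the enclosing box is more than twice as tall as the text line itself, so boxes of adjacent lines start to overlap even at small skew angles.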
"Traditional" Methods for Page Rotation Correction
- projection methods
  - perform 1D projections at different angles, select the "best" profile (e.g., the one with the sharpest peaks)
  - detects inter-line spacing; works worse for multi-column documents
- morphological methods
  - perform morphological openings at different angles, pick the angle that leaves the most foreground pixels
  - detects inter-line spacing
- connected component methods
  - generally speaking, look at the geometric relationships between nearby connected components
  - e.g., Docstrum
- image and texture-based methods
  - generally speaking, look at the image as a whole
  - e.g., look for peaks in the Fourier spectrum
  - e.g., cross-correlation between neighboring image strips
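As an illustration of the projection approach listed above, the following is a minimal sketch that projects foreground pixel coordinates at a range of candidate angles and keeps the angle whose profile is most sharply peaked. The peakedness criterion (sum of squared bin counts), the angle range and step, and the synthetic test page are all assumptions chosen for the sketch; real implementations differ in these details.

```python
import math

def projection_score(points, angle_deg, bin_size=1.0):
    """Sum of squared bin counts of the 1D projection taken at angle_deg.
    A sharply peaked profile (pixels concentrated in few bins) scores high."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    hist = {}
    for x, y in points:
        # Coordinate perpendicular to text lines running at angle t.
        b = math.floor((y * c - x * s) / bin_size)
        hist[b] = hist.get(b, 0) + 1
    return sum(n * n for n in hist.values())

def estimate_skew_by_projection(points, max_angle=5.0, step=0.1):
    """Try every angle in [-max_angle, max_angle] and return the best one."""
    n = int(round(2 * max_angle / step))
    angles = [-max_angle + i * step for i in range(n + 1)]
    return max(angles, key=lambda a: projection_score(points, a))

def synthetic_page(skew_deg, n_lines=12, line_len=400, spacing=30):
    """Hypothetical test page: each text line is an 8-pixel-thick bar,
    and the whole page is rotated by skew_deg."""
    t = math.radians(skew_deg)
    c, s = math.cos(t), math.sin(t)
    pts = []
    for i in range(n_lines):
        for x in range(0, line_len, 4):
            for dy in range(8):
                y = i * spacing + dy + 0.5
                pts.append((x * c - y * s, x * s + y * c))
    return pts

if __name__ == "__main__":
    print(estimate_skew_by_projection(synthetic_page(2.0)))
```

Because every foreground pixel contributes to the same profile, a sketch like this inherits the weakness noted for projection methods: columns with different line spacing, or large images, blur the peaks.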
These methods are global: they average results across the whole page.
Geometric Methods
- above methods fail in various ways
  - presence of images
  - presence of text fragments from the facing page
  - methods do not model ascenders/descenders
- addressing these issues
  - use rotation-independent text line finding
    - note that "traditional" text line finding methods require page rotation correction
  - use the slopes of the individual text lines to estimate the page rotation
  - combine the slope estimates into a global estimate
    - e.g., averaging
  - detect multiple inconsistent orientations
- text line finding
  - RANSAC (used in UW3)
  - RAST (used in OCRopus)
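The step of combining per-line slopes into a global estimate, and flagging inconsistent orientations, could be sketched as follows. This is one possible scheme, not the one used by any particular system: the median-based inlier selection, the length weighting, and the `tolerance_deg` and `min_inlier_fraction` parameters are assumptions made for illustration.

```python
from statistics import median_low

def combine_line_angles(angles_deg, lengths, tolerance_deg=1.0,
                        min_inlier_fraction=0.8):
    """Combine per-text-line angles into one page angle.

    Returns (page_angle_deg, consistent). `consistent` is False when too few
    lines agree with the dominant orientation, e.g. because part of the
    facing page was scanned at a different angle."""
    if not angles_deg:
        raise ValueError("need at least one text line")
    # median_low always returns one of the inputs, so at least one inlier exists.
    center = median_low(angles_deg)
    inliers = [(a, w) for a, w in zip(angles_deg, lengths)
               if abs(a - center) <= tolerance_deg]
    total = sum(w for _, w in inliers)
    # Length-weighted average: long lines give more reliable slopes.
    page_angle = sum(a * w for a, w in inliers) / total
    consistent = len(inliers) >= min_inlier_fraction * len(angles_deg)
    return page_angle, consistent

if __name__ == "__main__":
    print(combine_line_angles([1.9, 2.0, 2.1, 2.0, 2.05],
                              [100, 200, 150, 120, 180]))
    print(combine_line_angles([2.0, 2.1, 1.9, -3.0, -3.1, -2.9], [100] * 6))
```

Plain averaging corresponds to setting a very large `tolerance_deg` and equal lengths; the inlier step is what lets the sketch notice two groups of lines at different angles.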
RAST Text Line Finding
- properties
  - globally optimal solutions
  - text lines are found one-by-one, independent of context
  - descenders and ascenders are modeled independently
- page rotation estimation
  - search for the best scoring text line
  - use the angle of this text line as the rotation angle for the entire page
- performance
  - accurate to within the within-page variation of text line orientations
  - highly robust to context, noise
  - so good, we haven't bothered porting any of the other methods
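To give a feel for how a globally optimal search for the best scoring line can work, here is a heavily simplified branch-and-bound sketch in the spirit of RAST. It is not the OCRopus implementation: it fits a plain line with no descender or ascender model, and the parametrization (angle `theta`, offset `r`), the quality function, the tolerances, and the demo points are all assumptions made for illustration.

```python
import heapq
import itertools
import math

def rast_best_line(points, eps=2.0, max_angle_deg=45.0,
                   angle_tol_deg=0.02, offset_tol=0.2):
    """Best-first branch and bound over boxes of line parameters (theta, r).
    Maximizes sum_i max(0, 1 - d_i^2 / eps^2), where
    d_i = y_i*cos(theta) - x_i*sin(theta) - r. Returns (theta, r, score bound)."""
    if not points:
        return None
    rho = [math.hypot(x, y) for x, y in points]
    r_max = max(rho)
    a = math.radians(max_angle_deg)
    angle_tol = math.radians(angle_tol_deg)

    def bound(box, candidates):
        """Upper bound on the quality of any line inside the box."""
        t0, t1, r0, r1 = box
        tc, rc = (t0 + t1) / 2, (r0 + r1) / 2
        c, s = math.cos(tc), math.sin(tc)
        total, kept = 0.0, []
        for i in candidates:
            x, y = points[i]
            # The residual cannot shrink by more than `slack` inside the box.
            slack = rho[i] * (t1 - t0) / 2 + (r1 - r0) / 2
            d = max(0.0, abs(y * c - x * s - rc) - slack)
            if d < eps:
                total += 1.0 - (d / eps) ** 2
                kept.append(i)
        return total, kept

    tie = itertools.count()  # tie-breaker so the heap never compares boxes
    root = (-a, a, -r_max, r_max)
    ub, kept = bound(root, range(len(points)))
    heap = [(-ub, next(tie), root, kept)]
    while heap:
        neg_ub, _, (t0, t1, r0, r1), kept = heapq.heappop(heap)
        small_t = (t1 - t0) <= angle_tol
        small_r = (r1 - r0) <= offset_tol
        if small_t and small_r:
            # No remaining box has a higher bound: optimal up to the tolerances.
            return (t0 + t1) / 2, (r0 + r1) / 2, -neg_ub
        if small_r or (not small_t and
                       (t1 - t0) / angle_tol >= (r1 - r0) / offset_tol):
            tm = (t0 + t1) / 2
            children = [(t0, tm, r0, r1), (tm, t1, r0, r1)]
        else:
            rm = (r0 + r1) / 2
            children = [(t0, t1, r0, rm), (t0, t1, rm, r1)]
        for child in children:
            ub, sub = bound(child, kept)
            if sub:
                heapq.heappush(heap, (-ub, next(tie), child, sub))
    return None

def demo_points(skew_deg=3.0):
    """Hypothetical input: 5 text lines of 20 reference points each,
    with small vertical jitter, rotated by skew_deg."""
    t = math.radians(skew_deg)
    c, s = math.cos(t), math.sin(t)
    pts = []
    for line in range(5):
        for k in range(20):
            x = 20.0 * k
            y = 40.0 * line + 0.3 * ((k * 7 + line) % 3 - 1)
            pts.append((x * c - y * s, x * s + y * c))
    return pts

if __name__ == "__main__":
    theta, r, score = rast_best_line(demo_points(3.0))
    print(math.degrees(theta), r, score)
```

The angle of the returned line would then serve as the page rotation estimate. Because the search always expands the box with the highest upper bound, the first box that falls below the tolerances cannot be beaten by any unexplored region, which is the sense in which the result is globally optimal.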
Page Rotation Correction using RAST
    -- allocate the input and output images
    image = bytearray()
    result = bytearray()
    corrector = make_DeskewPageByRAST()
    -- read the input page as grayscale, deskew it, and write the result
    read_image_gray(image,arg[1])
    corrector:cleanup(result,image)
    write_image_binary(arg[2],result)
Camera-Based Dewarping
A brief word about camera-based dewarping...
- stereo-based methods
- monocular, model-based methods
- page boundary-based methods
- structured light methods
- shape-from-shading methods
- shape-from-texture methods
Multiple Uses of RAST
- input
- page rotation correction
  - perform binarization & connected component analysis
  - unconstrained RAST text line finding over a large range of rotation angles
  - grayscale rotation to obtain the corrected page
- layout analysis
  - perform binarization & connected component analysis
  - constrained RAST text line finding over a small range of rotation angles
  - compute segmentation mask
- text line recognition
  - use segmentation mask and grayscale input image
  - extract masked text line images and pass them on to the recognizer
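The last step, extracting masked text line images, could look roughly like the sketch below. The mask format is an assumption made for this example (a 2D array holding 0 for non-text pixels and a positive integer line label elsewhere); the actual OCRopus segmentation encoding is not shown in these slides, and `extract_line_images` is a hypothetical helper.

```python
def extract_line_images(gray, mask, background=255):
    """gray, mask: equal-sized 2D lists. Returns {label: cropped 2D list},
    where pixels not belonging to the line are set to `background`."""
    # First pass: bounding box (x0, y0, x1, y1) of every label.
    boxes = {}
    for y, row in enumerate(mask):
        for x, label in enumerate(row):
            if label:
                x0, y0, x1, y1 = boxes.get(label, (x, y, x, y))
                boxes[label] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    # Second pass: crop the grayscale image and blank out foreign pixels.
    lines = {}
    for label, (x0, y0, x1, y1) in boxes.items():
        lines[label] = [
            [gray[y][x] if mask[y][x] == label else background
             for x in range(x0, x1 + 1)]
            for y in range(y0, y1 + 1)
        ]
    return lines

gray = [[10, 20, 30, 40],
        [50, 60, 70, 80],
        [90, 100, 110, 120]]
mask = [[1, 1, 0, 0],
        [0, 1, 0, 2],
        [0, 0, 2, 2]]
print(extract_line_images(gray, mask))
```

Masking rather than plain cropping matters when bounding boxes of neighboring lines overlap: descenders from the line above would otherwise leak into the image handed to the recognizer.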