Goal of Text-Image Segmentation
- input is the raw image (grayscale or binary)
- eventually...
  - we want a segmentation into text blocks, non-text blocks, and background blocks
- however...
  - there isn't enough information available at this early processing stage to make the correct decision
- approach
  - low-level text/image segmentation outputs, for each pixel, the probability that it is text, non-text, or background
  - this information is then passed on to the layout analysis engine
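A minimal sketch of what a per-pixel probability output could look like, and how a later stage might consume it. The class order, the `normalize` and `region_probabilities` helpers, and the toy scores are all assumptions made for illustration; the slides do not specify how OCRopus encodes the result.

```python
# Hypothetical class order; the slides do not say how the output is encoded.
CLASSES = ("text", "non-text", "background")

def normalize(scores):
    """Turn non-negative per-class scores into probabilities that sum to 1."""
    total = sum(scores)
    if total == 0:
        # No evidence at all: fall back to a uniform distribution.
        return tuple(1.0 / len(scores) for _ in scores)
    return tuple(s / total for s in scores)

def region_probabilities(prob_map, top, left, bottom, right):
    """Average per-pixel probabilities over a rectangle (bottom/right exclusive)."""
    sums = [0.0] * len(CLASSES)
    count = 0
    for row in prob_map[top:bottom]:
        for pixel in row[left:right]:
            for k, p in enumerate(pixel):
                sums[k] += p
            count += 1
    if count == 0:
        raise ValueError("empty region")
    return tuple(s / count for s in sums)

# A 2x3 toy probability map built from made-up scores.
prob_map = [
    [normalize((8, 1, 1)), normalize((6, 2, 2)), normalize((0, 1, 9))],
    [normalize((7, 2, 1)), normalize((5, 3, 2)), normalize((0, 0, 10))],
]

# A later stage (standing in for layout analysis) makes the decision per region,
# once it knows which pixels belong together.
left_block = region_probabilities(prob_map, 0, 0, 2, 2)
best = CLASSES[max(range(len(CLASSES)), key=lambda k: left_block[k])]
```

Keeping probabilities rather than hard labels is what lets the final decision be deferred to the stage that has the layout context.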
Design Criteria
- what units do we classify?
  - connected components, individual pixels, patches, ...
- what features do we extract?
  - texture features, geometric features, ...
- what classifier do we use?
- how do we integrate the local decisions?
  - leave it to subsequent processing stages, or already perform some integration here, ...
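As one illustration of these choices, the sketch below takes connected components as the unit and computes a few geometric features for each. It is an assumption-laden example, not OCRopus code: the 8-connectivity, the function names, and the particular features (bounding-box size, aspect ratio, fill ratio) are all picked for illustration.

```python
from collections import deque

def connected_components(binary):
    """Return components as lists of (row, col) foreground pixels.

    `binary` is a list of rows of 0/1 values; 8-connectivity is an assumption.
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    components = []
    for r in range(height):
        for c in range(width):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components

def geometric_features(pixels):
    """Bounding-box based features for one component (illustrative choice)."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    h = max(ys) - min(ys) + 1
    w = max(xs) - min(xs) + 1
    return {"width": w, "height": h, "aspect": w / h, "fill": len(pixels) / (w * h)}

page = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
components = connected_components(page)
features = [geometric_features(p) for p in components]
```

Note that features measured in pixels, such as `width` and `height`, change with scan resolution, while ratios such as `aspect` and `fill` do not.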
Current Implementations
- Leptonica's code
  - appears to be rule-based, extensively tuned
- OCRopus block classifier
  - use whitespace-based segmentation of the image into blocks
  - for each block, extract a number of features
  - classify using logistic regression
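The last two steps of the block classifier (features per block, then logistic regression) can be sketched as follows. This is a simplified two-class illustration (text vs. non-text) in plain Python, not the OCRopus implementation: the two features, the training data, and the `train` and `predict` helpers are made up, and the slides do not say which features OCRopus actually extracts.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(weights, bias, x):
    """Probability that a block with feature vector x is text."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train(samples, labels, rate=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the log loss."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = predict(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - rate * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= rate * grad_b / len(samples)
    return weights, bias

# Made-up block features: (fraction of black pixels, black/white transitions per pixel).
blocks = [(0.10, 0.30), (0.15, 0.35), (0.12, 0.28),
          (0.60, 0.05), (0.70, 0.08), (0.55, 0.04)]
is_text = [1, 1, 1, 0, 0, 0]

weights, bias = train(blocks, is_text)
p_text = predict(weights, bias, (0.11, 0.32))   # sparse ink, many transitions
p_image = predict(weights, bias, (0.65, 0.06))  # dense ink, few transitions
```

A three-way split into text, non-text, and background would need a multinomial (softmax) variant, but the per-block workflow stays the same.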
ITextImageSegmentation
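    -- buffers for the input image, its binarized version, and the result
    image = bytearray()
    binarized = bytearray()
    result = intarray()
    -- pick a binarizer and a text/image segmenter
    binarizer = make_BinarizeBySauvola()
    ticlass = make_TextImageSegByLogReg() -- or: make_TextImageSegByLeptonica()
    -- read arg[1], binarize it, classify, and write the result to arg[2]
    read_image_gray(image,arg[1])
    binarizer:binarize(binarized,image)
    ticlass:textImageProbabilities(result,binarized)
    write_image_packed(arg[2],result)
Text/Image Segmentation
Text-Image Segmentation in OCRopus
- Note: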
  - the methods don't work very well yet
  - the methods are resolution-dependent
  - integration of the output into the rest of the layout analysis method is not complete yet
  - fortunately... these methods aren't all that important because layout analysis implicitly gives good text/image segmentation
- To be done:
  - we're implementing several additional methods
  - consider contributing your own methods