OCR Interfaces
- key processing steps in OCRopus are available through simple C++ interfaces
- defined in colib/ocrinterfaces.h
- all you need in order to implement these is colib (a few header files)
- eventually, will permit dynamic loading of components
- eventually, will be able to implement these in Lua
OCR Interfaces
These are the standard "pluggable" component interfaces for OCRopus; if you implement one of these, your code should be usable directly in the OCRopus top-level loop:
- IComponent -- base class for all OCR interfaces
- ICleanupGray -- improve quality of grayscale images in some way
- ICleanupBinary -- improve quality of binary images in some way
- ITextImageClassification -- classify regions on a page into text and non-text
- IBinarize -- binarize grayscale images
- ISegmentPage -- segment a page into columns, lines, headers, footers, etc.
- ISegmentLine -- segment a line into character parts
- IGenericFst -- generic language model
- ICharacterClassifier -- isolated character classification
- IRecognizeLine -- text line recognition
IComponent
struct IComponent {
    virtual const char *description() = 0;
    virtual void set(const char *key,const char *value) ...
    virtual void set(const char *key,double value) ...
    virtual const char *gets(const char *key) ...
    virtual double getd(const char *key) ...
    virtual ~IComponent() {}
};
- method to identify the component at runtime
- methods to set parameters by name
- methods to get parameters and statistics by name
- virtual destructor
- you don't need to implement any of these, except for description
- these can be called from Lua
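A minimal sketch of the "only description is required" rule. It does not use colib: `IComponentSketch` is a stand-in for the interface above, and because the original elides the default method bodies, the choice to throw on every key is an assumption made here. `NullComponent`, `is_named`, and `rejects_numeric` are hypothetical names.

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>

// Stand-in for IComponent. The default bodies are elided ("...") in the
// original; rejecting every key is an assumption of this sketch.
struct IComponentSketch {
    virtual const char *description() = 0;
    virtual void set(const char *key, const char *value) {
        (void)key; (void)value;
        throw std::invalid_argument("unknown string parameter");
    }
    virtual void set(const char *key, double value) {
        (void)key; (void)value;
        throw std::invalid_argument("unknown numeric parameter");
    }
    virtual const char *gets(const char *key) {
        (void)key;
        throw std::invalid_argument("unknown string parameter");
    }
    virtual double getd(const char *key) {
        (void)key;
        throw std::invalid_argument("unknown numeric parameter");
    }
    virtual ~IComponentSketch() {}
};

// Hypothetical component that overrides nothing except description().
struct NullComponent : IComponentSketch {
    const char *description() override { return "NullComponent"; }
};

// Runtime identification through description().
bool is_named(IComponentSketch &c, const char *name) {
    assert(name != nullptr);
    return std::strcmp(c.description(), name) == 0;
}

// True if the component refuses the given numeric parameter.
bool rejects_numeric(IComponentSketch &c, const char *key) {
    assert(key != nullptr);
    try {
        c.set(key, 1.0);
    } catch (const std::invalid_argument &) {
        return true;
    }
    return false;
}
```

With these assumed defaults, a caller can probe any component for a parameter without knowing its concrete type.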
Reducing Coupling
- reducing coupling is a key software engineering principle
  - improved reuse
  - easier testing
- we want the OCR system to be pluggable
  - e.g., combine different text/image, layout, text recognition components
  - therefore: pass as little information as possible between modules
    - e.g., text line recognizer should work if it is given a readable input, regardless of resolution etc.
- we want the OCR system to require little tuning
  - therefore: reduce the number of parameters for each module to the bare minimum
  - try to estimate parameters dynamically from input data
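One way to read "estimate parameters dynamically from input data" is sketched below: a size threshold is derived from the run lengths found in the image itself, so the caller supplies no resolution. This is a self-contained illustration on `std::vector`, not colib code. `run_histogram_sketch` and `estimate_threshold` are hypothetical helpers, and taking the most frequent run length is an assumption chosen for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts horizontal runs of foreground (nonzero) pixels
// in a row-major image. Bin i holds the number of runs of length i; longer
// runs are clamped into the last bin.
std::vector<int> run_histogram_sketch(const std::vector<unsigned char> &img,
                                      std::size_t width, std::size_t bins) {
    assert(width > 0 && bins > 0 && img.size() % width == 0);
    std::vector<int> hist(bins, 0);
    for (std::size_t row = 0; row < img.size(); row += width) {
        std::size_t run = 0;
        for (std::size_t x = 0; x <= width; x++) {
            // x == width acts as a sentinel that closes a run at the row end.
            bool foreground = x < width && img[row + x] != 0;
            if (foreground) {
                run++;
            } else if (run > 0) {
                hist[run < bins ? run : bins - 1]++;
                run = 0;
            }
        }
    }
    return hist;
}

// Derives the threshold from the data instead of a fixed constant:
// frac times the most frequent run length, or fallback if there are no runs.
int estimate_threshold(const std::vector<int> &hist, double frac, int fallback) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < hist.size(); i++)
        if (hist[i] > hist[best]) best = i;
    if (hist.empty() || hist[best] == 0) return fallback;
    return int(best * frac);
}
```

The only remaining tunable is the dimensionless factor `frac`, which is meant to stay valid across scan resolutions because the run lengths scale with the input.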
ICleanupBinary
struct ICleanupBinary : IComponent {
    virtual void cleanup(bytearray &out,bytearray &in) = 0;
};
ICleanupBinary Implementation
struct SampleCleanup : ICleanupBinary {
    const char *description() {
        return "SampleCleanup";
    }
    void cleanup(bytearray &out,bytearray &in) {
        float frac = 1.5;
        // histogram of run lengths in the input and its peaks
        floatarray hist(100);
        intarray runpeaks;
        runlength_histogram(hist,in);
        dshow1d(hist); dwait();   // debugging display
        peaks(runpeaks,hist,2,10,3.0);
        // size threshold: frac times the first peak, or 2 if no peak was found
        int threshold = 2;
        if(runpeaks.length()>0) threshold = int(runpeaks[0] * frac);
        // label connected components and drop the small ones
        intarray blobs;
        blobs.copy(in);
        label_components(blobs);
        remove_small_components(blobs,threshold,threshold);
        // convert the label image back to a 0/255 binary image
        greater(blobs,0,0,255);
        out.copy(blobs);
    }
};
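The blob-removal step above can be sketched without colib as follows. This is not the library's `label_components` or `remove_small_components`. `BinaryImage` and `remove_small_blobs` are hypothetical, and the sketch assumes that nonzero pixels are foreground, that components are 4-connected, and that a component counts as small when its bounding box is both narrower than `min_w` and shorter than `min_h`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Row-major binary image; nonzero pixels are treated as foreground.
struct BinaryImage {
    int width, height;
    std::vector<unsigned char> pixels;
    unsigned char &at(int x, int y) { return pixels[std::size_t(y) * width + x]; }
};

// Clears every 4-connected foreground component whose bounding box is both
// narrower than min_w and shorter than min_h (size rule assumed here).
void remove_small_blobs(BinaryImage &img, int min_w, int min_h) {
    assert(img.pixels.size() == std::size_t(img.width) * img.height);
    std::vector<char> seen(img.pixels.size(), 0);
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    for (int y0 = 0; y0 < img.height; y0++) {
        for (int x0 = 0; x0 < img.width; x0++) {
            if (img.at(x0, y0) == 0 || seen[y0 * img.width + x0]) continue;
            // Breadth-first flood fill collecting one component.
            std::vector<std::pair<int, int> > blob;
            blob.push_back(std::make_pair(x0, y0));
            seen[y0 * img.width + x0] = 1;
            int xmin = x0, xmax = x0, ymin = y0, ymax = y0;
            for (std::size_t i = 0; i < blob.size(); i++) {
                const int bx = blob[i].first, by = blob[i].second;
                for (int d = 0; d < 4; d++) {
                    const int x = bx + dx[d], y = by + dy[d];
                    if (x < 0 || y < 0 || x >= img.width || y >= img.height) continue;
                    if (img.at(x, y) == 0 || seen[y * img.width + x]) continue;
                    seen[y * img.width + x] = 1;
                    blob.push_back(std::make_pair(x, y));
                    xmin = std::min(xmin, x); xmax = std::max(xmax, x);
                    ymin = std::min(ymin, y); ymax = std::max(ymax, y);
                }
            }
            // Erase the component if its bounding box is below both limits.
            if (xmax - xmin + 1 < min_w && ymax - ymin + 1 < min_h)
                for (std::size_t i = 0; i < blob.size(); i++)
                    img.at(blob[i].first, blob[i].second) = 0;
        }
    }
}
```

Passing the same value for `min_w` and `min_h` mirrors the way the sample passes `threshold` twice.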
ICleanupBinary Implementation (Parameters)
struct SampleCleanup : ICleanupBinary {
    float frac;
    SampleCleanup() {
        frac = 1.5;
    }
    const char *description() {
        return "SampleCleanup";
    }
    void set(const char *name,double value) {
        if(!strcmp(name,"frac")) frac = value;
        else throw "unknown parameter setting";
    }
    double getd(const char *name) {
        if(!strcmp(name,"frac")) return frac;
        else throw "unknown parameter setting";
    }
    void cleanup(bytearray &out,bytearray &in) {
        ...
    }
};
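The caller side of the parameter mechanism might look like the sketch below, where a parameter is set and read back by name through the base interface. `ParamInterfaceSketch`, `FracComponent`, and `set_and_read` are hypothetical stand-ins, the default bodies are assumptions, and `std::invalid_argument` replaces the string literals thrown in the original.

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>

// Stand-in for the parameter methods of IComponent. The default bodies are
// elided in the original; rejecting every key is an assumption of this sketch.
struct ParamInterfaceSketch {
    virtual void set(const char *key, const char *value) {
        (void)key; (void)value;
        throw std::invalid_argument("unknown string parameter");
    }
    virtual void set(const char *key, double value) {
        (void)key; (void)value;
        throw std::invalid_argument("unknown numeric parameter");
    }
    virtual double getd(const char *key) {
        (void)key;
        throw std::invalid_argument("unknown numeric parameter");
    }
    virtual ~ParamInterfaceSketch() {}
};

// Hypothetical component with a single numeric parameter, "frac".
struct FracComponent : ParamInterfaceSketch {
    double frac;
    FracComponent() : frac(1.5) {}
    // Overriding one set() overload hides the other in this class; the
    // using-declaration makes the string overload visible again.
    using ParamInterfaceSketch::set;
    void set(const char *key, double value) override {
        if (std::strcmp(key, "frac") == 0) frac = value;
        else throw std::invalid_argument("unknown parameter");
    }
    double getd(const char *key) override {
        if (std::strcmp(key, "frac") == 0) return frac;
        throw std::invalid_argument("unknown parameter");
    }
};

// Caller side: sets a parameter by name and reads it back, knowing only
// the base interface.
double set_and_read(ParamInterfaceSketch &component, const char *key, double value) {
    assert(key != nullptr);
    component.set(key, value);
    return component.getd(key);
}
```

Without the using-declaration, calling the string overload of `set` directly on a `FracComponent` object would not compile, although calls through a base-class reference would still work.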
ICleanupBinary Implementation (Argument Checking)
struct SampleCleanup : ICleanupBinary {
    ...
    void cleanup(bytearray &out,bytearray &in) {
        optional_check_background_is_lighter(in);
        CHECK_ARG(contains_only(in,0,255));
        ...
    }
};
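A rough sketch of the argument-checking pattern, for readers without the colib headers. `CHECK_ARG_SKETCH` and `contains_only_sketch` are hypothetical stand-ins written for this example. The behavior of the real `CHECK_ARG` and `contains_only` is not shown in the original, so throwing `std::invalid_argument` on failure is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for CHECK_ARG: throws if the condition is false and
// includes the failed expression in the message.
#define CHECK_ARG_SKETCH(cond) \
    do { \
        if (!(cond)) throw std::invalid_argument("argument check failed: " #cond); \
    } while (0)

// True if every pixel equals a or b.
bool contains_only_sketch(const std::vector<unsigned char> &img,
                          unsigned char a, unsigned char b) {
    for (std::size_t i = 0; i < img.size(); i++)
        if (img[i] != a && img[i] != b) return false;
    return true;
}

// Example entry point that validates its input before doing any work.
std::size_t count_black_pixels(const std::vector<unsigned char> &in) {
    // Caller error: reported with an exception that stays active in release builds.
    CHECK_ARG_SKETCH(contains_only_sketch(in, 0, 255));
    std::size_t count = 0;
    for (std::size_t i = 0; i < in.size(); i++)
        if (in[i] == 0) count++;
    // Internal invariant: plain assert, compiled out under NDEBUG.
    assert(count <= in.size());
    return count;
}
```

Checking at the top of the method means a grayscale image passed by mistake fails immediately with a named condition instead of producing a silently wrong result further down the pipeline.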