Course: OCRopus

OCR Interfaces

  • key processing steps in OCRopus are available through simple C++ interfaces
    • defined in colib/ocrinterfaces.h
    • all you need in order to implement these is colib (a few header files)
    • eventually, will permit dynamic loading of components
    • eventually, will be able to implement these in Lua

OCR Interfaces

These are the standard "pluggable" component interfaces for OCRopus; if you implement one of these, your code should be usable directly in the OCRopus top-level loop:
  • IComponent -- base class for all OCR interfaces
  • ICleanupGray -- improve quality of grayscale images in some way
  • ICleanupBinary -- improve quality of binary images in some way
  • ITextImageClassification -- classify regions on a page into text and non-text
  • IBinarize -- binarize grayscale images
  • ISegmentPage -- segment a page into columns, lines, headers, footers, etc.
  • ISegmentLine -- segment a line into character parts
  • IGenericFst -- generic language model
  • ICharacterClassifier -- isolated character classification
  • IRecognizeLine -- text line recognition

IComponent

    struct IComponent {
        virtual const char *description() = 0;

        virtual void set(const char *key,const char *value) ...
        virtual void set(const char *key,double value) ...

        virtual const char *gets(const char *key) ...
        virtual double getd(const char *key) ...

        virtual ~IComponent() {}
    };
                                
 
  • method to identify the component at runtime
  • methods to set parameters by name
  • methods to get parameters and statistics by name
  • virtual destructor
  • only description() must be implemented; the other methods are optional
  • these can be called from Lua
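
As a minimal sketch of run-time identification, the code below selects a component by its description string. The IComponent here is a stand-in so that the example compiles without colib; the slide elides the default method bodies, so making them throw is an assumption. FirstStage, SecondStage, and find_component are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>
#include <vector>

// Stand-in for IComponent. The slide elides the default method bodies;
// throwing from them is an assumption made for this sketch.
struct IComponent {
    virtual const char *description() = 0;
    virtual void set(const char *, const char *) { throw std::runtime_error("unknown parameter"); }
    virtual void set(const char *, double) { throw std::runtime_error("unknown parameter"); }
    virtual const char *gets(const char *) { throw std::runtime_error("unknown parameter"); }
    virtual double getd(const char *) { throw std::runtime_error("unknown parameter"); }
    virtual ~IComponent() {}
};

// Two hypothetical components that implement only description().
struct FirstStage : IComponent {
    const char *description() override { return "FirstStage"; }
};
struct SecondStage : IComponent {
    const char *description() override { return "SecondStage"; }
};

// Select a component at run time by its description string.
// Returns nullptr when no component matches.
IComponent *find_component(const std::vector<IComponent *> &components, const char *name) {
    for (IComponent *c : components)
        if (std::strcmp(c->description(), name) == 0) return c;
    return nullptr;
}
```

The caller never needs to know the concrete type: everything goes through the IComponent pointer.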

Reducing Coupling

  • reducing coupling is a key software engineering principle
    • improved reuse
    • easier testing
  • we want the OCR system to be pluggable
    • e.g., combine different text/image, layout, text recognition components
    • therefore: pass as little information as possible between modules
    • e.g., a text line recognizer should work on any readable input, regardless of resolution and similar properties
  • we want the OCR system to require little tuning
    • therefore: reduce the number of parameters for each module to the bare minimum
    • try to estimate parameters dynamically from input data
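
The two ideas above (pass only the image between modules, estimate parameters from the data) can be sketched as follows. This is an illustration under stated assumptions, not OCRopus code: Pixels is a stand-in for colib's bytearray, IBinarizeSketch and MidrangeBinarizer are hypothetical names, and the midpoint-of-range threshold is just one simple way to avoid a tunable parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> Pixels;  // stand-in for colib's bytearray

// Hypothetical interface: only pixel data crosses the module boundary.
struct IBinarizeSketch {
    virtual void binarize(Pixels &out, const Pixels &in) = 0;
    virtual ~IBinarizeSketch() {}
};

// Estimates its threshold from the input (midpoint of the darkest and
// brightest pixel) instead of exposing a parameter that needs tuning.
struct MidrangeBinarizer : IBinarizeSketch {
    void binarize(Pixels &out, const Pixels &in) override {
        Pixels result(in.size(), 0);
        if (!in.empty()) {
            unsigned char lo = 255, hi = 0;
            for (unsigned char p : in) {
                if (p < lo) lo = p;
                if (p > hi) hi = p;
            }
            int threshold = (lo + hi) / 2;
            for (std::size_t i = 0; i < in.size(); i++)
                result[i] = in[i] > threshold ? 255 : 0;
        }
        out.swap(result);
    }
};

// A caller that depends only on the interface, so any binarizer plugs in.
std::size_t count_white(IBinarizeSketch &binarizer, const Pixels &in) {
    Pixels out;
    binarizer.binarize(out, in);
    std::size_t n = 0;
    for (unsigned char p : out)
        if (p == 255) n++;
    return n;
}
```

Because the threshold follows the data, a dark scan and a bright scan of the same content give the same result without any retuning.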

ICleanupBinary


    struct ICleanupBinary : IComponent {
        virtual void cleanup(bytearray &out,bytearray &in) = 0;
    };

ICleanupBinary Implementation

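    struct SampleCleanup : ICleanupBinary {
        const char *description() {
            return "SampleCleanup";
        }
        void cleanup(bytearray &out,bytearray &in) {
            float frac = 1.5;
            // histogram of run lengths in the input image
            floatarray hist(100);
            intarray runpeaks;
            runlength_histogram(hist,in);
            dshow1d(hist); dwait();   // debugging display of the histogram
            // size threshold: first histogram peak scaled by frac, or 2 if no peak is found
            peaks(runpeaks,hist,2,10,3.0);
            int threshold = 2;
            if(runpeaks.length()>0) threshold = int(runpeaks[0] * frac);
            // label connected components and remove those below the size threshold
            intarray blobs;
            blobs.copy(in);
            label_components(blobs);
            remove_small_components(blobs,threshold,threshold);
            // convert the remaining labels back into a 0/255 image
            greater(blobs,0,0,255);
            out.copy(blobs);
        }
    };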
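
The threshold estimation step can be tried without colib using the simplified sketch below. It is not the colib implementation: run_histogram, dominant_run, and estimate_threshold are hypothetical stand-ins, only horizontal foreground runs are counted, and the most frequent run length is used in place of the peaks call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a binary image: rows of pixels, 0 = background, nonzero = foreground.
typedef std::vector<std::vector<unsigned char> > BinaryImage;

// Histogram of horizontal foreground run lengths.
// Runs of length >= nbins are counted in the last bin.
std::vector<int> run_histogram(const BinaryImage &image, std::size_t nbins) {
    assert(nbins >= 2);
    std::vector<int> hist(nbins, 0);
    for (const auto &row : image) {
        std::size_t run = 0;
        for (std::size_t x = 0; x <= row.size(); x++) {
            if (x < row.size() && row[x] != 0) { run++; continue; }
            if (run > 0) hist[run < nbins ? run : nbins - 1]++;  // a run just ended
            run = 0;
        }
    }
    return hist;
}

// Most frequent run length; 0 if the image contains no runs.
std::size_t dominant_run(const std::vector<int> &hist) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < hist.size(); i++)
        if (hist[i] > hist[best]) best = i;
    return best;
}

// Size threshold derived from the data; falls back to 2 when nothing is
// found, mirroring the default used on the slide.
int estimate_threshold(const BinaryImage &image, double frac) {
    std::size_t peak = dominant_run(run_histogram(image, 100));
    return peak > 0 ? int(peak * frac) : 2;
}
```

The point of the design is that the threshold scales with the typical stroke width of the page, so the same code can handle scans at different resolutions.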

ICleanupBinary Implementation (Parameters)

    #include <string.h>   // strcmp

    struct SampleCleanup : ICleanupBinary {
        float frac;   // scale factor applied to the estimated run length
        SampleCleanup() {
            frac = 1.5;
        }
        const char *description() {
            return "SampleCleanup";
        }
        // set a numeric parameter by name
        void set(const char *name,double value) {
            if(!strcmp(name,"frac")) frac = value;
            else throw "unknown parameter setting";
        }
        // read a numeric parameter by name
        double getd(const char *name) {
            if(!strcmp(name,"frac")) return frac;
            else throw "unknown parameter";
        }
        void cleanup(bytearray &out,bytearray &in) {
            ...
        }
    };
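
A sketch of how a caller might configure such a component through the base interface only. The IComponent below is a stand-in with assumed default bodies (the slides elide them), FracComponent and try_set are hypothetical names, and standard exceptions are used here instead of the string literals thrown on the slide.

```cpp
#include <cassert>
#include <cstring>
#include <exception>
#include <stdexcept>

// Stand-in for IComponent; the throwing defaults are an assumption.
struct IComponent {
    virtual const char *description() = 0;
    virtual void set(const char *, const char *) { throw std::invalid_argument("unknown parameter"); }
    virtual void set(const char *, double) { throw std::invalid_argument("unknown parameter"); }
    virtual const char *gets(const char *) { throw std::invalid_argument("unknown parameter"); }
    virtual double getd(const char *) { throw std::invalid_argument("unknown parameter"); }
    virtual ~IComponent() {}
};

// Hypothetical component with a single numeric parameter, "frac".
struct FracComponent : IComponent {
    double frac;
    FracComponent() : frac(1.5) {}
    using IComponent::set;  // keep the string overload visible on this type
    const char *description() override { return "FracComponent"; }
    void set(const char *key, double value) override {
        if (std::strcmp(key, "frac") == 0) frac = value;
        else throw std::invalid_argument("unknown parameter");
    }
    double getd(const char *key) override {
        if (std::strcmp(key, "frac") == 0) return frac;
        throw std::invalid_argument("unknown parameter");
    }
};

// Configure any component by name; returns false if the name is rejected.
bool try_set(IComponent &component, const char *key, double value) {
    try {
        component.set(key, value);
        return true;
    } catch (const std::exception &) {
        return false;
    }
}
```

One C++ detail worth noting: overriding a single overload of set in a derived class hides the other overloads for calls made on the derived type, which is why the sketch adds a using-declaration. Calls made through an IComponent reference are not affected.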

ICleanupBinary Implementation (Argument Checking)

    struct SampleCleanup : ICleanupBinary {
        ...
        void cleanup(bytearray &out,bytearray &in) {
            // validate the input before doing any work
            optional_check_background_is_lighter(in);
            CHECK_ARG(contains_only(in,0,255));
            ...
        }
    };
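
The checking pattern can be sketched without colib as shown below. The names contains_only_sketch, CHECK_ARG_SKETCH, cleanup_checked, and accepts are hypothetical, and the assumption that a failed check throws an exception is ours; the behavior of the real CHECK_ARG macro is not described on the slide.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

typedef std::vector<unsigned char> Pixels;  // stand-in for colib's bytearray

// True if every pixel equals a or b.
bool contains_only_sketch(const Pixels &image, unsigned char a, unsigned char b) {
    for (unsigned char p : image)
        if (p != a && p != b) return false;
    return true;
}

// Hypothetical check macro: throws and reports the text of the failed condition.
#define CHECK_ARG_SKETCH(cond) \
    do { \
        if (!(cond)) throw std::invalid_argument("argument check failed: " #cond); \
    } while (0)

// Cleanup routine that rejects non-binary input before doing any work.
// The actual processing is omitted; the input is simply copied.
void cleanup_checked(Pixels &out, const Pixels &in) {
    CHECK_ARG_SKETCH(contains_only_sketch(in, 0, 255));
    out = in;
}

// Helper for callers that prefer a boolean result over an exception.
bool accepts(const Pixels &in) {
    Pixels out;
    try {
        cleanup_checked(out, in);
    } catch (const std::invalid_argument &) {
        return false;
    }
    return true;
}
```

Checking at the interface boundary means a component that is plugged into the wrong place in the pipeline (for example, fed a grayscale image) fails immediately with a clear message instead of producing silently wrong output.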


