Home Page
www.ocropus.org
- documentation, source code
- project pages
- list of contributors
- list of resources
OCRopus License
OCRopus is distributed under the Apache 2 license
- very widely used license
- similar to BSD license
- far more liberal than GPL/LGPL
- use it for open source projects
- use it for closed source products
- if you distribute the source, you need to acknowledge its origin
- if you assert patents against other users, you can't use it anymore
- see the license for details
Target Applications
- document indexing and search
- on-line reading
- re-typesetting
- editing
- information extraction
State of OCR
- OCR is lots of different things...
  - physical layout analysis
  - logical layout analysis
  - text recognition
  - font recognition, style recognition
  - math recognition
  - table recognition
- all of these problems are unsolved for unrestricted input
- what works fairly well...
  - physical layout analysis for simple layouts
  - text recognition for alphabetic languages and clean inputs
- OCR is harder than speech recognition
  - expectation of 100% accuracy
  - human-designed medium, rather than natural medium
State of Commercial OCR
Commercial OCR Use Cases
- primary use cases
  - desktop scanning, some bill and mail processing
- layout analysis
  - rule-based, not trainable, prone to catastrophic failures
- character recognition
  - ad-hoc classifier combination, speed-oriented
- language modeling
  - dictionary lookup, backtracking
- adaptation
  - per-page “retraining”, dictionary augmentation
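The dictionary-lookup style of language modeling listed above can be illustrated with a small sketch. It assumes a plain word list and uses Python's standard `difflib` for fuzzy matching; the function name `correct_word` and the `cutoff` value are illustrative choices, not taken from any commercial system, and the backtracking step is not shown.

```python
import difflib


def correct_word(word, dictionary, cutoff=0.8):
    """Return the closest dictionary word, or the input unchanged.

    `dictionary` is a list of known words; `cutoff` (an assumed value)
    is the minimum similarity ratio required to accept a correction.
    """
    if word in dictionary:
        return word
    matches = difflib.get_close_matches(word, dictionary, n=1, cutoff=cutoff)
    # No sufficiently close entry: leave the OCR output as it is.
    return matches[0] if matches else word


words = ["recognition", "layout", "analysis"]
print(correct_word("recogniti0n", words))
print(correct_word("Xq7-B", words))
```

A token with no close dictionary entry, such as a part number, passes through unchanged, so this kind of lookup cannot repair errors in it.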
OCR Tradeoffs
- throughput and memory requirements
- average error rate
- frequency of catastrophic failures
- types of errors
  - layout analysis errors
  - OCR errors correctable by language models
  - OCR errors on names / identifiers
- cost
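Two of the tradeoffs above, average error rate and frequency of catastrophic failures, can be made concrete with a short sketch. It assumes the error rate is measured as Levenshtein edit distance divided by ground-truth length, and that a page counts as a catastrophic failure when its error rate exceeds a threshold; both the metric and the 0.5 threshold are assumptions for illustration, as the slides do not define them.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ocr_text, truth):
    """Edit distance between OCR output and ground truth, per truth character."""
    if not truth:
        raise ValueError("ground truth must be non-empty")
    return edit_distance(ocr_text, truth) / len(truth)


def catastrophic_failure_rate(pages, threshold=0.5):
    """Fraction of (ocr_text, truth) pages whose error rate exceeds threshold."""
    if not pages:
        raise ValueError("need at least one page")
    rates = [character_error_rate(ocr, truth) for ocr, truth in pages]
    return sum(rate > threshold for rate in rates) / len(rates)
```

Reporting both numbers matters because a low average can hide a few pages that came out as garbage.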
OCRopus Architecture
OCRopus Goals
- performance
  - significantly reduce average character error rate
  - greatly reduce undetected catastrophic failures
  - meet production throughput/memory requirements
- functionality
  - any script, any language
  - pluggable architecture
  - testable architecture
  - fully statistical foundation
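The pluggable and testable architecture goals above might look roughly like the following sketch. Every name here (`LayoutAnalyzer`, `LineRecognizer`, `Pipeline`, and the toy stages) is hypothetical and chosen for illustration; none of it is the actual OCRopus interface.

```python
from typing import List, Protocol

Image = List[List[int]]  # hypothetical: rows of pixel values, 0 = background


class LayoutAnalyzer(Protocol):
    def text_lines(self, page: Image) -> List[Image]: ...


class LineRecognizer(Protocol):
    def recognize(self, line: Image) -> str: ...


class Pipeline:
    """Chains interchangeable stages; any object with the right method fits."""

    def __init__(self, layout: LayoutAnalyzer, recognizer: LineRecognizer):
        self.layout = layout
        self.recognizer = recognizer

    def run(self, page: Image) -> List[str]:
        return [self.recognizer.recognize(line)
                for line in self.layout.text_lines(page)]


class BlankRowSplitter:
    """Toy layout stage: each run of non-blank rows becomes one text line."""

    def text_lines(self, page: Image) -> List[Image]:
        lines, current = [], []
        for row in page:
            if any(row):
                current.append(row)
            elif current:
                lines.append(current)
                current = []
        if current:
            lines.append(current)
        return lines


class InkCounter:
    """Stub recognizer for tests: reports the pixel sum instead of text."""

    def recognize(self, line: Image) -> str:
        return f"<{sum(map(sum, line))} ink pixels>"


page = [[0, 0], [1, 1], [0, 0], [1, 0]]
print(Pipeline(BlankRowSplitter(), InkCounter()).run(page))
```

Because each stage is defined only by its method signature, a stub such as `InkCounter` can stand in for a real recognizer, which lets the layout stage be tested in isolation.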