Zonal OCR Technology, Bounding Boxes, X/Y Coordinates

This is a big request. Hope it’s possible.

The workflow would be the same as ImageSelection, but with the flexibility to choose GetTextSelection or ClickSelection, for example.


Here is a summary of zonal OCR technology and bounding boxes:

Zonal OCR is a technique that allows extracting text from specific regions or zones of a document or image. It works by defining bounding boxes around the areas of interest. Some key points about zonal OCR:

  • Bounding boxes are created to mark the regions from which text needs to be extracted. This allows targeting only relevant sections instead of OCRing the entire document.[1][3][10]

  • The bounding box coordinates are typically specified as the (x, y) positions of the top-left and bottom-right corners, or equivalently as a top-left corner plus a width and height. Either way, this defines a rectangular zone.[4][10]

  • Once the zones are marked with bounding boxes, OCR is performed only within those regions to extract the text.[2][3]

  • Zonal OCR enables extracting specific data fields such as names, addresses, and totals from structured documents such as forms and invoices.[4][15]

  • It is more efficient than OCRing the full document as it ignores irrelevant elements outside the marked zones.[2][5]

  • Defining the right zones and bounding boxes is critical for accuracy. This can be done manually or using automated tools.[1][10]

  • Zonal OCR, when combined with AI/ML techniques, can significantly improve data extraction accuracy and efficiency compared to traditional full-page OCR.[6][8]
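
To make the coordinate point above concrete, here is a minimal Python sketch of a zone described by its top-left and bottom-right corners. The `Zone` class, its method names, and the sample numbers are all hypothetical, made up for illustration; it assumes pixel coordinates with the origin at the top-left of the page and y growing downward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Rectangular zone given by its top-left and bottom-right corners, in pixels."""
    left: int
    top: int
    right: int
    bottom: int

    def __post_init__(self):
        # Assumption: origin at the top-left of the page, y grows downward.
        if self.right <= self.left or self.bottom <= self.top:
            raise ValueError("zone must have positive width and height")

    @property
    def width(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.bottom - self.top

    def as_xywh(self) -> tuple:
        """Return (x, y, width, height), the other common box convention."""
        return (self.left, self.top, self.width, self.height)

    def clamp(self, page_width: int, page_height: int) -> "Zone":
        """Clip the zone to the page so a crop never reads outside the image.

        Raises ValueError if the zone lies entirely outside the page.
        """
        return Zone(
            max(0, self.left),
            max(0, self.top),
            min(page_width, self.right),
            min(page_height, self.bottom),
        )


# Hypothetical zone around an invoice total on a 1000 x 1400 pixel scan.
total_zone = Zone(left=700, top=1200, right=1050, bottom=1260)
clipped = total_zone.clamp(page_width=1000, page_height=1400)
print(clipped.as_xywh())  # (700, 1200, 300, 60)
```

Keeping the corner form as the stored representation and deriving width and height on demand means the same zone can be handed to tools that expect either convention.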

In summary, zonal OCR uses bounding boxes to define regions of interest within a document, and then applies OCR only within those zones to extract targeted text data. This technology is very useful for extracting specific structured information from complex documents.
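
The "crop each zone, then recognize only that crop" flow can be sketched roughly like this. It is a toy, not a real OCR pipeline: the page is a list of text rows standing in for an image, `recognize` just joins characters, and every name here (`crop`, `recognize`, `extract_fields`, the field names, the sample invoice) is hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)
Page = List[str]                 # toy stand-in for an image: one string per row


def crop(page: Page, box: Box) -> Page:
    """Keep only the rows and columns inside the box (right/bottom exclusive)."""
    left, top, right, bottom = box
    return [row[left:right] for row in page[top:bottom]]


def recognize(region: Page) -> str:
    """Toy 'OCR': join the non-empty cropped rows and trim whitespace."""
    return " ".join(row.strip() for row in region if row.strip())


def extract_fields(
    page: Page,
    zones: Dict[str, Box],
    crop_fn: Callable = crop,
    ocr_fn: Callable = recognize,
) -> Dict[str, str]:
    """Run the recognizer once per zone and return {field name: text}."""
    return {name: ocr_fn(crop_fn(page, box)) for name, box in zones.items()}


# Toy page; ljust pads the labels so the values start at known columns.
page = [
    "INVOICE".ljust(15) + "No: 4711",
    "Name: Ada Lovelace",
    "Total:".ljust(16) + "42.00",
]

# One named zone per field of interest; everything else is never read.
zones = {
    "invoice_no": (19, 0, 23, 1),
    "name": (6, 1, 18, 2),
    "total": (16, 2, 21, 3),
}

print(extract_fields(page, zones))
# {'invoice_no': '4711', 'name': 'Ada Lovelace', 'total': '42.00'}
```

With a real scan, `crop_fn` could wrap Pillow's `Image.crop((left, upper, right, lower))` and `ocr_fn` could wrap `pytesseract.image_to_string`. That pairing is an assumption about the toolchain and not something the summary above prescribes.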

Citations:
[1] How to extract relevant information from receipt - Stack Overflow
[2] Things You Must Know About The Zonal OCR Technology - Klearstack
[3] Machine Learning OCR - Intelligent Text Detection 2.0 - Klippa
[4] OCR a document, form, or invoice with Tesseract, OpenCV, and Python - PyImageSearch
[5] How to OCR with Tesseract in Python with Pytesseract and OpenCV?
[6] 7 incredibili strumenti AI OCR PDF - Wondershare PDFelement
[7] Keywords of Python packages on PyPI - PyDigger https://pydigger.com/keywords
[8] What is OCR (Optical Character Recognition)? - Parseur
[9] 11958297 files 8600432 settings 8347444 us 5796345 in 5557369 https://faculty.nps.edu/ncrowe/coursematerials/english_single_word_freqs.txt
[10] Using Zones in ML Functional Service OCR API - SAP Community
[11] 2010-mefi-freq-10plus.txt - MetaFilter Stuff https://stuff.metafilter.com/corpus/freq/misc/2010-mefi-freq-10plus.txt
[12] Telugu Handwritten Isolated Characters Recognition using Two Dimensional Fast Fourier Transform and Support Vector Machine - Semantic Scholar
[13] [PDF] A System for Offline Recognition of Handwritten Characters in Malayalam Script - Semantic Scholar
[14] Avis sur Docparser ? : r/dataengineering - Reddit https://www.reddit.com/r/dataengineering/comments/15dr3pv/docparser_reviews/?tl=fr
[15] How to extract structured data from invoices - LinkedIn
[16] High-speed Scanning : OCR, capture, sort and index documents https://www.youtube.com/watch?v=4n7bz2LgWiM
