Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost,...
![Page 1: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/1.jpg)
Optical Character Recognition for Logistics Reporting
Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam
A recording of the WebEx session can be found here: https://jsi.webex.com/jsi/lsr.php?AT=pb&SP=MC&rID=75382732&rKey=f3bc9ca3232b8b42
![Page 2: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/2.jpg)
Testing Methodology
Select Tools
Collect Forms
Perform & Document
![Page 3: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/3.jpg)
OCR Tools
• OmniPage Professional 18 (desktop-based, licensed)
• Abbyy FineReader 11 (desktop-based, licensed)
• Tesseract-OCR (desktop-based, open-source)
• Evernote (mobile phone–based, free)
• Captricity (web-based, paid)
![Page 4: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/4.jpg)
Testing Protocol
1. Pass a logistics management information system (LMIS) form that was filled out in the field through the application.
2. Carefully fill out a blank LMIS form and pass it through the application.
3. Record the number of correctly vs. incorrectly identified numeric fields.
4. Calculate character recognition accuracy rates.
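The slides do not spell out the formula behind step 4. A minimal sketch, assuming the accuracy rate is the share of numeric fields recognized correctly out of all numeric fields checked (the function name and the example counts are hypothetical, not taken from the study):

```python
def accuracy_rate(correct: int, incorrect: int) -> float:
    """Return the percentage of checked fields that were recognized correctly."""
    total = correct + incorrect
    if total == 0:
        raise ValueError("no fields were recorded")
    return 100.0 * correct / total


# Hypothetical tally for one form: 45 fields read correctly, 15 read incorrectly.
print(f"{accuracy_rate(45, 15):.0f}%")
```

Counting whole fields rather than individual characters matches step 3, where a field is either identified correctly or not.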
![Page 5: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/5.jpg)
Form 1: Tanzania Essential Medicines R&R
![Page 6: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/6.jpg)
Form 2: Tanzania Essential Medicines Supplementary Form
![Page 7: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/7.jpg)
Form 3: Zimbabwe ARV R&R
![Page 8: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/8.jpg)
OmniPage Professional 18
• Licensed tool: $499.99
• General impressions:
– Easy to use after initial orientation
– Fast processing (less than 1 minute)
– Can verify/validate recognized text
![Page 9: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/9.jpg)
Interface
![Page 10: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/10.jpg)
Wizara ya Afya na Ustawi wa Jam ii Integrated Logistics SystemFOMU 2C: FOMU TUPU YA TAAR I FA NA MAOMBI (R&R) TA DAWA/VIFA A VYA TIBA VYA ZIA DA N° 388151
MALI YA ZIADANamba ya MSD Mali Kip Wadi ya Kilichopokele we kipindi hikiUPotevu, Wadi ya Makadirlo ya. Kia51! ,Kiasi Kilichoagi Bei Gharama (GxH))(Iasi Gharama
' I'm cha UgaviKuanzla (B) MarekebishoMwisho Matumizi Kinacholutajika(G) (H) (i) Kilicho iltyo(A) (C) (D) [A+B+C.D] (E.3)x7-D] idhinishwaidhinishwa
1E) (F) (J) (K)10 l0 ‘1) CD(ACt tA -t.DC. 9CA., C) e) O a ii. s--G, a-A ;71C.AD. .
3 A-bat-kl..-710101 D 3,s- "`2„tou_CA IA -Q4Pc- Cx-k--i L...bC..) O 0 C c 2 kl- 6-6) Rr c:ND r)C93"1:'' V triffrivcpi 0 0 0 0 0 kr.C1 q2:: 9 oo 0 .go g . c 0 .y...,,, I, Lz.1 . .,...,..2)to io to ^../1.1 IN ryll p_S lel I CLANkl_r-3-& CD 0 0 C 24 S—LA 0.3 Zii IDOttVi-ID to L 0 ._62 C_./....L 4.krv. t. r- cCD 0 0 0 0 \ g6" z 20 t Deo '3)
L:7^ ^_c,r...3lootO \_)3I t VI. -f.) C c C c) i q s („) 3 4-,,Dot ;ha_ _r-tt-:%,t, , /}
e,-co„),:2_,_ks,_,,--4:to 01,, v., t-__Ty,k Lu.st (A-) C 0 C.) 0 1 g, `7 („ .____, k-i-I C'o •
4
_...,
Zahanati au kituo cha afya kutuma kwa DMO nakala ya juu na ya kali. Tunza nakala ya chini. Vihlaya kutuma MSD nakala ya juu. Tunza na nakala ya kati. Tupa nakala ya chiniJumlaGharama: Jumla Wyo. :'.... --`,:.-13 g2CADidhinishwa:
Hospitali kutuma MSO nakala ya 1:111 Tunza nakala yake
Output
![Page 11: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/11.jpg)
OmniPage Professional 18
Accuracy rates (numerical fields):
• Forms filled out in the field:
– TZ essential medicines: 13%
– TZ supplementary form: 21%
• Forms filled out by tester:
– TZ essential medicines: 53%
– TZ supplementary form: 76%
![Page 12: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/12.jpg)
Abbyy FineReader 11
• Licensed tool: $169.99
• General impressions:
– Easy to use after initial orientation; harder to learn to use than OmniPage
– Fast processing (1–3 mins)
– Can verify/validate recognized text
![Page 13: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/13.jpg)
Interface
![Page 14: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/14.jpg)
Wizara ya Afya na Ustawi wa JamiiIntegrated Logistics SystemFOMU 2C: FOMU TUPU YATAARIFA NA MAOMBI (R&R) YA DAWA/VlFAA VYA TlBA VYA ZlADAN° 088151Zahanan aj kituo cha afya kutuma kwa Df/O naka*a ya juu na ya kali Tunza nakala ya chini. Wtlaya kutima MSD nakala ya juu. Tunza na nakala ya kali Hospitali kutuma MSD nakala ya juu Tunza nakala yakeTupa nakaia ya chiniMALI YAZIADAMamba Mali Kipimoch Idadi ya Kilichopo Upotevu/ Idadi ya Makadiri Kiasi Kiasi Bei Gharama Kiasi Gharama
icvoro2Jo
CouQv\-OcV^/Vt^uuT 0 o 6 £ o £Tk s> SH/ioo •
IDIDID Cou-Ct v\ D ft D o o Q ^ 3 33>rtoo 3
laowr<?o
V iTfvTvwibB o o O O o toe. 22)?» 2> ^OGO 0.6
.ojmo
vj Cv oa\ (OPi - CDmu^Se^ D o o o o 2 if Sb. 3 Sfr.OOt ■6CAl-Av^v O o o o o 3> 2adOC . 3
UjtoiO PlCKb . c? a o o o 3><^ 3 2^001: 'jtDtolo ^fecbo^- r- o o o o \ Q. 3> ^ 3 ^&oo 3
•
JumlaGharama: 13^,20
Jumla iliyo-^dhinishwa: *•
Output
![Page 15: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/15.jpg)
Abbyy FineReader 11
Accuracy rates (numerical fields):
• Forms filled out in the field:
– TZ essential medicines: 10%
– TZ supplementary form: 10%
• Forms filled out by tester:
– TZ essential medicines: 39%
– TZ supplementary form: 43%
![Page 16: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/16.jpg)
Tesseract-OCR
• Open-source tool
• General impressions:
– Does not have a graphical user interface
– Is a command-line tool; must be run from the command line
– Difficult for users who are not familiar with the command line
– Requires input file in image format (e.g., .png, .jpg)
![Page 17: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/17.jpg)
Tesseract-OCR
• In the example below, we ran Tesseract with a scanned image file and an output file to hold the recognized text:
![Page 18: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/18.jpg)
Program install location
Program name
Scanned image
Output text file name
Interface
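The screenshot labels describe a call of the form program name, scanned image, output text file name. A minimal sketch of scripting that same call from Python, assuming the `tesseract` program is on the system PATH (the slide instead ran it from its install location); the file names and helper names are hypothetical:

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path: str, output_base: str) -> list[str]:
    """Return the argument list for: tesseract <scanned image> <output base name>."""
    return ["tesseract", image_path, output_base]


def run_tesseract(image_path: str, output_base: str) -> str:
    """Run Tesseract on one image and return the recognized text.

    Tesseract appends ".txt" to the output base name itself.
    """
    subprocess.run(build_tesseract_command(image_path, output_base), check=True)
    return Path(output_base + ".txt").read_text(encoding="utf-8")


# Show the command that would be run; this line does not need Tesseract installed.
print(" ".join(build_tesseract_command("scanned_form.png", "form_output")))
```

Calling `run_tesseract("scanned_form.png", "form_output")` would write `form_output.txt` next to the script, which may lower the barrier for users who are not comfortable typing the command by hand.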
![Page 19: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/19.jpg)
Source File
![Page 20: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/20.jpg)
Output
![Page 21: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/21.jpg)
Evernote
• Can send pictures of documents
• Not useful for character recognition or data entry
• Allows tagging on the image, e.g., district/facility
![Page 22: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/22.jpg)
Captricity
• Web-based, paid service
• Offers several tiers of pricing:
– “Pay as you go”: $0.01 per field
– Discounts as the number of fields increases
– “Premier” tier: $335/month for 50,000 fields ($0.0067 per field)
– “Enterprise” tier: custom pricing depending on volume, with a dedicated account manager, support, and volume discounts
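The per-field figure for the Premier tier follows from dividing the monthly fee by the number of fields. A small sketch of that arithmetic, using only the prices quoted on the slide; the break-even calculation is an illustration that assumes the flat $0.01 rate and ignores the pay-as-you-go volume discounts, whose size the slide does not state:

```python
from decimal import Decimal

PAY_AS_YOU_GO_PER_FIELD = Decimal("0.01")  # dollars per field
PREMIER_MONTHLY_FEE = Decimal("335")       # dollars per month
PREMIER_FIELDS_INCLUDED = 50_000           # fields per month


def premier_cost_per_field() -> Decimal:
    """Effective Premier price per field when all included fields are used."""
    return PREMIER_MONTHLY_FEE / PREMIER_FIELDS_INCLUDED


def pay_as_you_go_cost(fields: int) -> Decimal:
    """Cost of a given number of fields at the undiscounted pay-as-you-go rate."""
    return PAY_AS_YOU_GO_PER_FIELD * fields


def break_even_fields() -> int:
    """Monthly field count at which undiscounted pay-as-you-go equals the Premier fee."""
    return int(PREMIER_MONTHLY_FEE / PAY_AS_YOU_GO_PER_FIELD)


print(premier_cost_per_field())
print(pay_as_you_go_cost(PREMIER_FIELDS_INCLUDED))
print(break_even_fields())
```

Under these assumptions, a program processing fewer fields per month than the break-even count would pay less on the pay-as-you-go plan.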
![Page 23: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/23.jpg)
Captricity
Process:
1. User creates template for form
2. System creates digital fingerprint from template
3. System compares uploaded form to digital fingerprint
– Fixes skews, or flips form, if needed
4. Human validation is done field by field
– validators never see the entire form
– preserves privacy
5. Output is a .csv file
![Page 24: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/24.jpg)
Captricity
General impressions:
• Initially, time intensive
– must separate forms into single files, per page
– must set up templates for each page, e.g., a one-page form took 10 minutes to create
• Requires Internet connection
• Approximately 24-hour turnaround for first time
– turnaround time is reduced after first processing
![Page 25: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/25.jpg)
Interface
![Page 26: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/26.jpg)
Output
![Page 27: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/27.jpg)
Captricity:
Accuracy rates (numerical fields):
• Forms filled out in the field:
– TZ essential medicines: 65%
– TZ supplementary form: 99%
– Zim antiretrovirals: 52%
• Forms filled out by tester:
– TZ essential medicines: 98%
– TZ supplementary form: 100%
– Zim antiretrovirals: 98%
![Page 28: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/28.jpg)
Research conclusion: Captricity looks most promising
Digging deeper…
![Page 29: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/29.jpg)
Captricity Positives
• Shows best results
– Validation of output is critical
• Fast turnaround time
• Digitization is accurate
– data entry staff did not introduce new errors
• Cloud storage can store data indefinitely
• Output in .csv format (readable by a database)
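As an illustration of the last point, a .csv export can be loaded into a database with a few lines of code. This is a sketch only: the column names and sample row are hypothetical (the slides do not show the actual export layout), and SQLite stands in for whatever central database a program uses:

```python
import csv
import io
import sqlite3


def load_csv_into_db(csv_text: str, conn: sqlite3.Connection,
                     table: str = "lmis_report") -> int:
    """Create a table from the CSV header row, insert every row, return the row count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    if not columns:
        raise ValueError("CSV has no header row")
    # Column and table names are quoted; names containing a double quote are not handled.
    quoted = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({quoted})')
    rows = [tuple(row[c] for c in columns) for row in reader]
    conn.executemany(
        f'INSERT INTO "{table}" ({quoted}) VALUES ({placeholders})', rows
    )
    conn.commit()
    return len(rows)


# Hypothetical export with one data row.
SAMPLE = "facility,product,quantity_requested\nFacility A,Product X,120\n"
connection = sqlite3.connect(":memory:")
print(load_csv_into_db(SAMPLE, connection))
```

Values are stored as text here; a production import would likely validate and convert numeric fields before they reach the central database.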
![Page 30: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/30.jpg)
Captricity Negatives
• Requires Internet connection; must be used at higher levels of supply chain
• Setup is time-intensive; must:
– split up forms
– create templates
– rotate to landscape
• Validation/reconciliation can be time consuming
• Cost can be high, but discounts are available for high volume
– Cheaper than hiring data entry clerks?
![Page 31: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/31.jpg)
Use Cases for LMIS Reporting Using Captricity
![Page 32: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/32.jpg)
Use Case 1
SDP/CHW: Send paper report
District: Upload and verify
Central database
![Page 33: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/33.jpg)
Use Case 2
SDP/CHW: Take photo of form
District: Upload and verify
Central database
![Page 34: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/34.jpg)
Use Case 3
SDP/CHW: Send paper report
District: Aggregate reports
Central: Upload and verify
Central database
![Page 35: Optical Character Recognition for Logistics Reporting Contributors: Joy Kamunyori, Mike Frost, Ashraf Islam A recording of the WebEx session can be found.](https://reader036.fdocuments.in/reader036/viewer/2022081603/56649da85503460f94a943d9/html5/thumbnails/35.jpg)
Thank You! Questions?