December 2007 Document Recognition Technology Overview Presentation

Post on 20-Oct-2014

Transcript of December 2007 Document Recognition Technology Overview Presentation

Document Recognition: a technology overview

Presented by: Chris Riley of Artsyl Technologies, Inc.

But First: Your new AIIM Board!

Exciting new events
Golf
Networking
More Education Sessions

What we will cover:

Why Chris?

What Are the Document Recognition Technologies

Who Makes Them

Buyer Beware

The future

Q & A

Free Stuff!

Why Chris?

Who is Artsyl?

What qualifies Chris to talk to me?

When a developer turns to sales

Who knows what OCR is?

The Technologies

OCR – Optical Character Recognition
ICR – Intelligent Character Recognition
OMR – Optical Mark Recognition
Barcode
Handwriting
All the other ones made up for marketing purposes

CAR/LAR (Check21) – Courtesy and Legal Amount Recognition
Assisted Capture
Fixed Form Processing
Semi-Structured Forms Processing
Unstructured Document Processing

The Technologies: OCR

Ship To:

The Technologies: ICR

Ilya

The Technologies: OMR

Card Account
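As a rough illustration of how OMR can decide whether a box is marked, the sketch below measures the share of dark pixels inside a zone. It is a minimal sketch, not any vendor's method: the 0/1 bitmap, the zone coordinates, and the 0.4 threshold are all assumptions made for the example.

```python
from typing import List, Tuple

Bitmap = List[List[int]]          # 1 = dark pixel, 0 = light pixel
Zone = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def fill_ratio(page: Bitmap, zone: Zone) -> float:
    """Return the fraction of dark pixels inside the zone."""
    top, left, bottom, right = zone
    pixels = [page[r][c] for r in range(top, bottom) for c in range(left, right)]
    return sum(pixels) / len(pixels)


def is_marked(page: Bitmap, zone: Zone, threshold: float = 0.4) -> bool:
    """Treat the box as marked when enough of it is dark (threshold is assumed)."""
    return fill_ratio(page, zone) >= threshold


# Tiny hypothetical page: the left box is filled in, the right box is empty.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
left_box = (1, 1, 3, 3)
right_box = (1, 3, 3, 5)

print(is_marked(page, left_box), is_marked(page, right_box))
```

In practice the threshold would be tuned on real scans, since stray marks and form lines also add dark pixels.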

The Technologies: Barcode

1889094476620
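Barcodes usually carry a check digit, so a reader can reject a bad decode instead of guessing. The slide does not name a symbology; the sample digits above happen to satisfy the EAN-13 checksum, so this sketch assumes EAN-13. The function names are illustrative only.

```python
def ean13_check_digit(first12: str) -> int:
    """Compute the EAN-13 check digit from the first 12 digits."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_ean13(code: str) -> bool:
    """Return True when the 13th digit matches the computed check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    return ean13_check_digit(code[:12]) == int(code[12])


print(is_valid_ean13("1889094476620"))  # sample from the slide
print(is_valid_ean13("1889094476621"))  # last digit altered
```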

The Technologies: Handwriting

* Critical *

The Technologies: Acronym Heaven

The Technologies: CAR/LAR

2 hundred dollars & no cents
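CAR/LAR reads both the numeric (courtesy) amount and the written (legal) amount on a check, and the two can be cross-checked against each other. The sketch below is one simple way to do that cross-check, assuming the recognition step has already produced two text strings. The vocabulary is deliberately small and the function names are hypothetical.

```python
import re
from decimal import Decimal

# Limited vocabulary for the illustration; a real engine handles far more.
SMALL = {
    "no": 0, "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11,
    "twelve": 12, "thirteen": 13, "fourteen": 14, "fifteen": 15,
    "sixteen": 16, "seventeen": 17, "eighteen": 18, "nineteen": 19,
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50, "sixty": 60,
    "seventy": 70, "eighty": 80, "ninety": 90,
}


def words_to_int(text: str) -> int:
    """Convert a mix of digits and number words (below one million) to an int."""
    total, current = 0, 0
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        if token in ("and", "cent", "cents"):
            continue
        if token.isdigit():
            current += int(token)
        elif token in SMALL:
            current += SMALL[token]
        elif token == "hundred":
            current = max(current, 1) * 100
        elif token == "thousand":
            total += max(current, 1) * 1000
            current = 0
        else:
            raise ValueError(f"unrecognized word: {token!r}")
    return total + current


def parse_legal_amount(text: str) -> Decimal:
    """Split on the word 'dollar(s)': left side is dollars, right side is cents."""
    parts = re.split(r"\bdollars?\b", text.lower(), maxsplit=1)
    dollars = words_to_int(parts[0])
    cents = words_to_int(parts[1]) if len(parts) > 1 else 0
    return Decimal(dollars) + Decimal(cents) / Decimal(100)


def amounts_agree(courtesy: str, legal: str) -> bool:
    """Compare the numeric amount with the written amount."""
    numeric = Decimal(courtesy.replace("$", "").replace(",", "").strip())
    return numeric == parse_legal_amount(legal)


print(amounts_agree("$200.00", "2 hundred dollars & no cents"))
```

When the two amounts disagree, a capture workflow would typically send the item to an operator rather than pick one value automatically.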

The Technologies: Assisted Capture

The Technologies: Fixed Form Processing

Name: Ilya

Date: 12/21/2982

The Technologies: Fixed Form Processing

Name: Ilya
Date: 12/21/2982

80% of business end-user documents are semi-structured

The Technologies: Semi-Structured Forms

Invoice No: 99044

Date: 06/09/04

Invoice No: 24567

Date: 06/09/04

Invoice No: 99044
Date: 06/09/04

Invoice No: 24567
Date: 06/09/04 (06/09/2004)

The Technologies: Semi-Structured Forms

Consignee

Consignor

Date

Term

The Technologies: Common Processes

Full page conversion
Classification
Index level extraction

Redaction
Routing
Auto Filing
Re-Purposing
Image Rotation

The Technologies: Full page conversion

Image file to electronic data file
ALL text on the page
Includes:

Image Pre-processing
Document Analysis/Zoning
Extraction
Export (commonly PDF, DOC)

The Technologies: Classification

Software tells you the document type
Scan batches of mixed documents

Bill of Lading

Invoice

Check

PO
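To make the idea of classification concrete, here is a minimal keyword-scoring sketch that sorts recognized text into the document types named above. It is only an illustration: commercial classifiers generally also use layout and trained models, and the keyword lists here are assumptions, not taken from any product.

```python
from typing import Dict, Tuple

# Hypothetical keyword lists, one per document type on the slide.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "Bill of Lading": ("bill of lading", "consignee", "consignor"),
    "Invoice": ("invoice no", "invoice date", "total amt due"),
    "Check": ("pay to the order of",),
    "PO": ("purchase order", "po number"),
}


def classify(text: str) -> str:
    """Return the type whose keywords appear most often, or 'Unknown'."""
    lowered = text.lower()
    scores = {
        doc_type: sum(keyword in lowered for keyword in keywords)
        for doc_type, keywords in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Unknown"


batch = [
    "Invoice No: 99044\nDate: 06/09/04",
    "Consignee: ...\nConsignor: ...",
    "Lunch menu",
]
for page_text in batch:
    print(classify(page_text))
```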

The Technologies: Index Level Extraction

Just certain required fields extracted
Normalization of data
Export usually to a database

Invoice Number
Invoice Date
Total Amt Due
Term
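Index-level extraction and normalization can be sketched with two regular expressions and a date parser. This is a minimal sketch under stated assumptions: the label patterns are guesses modeled on the invoice samples earlier in the deck, dates are assumed to be month/day/year, and two-digit years follow Python's default pivot.

```python
import re
from datetime import datetime
from typing import Dict, Optional

# Hypothetical label patterns; real invoices vary far more than this.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s+No\.?:?\s*(\d+)", re.IGNORECASE),
    "invoice_date": re.compile(r"Date:?\s*(\d{1,2}/\d{1,2}/\d{2,4})", re.IGNORECASE),
}


def normalize_date(raw: str) -> str:
    """Normalize MM/DD/YY or MM/DD/YYYY to MM/DD/YYYY (month-first is assumed)."""
    for fmt in ("%m/%d/%Y", "%m/%d/%y"):
        try:
            return datetime.strptime(raw, fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")


def extract_index_fields(text: str) -> Dict[str, Optional[str]]:
    """Pull only the required fields; a missing field comes back as None."""
    fields: Dict[str, Optional[str]] = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    if fields["invoice_date"] is not None:
        fields["invoice_date"] = normalize_date(fields["invoice_date"])
    return fields


print(extract_index_fields("Invoice No: 24567\nDate: 06/09/04"))
```

The resulting dictionary is the kind of record that would then be exported to a database.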

The Technologies: How Accurate

A better question is: how do you determine accuracy?

Document Type Accuracy
Field/Zone Location Accuracy
Data Type Accuracy
Character Accuracy
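The same output can score very differently depending on which level you measure. The sketch below contrasts two of the levels, character accuracy and whole-field accuracy, using edit distance against a hand-keyed ground truth. The formulas are one common convention, not a standard the presentation prescribes, and the function names are illustrative.

```python
from typing import Dict


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def character_accuracy(truth: str, recognized: str) -> float:
    """1 minus the error rate per ground-truth character, floored at 0."""
    if not truth:
        return 1.0 if not recognized else 0.0
    return max(0.0, 1 - edit_distance(truth, recognized) / len(truth))


def field_accuracy(truth: Dict[str, str], recognized: Dict[str, str]) -> float:
    """Share of fields that match the ground truth exactly."""
    correct = sum(recognized.get(name) == value for name, value in truth.items())
    return correct / len(truth)


truth = {"invoice_number": "99044", "invoice_date": "06/09/04"}
recognized = {"invoice_number": "99O44", "invoice_date": "06/09/04"}  # letter O for zero

print(character_accuracy(truth["invoice_number"], recognized["invoice_number"]))
print(field_accuracy(truth, recognized))
```

One wrong character out of five still reads as 80% character accuracy, yet it makes the whole invoice number unusable, which is why the level of measurement matters.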

The Technologies: Common usage scenarios

Document Conversion

Document Archival / Retrieval

Invoice Processing

Insurance Processing (medical, mortgage)

Waybill processing

Survey processing

There really are only 3 core technology providers

It takes 50 man-years to develop OCR using current computing abilities

Who Makes Them: Core Engines

ABBYY
Nuance (formerly ScanSoft)
ReadI.R.I.S

Océ
CharacTell
ParaScript
A2iA

Handful of Open Source
Handful of Other Vendors
Two handfuls of OLD engines

Who Makes Them: Who Licenses Them

EVERYONE ELSE!

AnaComp, Anydoc, BancTec, BrainWare, Captaris, Captivation, Cardiff, CVision, DataCap, DigiTech, eCopy, EMC Documentum, Kofax, LaserFiche, LeadTools, Microsoft, NSi AutoStore, OnBase, Perceptive Imaging, ReadSoft, SER, Top Image Systems, Tower, Westbrook, Xerox

Hundreds More

30% of organizations that purchase, purchase the wrong thing

Over 50% of organizations that purchase never use it properly

Buyer Beware

If OCR is the reason for buying a solution, know which engine it uses!

Talk about the WHOLE solution not the pieces

Get past marketing gimmicks

Trust, Love, Be Certain of your reseller / vendor

Buyer Beware: Know your engine

What version?
Will they upgrade?

Buyer Beware: Talk about Whole Solution

Scanner / Input
Capture
Storage

Have a requirements list beforehand

Buyer Beware: Get past Gimmicks

NOTHING is 100%!

All canned demos work perfectly

Always see a test on your own documents

Version numbers are really arbitrary

Buyer Beware: Trust your vendor / reseller

Support after sale ( test them )

Where to get professional services

Do they understand the solution and not just the pieces?

The Future

Full-page OCR will be a commodity

Advanced document processing will become mainstream but less required

Think about what to do now that you will be gathering data rapidly

There will be a new approach to OCR

Questions and Answers

Before you ask

Free Stuff

Copy of ABBYY FineReader Pro 9.0
Copy of Nuance OmniPage 16
Copy of ReadI.R.I.S Pro 11

4 Hour Consulting Session with ME!