Graphics Recognition – from Re-engineering to Retrieval

Karl Tombre, Bart Lamiroy

LORIA, France

Post on 22-Dec-2015


Document Analysis in the IR era

Information is at the core of industrial strategies

A lot of digital or digitized information, but often in very “poor” formats

The challenge: not necessarily re-engineering documents, but enriching poorly structured information, adding a (limited) amount of semantics, and building indexes

Purposes: browsing, navigation, indexing

DAR methods and tools are useful, but must be adapted

Specific challenges of large-scale IR applications

Genericity: we cannot necessarily build a complete and exhaustive a priori model of contextual knowledge (ontology)

Adaptability: various input data – scanned paper, PDF, DXF, HTML, GIF… – various resolutions

Robustness: “back-office” applications

Efficiency: online searching in heterogeneous data

Scaling: methods have to scale to increasing number of symbols/features

DAR and IR

Media without (or with very little) contextual knowledge

Image-based indexing and retrieval, indexing of video sequences

Documents do explicitly convey information from one person to another person

Much more structure, syntax and semantics

DAR and IR – some examples

Indexing and/or searching scanned text without OCR

Similarities, signatures

Query or index on layout structure

Table spotting

Keyword spotting

…

What about Graphics Recognition?

Subfield of DAR, for graphics-rich documents

Numerous methods for various analysis and recognition problems

Raster-to-vector conversion

Text/graphics separation

Symbol recognition

Many specific technical areas: maps, architectural drawings, engineering drawings, diagrams and schematics, …

Graphics recognition methods

Text/graphics separation

Vectorization

Graphics recognition methods

Graphics recognition and IR applications

Usual text-based indexing and retrieval still useful

But need for access to other kinds of information:

Symbols

Text-drawing connections

Description-illustration connections

Some contributions

Syeda-Mahmood – maintenance drawings

IEEE Trans. on PAMI 21(8):737-751, Aug. 1999

Arias et al., Najman et al. – use of information contained in legend / title block

Proc. GREC’01, Kingston (Ontario, Canada), pp. 19-26, Sept. 2001

Samet & Soffer – symbols from legend

IEEE Trans. on PAMI 18(8):783-798, Aug. 1996

Müller & Rigoll – graphical retrieval in database of engineering drawings

Proc. ICDAR’99, Bangalore (India), pp. 697-700, Sept. 1999

Boose et al. (Boeing) – Generation of Layered Illustrated Parts Drawings

Proc. GREC’03, Barcelona, pp. 139-144

Wishful thinking?

Symbol DB

Or even better…

Symbol recognition

Natural features for indexing and retrieval

Most methods work with known databases of reference symbols – what about interactive querying of arbitrary symbols?

From segmentation followed by recognition, to segmentation-free recognition, or segmenting while recognizing

Scalability

Efficiency / complexity

Discrimination power

Signatures

Before we move on: the 1st contest on symbol recognition was held last week. See the IAPR TC10 homepage for further details.

Image-based signatures

Compute invariant signatures on binary document image

F-signatures (ICDAR’01)

Radon transform: R-signatures [Tabbone & Wendling]

Ridgelets [Ramos Terrades & Valveny – GREC’03] – aka wavelet transform of Radon transform
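To make the idea of a Radon-based signature concrete, here is a minimal sketch. It assumes the usual description of an R-signature as the integral, over the offset, of the squared Radon transform at each angle; the slide does not give the formula, so treat that definition, the function names and the rounding-based binning as assumptions rather than the cited authors' implementation.

```python
import math
from collections import Counter

def r_signature(pixels, angles_deg):
    """One value per angle: sum of squared projection counts (a discrete
    stand-in for the integral of the squared Radon transform)."""
    n = len(pixels)
    # Centre on the centroid so that a translated shape falls in the same bins.
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    signature = []
    for angle in angles_deg:
        theta = math.radians(angle)
        c, s = math.cos(theta), math.sin(theta)
        # Discrete Radon projection: count foreground pixels per offset rho.
        bins = Counter(round((x - cx) * c + (y - cy) * s) for x, y in pixels)
        signature.append(sum(count * count for count in bins.values()))
    return signature

def normalised(signature):
    """Divide by the total so that signatures of different sizes are comparable."""
    total = sum(signature)
    return [value / total for value in signature]

# Foreground pixels of a short horizontal stroke (hypothetical test shape).
stroke = [(x, 0) for x in range(11)]
angles = [0, 45, 90, 135]
print(r_signature(stroke, angles))
print(normalised(r_signature(stroke, angles)))
```

In this sketch translation is handled by centring, scale is only roughly handled by normalisation, and a rotation of the shape shows up as a circular shift of the signature, which a matching step would have to account for.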

R-signatures: detection of arrowheads [Girardeau & Tabbone]

DEA degree thesis, INPL, Nancy, Jul. 2002

R-signatures: another example [Girardeau & Tabbone]

Ridgelets [Ramos Terrades & Valveny – GREC’03]

Proc. GREC’03, Barcelona, pp. 202-211

Vector-based signatures

[Dosch & Lladós – GREC’03]

Based on set of basic graphical features:

Parallelism

Overlap

Collinearity

T- and V-junctions

Quality factor associated with the various relations

Match signatures of reference symbols with signatures of buckets
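The relation-counting idea above can be sketched as follows. This is an illustration only: it covers parallelism, collinearity and V-junctions, leaves out overlap, T-junctions and the quality factors, and the tolerances and function names are assumptions, not taken from the cited paper.

```python
import math
from collections import Counter
from itertools import combinations

ANGLE_TOL = math.radians(5)  # assumed tolerance for "same direction"
DIST_TOL = 0.05              # assumed tolerance for "touching" / "on the same line"

def direction(seg):
    """Orientation of a segment in [0, pi), ignoring which end comes first."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi

def angle_between(a, b):
    d = abs(direction(a) - direction(b))
    return min(d, math.pi - d)

def distance_to_line(point, seg):
    """Distance from a point to the infinite line carrying the segment."""
    (x1, y1), (x2, y2) = seg
    px, py = point
    cross = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
    return abs(cross) / math.hypot(x2 - x1, y2 - y1)

def share_endpoint(a, b):
    return any(math.hypot(p[0] - q[0], p[1] - q[1]) <= DIST_TOL
               for p in a for q in b)

def relation(a, b):
    """Name the relation between two segments, or None if there is none."""
    if angle_between(a, b) <= ANGLE_TOL:
        on_line = all(distance_to_line(p, a) <= DIST_TOL for p in b)
        return "collinear" if on_line else "parallel"
    if share_endpoint(a, b):
        return "v_junction"
    return None

def vector_signature(segments):
    """Signature = how often each relation occurs among all segment pairs."""
    found = (relation(a, b) for a, b in combinations(segments, 2))
    return Counter(r for r in found if r is not None)

square = [((0, 0), (1, 0)), ((1, 0), (1, 1)), ((1, 1), (0, 1)), ((0, 1), (0, 0))]
print(vector_signature(square))
```

Comparing the counter of a reference symbol with the counter computed on a region of the drawing then gives a cheap first filter before any finer matching.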

Vector-based signatures

Proc. GREC’03, Barcelona, pp. 159-169

Towards symbol spotting

Pre-compute – or compute on the spot – a set of basic signatures

Can be sufficient for symbol spotting and retrieval

Followed by classical symbol recognition if more discrimination is needed

Symbol spotting [Jabari & Tabbone]: graph matching through probabilistic relaxation, with nodes = segments and edges = relations

DEA degree thesis, INPL, Nancy, Jul. 2003
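A toy version of graph matching by probabilistic relaxation may help fix ideas. The slide does not say which update rule the thesis uses; the sketch below assumes one common form, in which the support for assigning model segment i to scene segment a is a product, over the model neighbours of i, of the summed probabilities of compatible scene neighbours. The relation names, the graphs and the function names are all hypothetical.

```python
def lookup(relations, u, v):
    """Relations are symmetric: stored once, read in either order."""
    return relations.get((u, v)) or relations.get((v, u))

def relax(model_rel, n_model, scene_rel, n_scene, iterations=10):
    """Return p[i][a], the probability that model node i maps to scene node a."""
    p = [[1.0 / n_scene] * n_scene for _ in range(n_model)]
    for _ in range(iterations):
        updated = []
        for i in range(n_model):
            row = []
            for a in range(n_scene):
                support = 1.0
                for j in range(n_model):
                    wanted = lookup(model_rel, i, j) if j != i else None
                    if wanted is None:
                        continue
                    # Evidence that some scene node b plays the role of j.
                    support *= sum(p[j][b] for b in range(n_scene)
                                   if b != a and lookup(scene_rel, a, b) == wanted)
                row.append(p[i][a] * support)
            total = sum(row)
            # Keep the previous row if nothing in the scene supports node i.
            updated.append([v / total for v in row] if total > 0 else p[i])
        p = updated
    return p

def best_match(p):
    return {i: max(range(len(row)), key=row.__getitem__) for i, row in enumerate(p)}

# Model symbol: three segments; scene: the same structure plus one distractor.
model = {(0, 1): "v_junction", (1, 2): "parallel"}
scene = {(0, 1): "v_junction", (1, 2): "parallel", (2, 3): "t_junction"}
print(best_match(relax(model, 3, scene, 4)))
```

Nothing in this sketch forces the mapping to be one-to-one, and ambiguous scenes can leave several candidates with equal probability; a practical system would add those constraints and a stopping criterion.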

Symbol spotting [Jabari & Tabbone] : another example

Combining Text and Graphics

Extracting Text/Graphics relationships within document

Using Text matching for inter-document relationships

Transitive inter-document Graphics matching

No need for complex graphics matching

Restricted to well known document types

Example: continuation of Wiring Diagrams (Boeing) [Baum et al. – GREC’03]

Proc. GREC’03, Barcelona, pp. 132-138

Scan2XML Example

Proc. GREC’01, Kingston (Ontario, Canada), pp. 312-325

Indexing and Semantics

Signature + metric

Semantics = measured distance to signature

Applies only to homogeneous contexts

Pre-segmented images

Pre-determined image classes

Implicit application of domain knowledge

...

Semantics = Syntax

Example

Signature type A, metric M

Semantics1 = (1, 1)

Semantics2 = (, 2)

Signature value M(M(

semantics = measurement to reference value
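Read literally, "semantics = measurement to reference value" amounts to labelling a signature by its distance to stored references. The sketch below illustrates that reading; the Euclidean metric, the pairing of each reference with an acceptance radius, and the labels and numbers are assumptions made for the example, not values from the slides.

```python
import math

def metric_m(u, v):
    """Assumed metric M: Euclidean distance between signature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# label -> (reference signature, acceptance radius); made-up values.
REFERENCES = {
    "resistor": ((0.9, 0.1), 0.2),
    "capacitor": ((0.2, 0.8), 0.2),
}

def interpret(signature, references=REFERENCES, metric=metric_m):
    """Return the closest label whose radius contains the signature, else None."""
    scored = []
    for label, (reference, radius) in references.items():
        distance = metric(signature, reference)
        if distance <= radius:
            scored.append((distance, label))
    return min(scored)[1] if scored else None

print(interpret((0.85, 0.15)))
print(interpret((0.5, 0.5)))
```

This only works while every signature in the collection is of the same type and is compared with the same metric, which is the homogeneous-context restriction stated above.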

Heterogeneous Document Bases

Semantics do not have a unique syntax anymore

Syntax metrics may be context sensitive

Semantics = Syntax + Context

Context needs to be considered
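One way to picture "Semantics = Syntax + Context" is a registry in which each context carries its own metric and its own reference values, so the same measured value can mean different things. Everything in this sketch (context names, metrics, references, tolerances) is hypothetical.

```python
def absolute_difference(u, v):
    """Hypothetical metric for the first context."""
    return abs(u - v)

def relative_difference(u, v):
    """Hypothetical metric for the second context (assumes non-zero values)."""
    return abs(u - v) / max(abs(u), abs(v))

# context -> (metric, {label: (reference value, tolerance)}); made-up values.
CONTEXTS = {
    "context_1": (absolute_difference,
                  {"semantics_1": (10.0, 1.0), "semantics_2": (20.0, 1.0)}),
    "context_2": (relative_difference,
                  {"semantics_3": (10.0, 0.5)}),
}

def interpret(value, context):
    """List the labels that accept this value in the given context."""
    metric, references = CONTEXTS[context]
    return sorted(label for label, (reference, tolerance) in references.items()
                  if metric(value, reference) <= tolerance)

print(interpret(14.0, "context_1"))
print(interpret(14.0, "context_2"))
```

The value 14.0 matches nothing under the first context's metric but is accepted in the second, which is why the context has to travel with the signature.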

Two different contexts from the automobile industry

Example

Context 1: signature type A, metric M

(1, 1) = Semantics1 = (1, 1)

(, 2) = Semantics2 = (, 2)

Context 2: signature type B, metric N

Signature value What if

M( and N(

A step towards taking context into account (while consolidating existing approaches)

Component Algebra: Image Analysis = Pipeline

Syntax + algorithm = semantics

[Figure: pipeline diagram with boxes Data (syntax), Algorithm, Data (semantics), Algorithm, Data (semantics)]

Syntax and semantics need not be distinguished

Component Algebra

Components: known and implemented document analysis algorithms, taking input data from one domain and producing data in another domain.

Application Context: set of all available Components.

Semantics: data sets needed by or produced by Components.

Component Algebra is a Graph

[Figure: graph of Component nodes connected through Data nodes]

Advantages

Each node is a semantic concept, semantic relationships are explicitly expressed.

Structure may support automatic reasoning and knowledge inference.

Context is embedded in components, different contexts give different paths in the graph.

Highly scalable and open architecture.

Bridge between signal-level document analysis and high-level document representation.
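Since the slides state that the formalism has no realization yet, the following is only a sketch of how "different contexts give different paths in the graph" could look. Components are modelled as edges between data domains and a breadth-first search finds a chain of components from one domain to another; the component and domain names are invented for the example.

```python
from collections import deque

# (component name, input data domain, output data domain); hypothetical names.
CONTEXT_A = [
    ("binarize", "grey image", "binary image"),
    ("vectorize", "binary image", "vectors"),
    ("match_symbols", "vectors", "symbols"),
]
CONTEXT_B = [
    ("binarize", "grey image", "binary image"),
    ("image_signature", "binary image", "symbols"),
]

def find_pipeline(components, source, target):
    """Shortest chain of component names turning `source` data into `target` data."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        domain, path = queue.popleft()
        if domain == target:
            return path
        for name, needs, produces in components:
            if needs == domain and produces not in seen:
                seen.add(produces)
                queue.append((produces, path + [name]))
    return None  # this context cannot produce the requested data

print(find_pipeline(CONTEXT_A, "grey image", "symbols"))
print(find_pipeline(CONTEXT_B, "grey image", "symbols"))
```

The same request, symbols from a grey image, resolves to a vector-based pipeline in one context and to an image-signature pipeline in the other. Parametrization and component interoperability, raised as open questions on the next slide, are not addressed here.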

However ...

The formalism exists, the realization doesn't (yet)

What about parametrization?

How context-independent can you get?

What about “guessing” context appropriateness?

How to design fully interoperable components?

Conclusion

A lot of DA methods – and more specifically GR methods – can be of direct use in IR, indexing and browsing applications

Specific challenges

Scaling and efficiency

Heterogeneous sets of documents

Incomplete domain knowledge

Symbol spotting

On-the-fly symbol searching

Sketch of open framework for including document semantics when context can be heterogeneous