Structured Text Retrieval Models

Structured Text Retrieval

Text retrieval retrieves documents based on index terms.
Observation: documents have implicit structure.
Regular text retrieval and indexing strategies lose the information available within that structure.
Retrieval based on structure is therefore desired, e.g. all documents having “George Bush” in the caption of a photo.

Models for Structured Text Retrieval

PAT Expressions
Overlapped Lists
Proximal Nodes
List of References
Tree-based Query Languages (SFQL, CCL)

Proximal Nodes

By Gonzalo Navarro and Ricardo Baeza-Yates.
Based on the hierarchical structure of documents.
Structure computation is static, and all structural elements are defined as “nodes”.
The model attempts to define operators on these nodes based on their definition and content.
Only nodes at a particular level of the hierarchy are returned as results.

Proximal Nodes: example hierarchy

Document
  Chapter, Chapter
    Section, Section, Section

Proximal Nodes

Nodes are structural in nature, e.g. Chapter, Section, etc.
Each node has a defined segment (a contiguous part of the text).
Operators are defined with respect to this model: structure operators and text operators.

Proximal Nodes

Structure operators:
Name
Inclusion
Positional Inclusion
Distance operators
Child/Parent operators
Set Manipulation operators

Text operators:
Match
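
The node-and-segment idea can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the `Node` class, the helper names (`by_name`, `included_in`, `match`, `containing`) and the sample text are all hypothetical, and only a name operator, an inclusion operator and a plain substring match are shown.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass(eq=False)
class Node:
    """Hypothetical structural node owning the text segment [start, end)."""
    name: str                                   # structural type, e.g. "Section"
    start: int
    end: int
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it."""
        child.parent = self
        self.children.append(child)
        return child


def descendants(node: Node) -> Iterator[Node]:
    """Yield all nodes below `node` in document order."""
    for child in node.children:
        yield child
        yield from descendants(child)


def by_name(root: Node, name: str) -> List[Node]:
    """Name operator: every node of the given structural type."""
    return [n for n in [root, *descendants(root)] if n.name == name]


def included_in(nodes: List[Node], containers: List[Node]) -> List[Node]:
    """Inclusion operator: nodes whose segment lies inside some container."""
    return [n for n in nodes
            if any(c is not n and c.start <= n.start and n.end <= c.end
                   for c in containers)]


def match(text: str, pattern: str) -> List[Tuple[int, int]]:
    """Text operator: segments of the text where `pattern` occurs."""
    hits, i = [], text.find(pattern)
    while i != -1:
        hits.append((i, i + len(pattern)))
        i = text.find(pattern, i + 1)
    return hits


def containing(nodes: List[Node], hits: List[Tuple[int, int]]) -> List[Node]:
    """Nodes whose segment contains at least one text hit."""
    return [n for n in nodes
            if any(n.start <= s and e <= n.end for s, e in hits)]


# Sample document: two chapters, three sections (all content is made up).
text = "Indexing basics. " + "Structure matters. " + "Query languages."
doc = Node("Document", 0, len(text))
ch1 = doc.add(Node("Chapter", 0, 36))
ch2 = doc.add(Node("Chapter", 36, len(text)))
ch1.add(Node("Section", 0, 17))
ch1.add(Node("Section", 17, 36))
ch2.add(Node("Section", 36, len(text)))

# "Sections whose text contains the word Structure"
result = containing(by_name(doc, "Section"), match(text, "Structure"))
```

Here the query first narrows the candidates to one level of the hierarchy (Section nodes) and then filters them by a text match, which mirrors the slide's point that only nodes at a particular hierarchy level are returned.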

Retrieval on Evidence

By Mounia Lalmas.
Based on documents made up of objects.
Objects are modeled as independent entities and can be in different media, languages or locations.
Document indexing carries a degree of uncertainty about whether the index term actually represents the object.
This uncertainty must be captured to get better results.
The model uses the Dempster-Shafer theory of evidence.

Retrieval on Evidence

The model takes into consideration the disparity between indexing vocabularies.
It aggregates the indexing vocabulary and also the uncertainty.
For objects o ∈ O and types t ∈ T, the function type is defined as type: O → ℘(T), i.e. each object is mapped to a set of types.
Aggregation is defined over objects, and the types of a composite object contain all the types of the objects it contains.
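
The type aggregation described above can be sketched as a set union over contained objects. This is a minimal illustration under assumptions: the function name `aggregate_types`, the dictionary-based representation and the sample objects ("figure", "photo", "caption") are hypothetical and do not come from the paper.

```python
from typing import Dict, FrozenSet, List, Set


def aggregate_types(obj: str,
                    own_types: Dict[str, Set[str]],
                    components: Dict[str, List[str]]) -> FrozenSet[str]:
    """Return the set of types of `obj`: its own types plus, recursively,
    all the types of the objects it contains."""
    types = set(own_types.get(obj, ()))
    for part in components.get(obj, []):
        types |= aggregate_types(part, own_types, components)
    return frozenset(types)


# Hypothetical data: a figure is composed of a photo and a caption.
own_types = {"photo": {"image"}, "caption": {"text"}}
components = {"figure": ["photo", "caption"]}

figure_types = aggregate_types("figure", own_types, components)
```

With this sample data the composite object "figure" has no types of its own and inherits both "image" and "text" from its parts.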

Retrieval on Evidence

The indexing vocabulary is defined over a proposition space, e.g. Wine (english, text), Blue (colour, feature).
The sentence space defines that index terms in the same proposition space can be used together.
Semantics between indexing vocabularies are maintained using the notion of worlds.

Retrieval on Evidence

Each type t has S_t, W_t, v_t and π_t.
S_t is the sentence space for the type.
W_t is the set of possible worlds associated with S_t.
v_t maps W_t × P_t to {true, false}.
π_t maps W_t × S_t to {true, false}.
Logical and equivalence relations between sentences are built around the notion of their semantics being equivalent in all or most worlds.

Retrieval on Evidence

However, the uncertainty of the representation remains.
It is represented by a weighting function based on the Dempster-Shafer model.
These objects and their syntactic and semantic models are aggregated for the objects which contain them.
E.g. a section containing sentences indexed by terms a, b, c, d, ... will be equivalent to sentences over the worlds also implying a, b, c, d, ...
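
To make the weighting function more tangible, the sketch below shows Dempster's rule of combination, the standard way two bodies of evidence are merged in Dempster-Shafer theory. It is an assumption that the model's aggregation behaves like this; the slides only state that the weighting is based on Dempster-Shafer. The function name `combine`, the frame {"wine", "beer"} and all mass values are made up for illustration.

```python
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule: multiply masses of intersecting focal sets,
    discard the conflicting mass, and renormalise the rest."""
    combined: Mass = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            conflict += x * y          # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("the two sources are in total conflict")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


# Hypothetical evidence from two indexing sources about one object.
frame = frozenset({"wine", "beer"})
wine = frozenset({"wine"})
beer = frozenset({"beer"})

m1 = {wine: 0.6, frame: 0.4}               # source 1: mostly "wine"
m2 = {wine: 0.5, beer: 0.3, frame: 0.2}    # source 2: less certain

m = combine(m1, m2)
```

Mass left on the whole frame expresses ignorance rather than support for any single term, which is how uncommitted belief is kept separate from belief in a specific index term.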

Comparisons

Proximal Nodes is based on structured documents. The paper presents the matter clearly, provides approaches towards building a software architecture, and reports findings from the experiments conducted.

The Evidence paper tries to model heterogeneous documents made up of different media, languages, etc. Overall the model is complex, and no results are given on its implementation or performance.
