
Diploma thesis presentation, Marc Ehrig, FZI, 22.01.2002: Ontology-Focused Crawling of Documents and Relational Metadata


System architecture

(Architecture diagram; only its labels survive.)
Labelled areas: User Interaction; Ontology and Metadata Management; Preprocessing; Crawling; Computation
Components: Preprocessor and Separator; Free-Text Lookup; Anchor-Text Lookup; RDF metadata validation and extraction; Relevancy Measures; Web Crawling Process; Instantiated Ontology & Metadata Structure; Result Presentation and Ontology Evolvement
Data flows: document list; URL list; retrieved Web documents; links, text, metadata; RDF metadata; document relevance; link relevance
User-side labels: user; managing ontology and metadata structures; define start URLs; select & parametrize; inspect; maintenance
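The crawling loop suggested by the diagram labels (start URLs, URL list, document relevance, link relevance) can be sketched roughly as follows. This is only an illustrative sketch, not the presented system: the in-memory `web` dictionary, the function names `score_text` and `focused_crawl`, and the scoring rules (fraction of ontology terms found; a link inherits the better of its anchor-text score and its source document's score) are all assumptions made for the example.

```python
import heapq
from typing import Dict, List, Set, Tuple

# Hypothetical in-memory "web": URL -> (page text, [(anchor text, target URL), ...]).
Page = Tuple[str, List[Tuple[str, str]]]


def score_text(text: str, terms: Set[str]) -> float:
    """Toy relevance (assumption): fraction of ontology terms occurring in the text."""
    words = set(text.lower().split())
    return len(words & terms) / len(terms) if terms else 0.0


def focused_crawl(web: Dict[str, Page], start_urls: List[str],
                  terms: Set[str], max_docs: int = 10) -> List[Tuple[str, float]]:
    """Visit the most promising URLs first; return (url, document relevance) pairs."""
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(-1.0, url) for url in start_urls]
    heapq.heapify(frontier)
    seen = set(start_urls)
    documents: List[Tuple[str, float]] = []

    while frontier and len(documents) < max_docs:
        _, url = heapq.heappop(frontier)
        page = web.get(url)
        if page is None:  # unreachable page in the toy web
            continue
        text, links = page
        doc_rel = score_text(text, terms)  # document relevance
        documents.append((url, doc_rel))
        for anchor, target in links:
            if target in seen:
                continue
            seen.add(target)
            # Link relevance (assumption): best of anchor-text score and source score.
            link_rel = max(doc_rel, score_text(anchor, terms))
            heapq.heappush(frontier, (-link_rel, target))

    # Document list ordered by relevance, highest first (stable for ties).
    documents.sort(key=lambda item: item[1], reverse=True)
    return documents


terms = {"airplane", "jet", "engine"}
web = {
    "http://example.org/": ("welcome page", [
        ("jet airplane news", "http://example.org/planes"),
        ("cooking recipes", "http://example.org/food"),
    ]),
    "http://example.org/planes": ("the airplane has a jet engine", []),
    "http://example.org/food": ("pasta and sauce", []),
}
result = focused_crawl(web, ["http://example.org/"], terms)
```

In this toy run the link with the anchor text "jet airplane news" is followed before the cooking link, and the page it leads to ends up at the top of the document list.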


Demo: SOEP


Demo: Options


Demo: Entities


Demo: Run


Demo: Documents


Demo: URLs


Demo: Words


Demo: Metadata


Demo: File with RDF statements

<rdf:RDF xmlns:b="http://kaon.semanticweb.org/2001/11/kaon-lexical#"
         xmlns:d="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:h="file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

  <rdf:Description rdf:about="urn:rdf:6113af897d1e6457528cf08de108541d-A380"
                   h:engine_type="jet">
    <rdf:type rdf:resource="file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#commercial airplane"/>
  </rdf:Description>

  <rdf:Property rdf:about="file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#rival">
    <d:range rdf:resource="file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#company"/>
    <d:domain rdf:resource="file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#company"/>
  </rdf:Property>

  <b:Label rdf:about="urn:rdf:e57f37acd768087d1278de3cbb44f669-label_rival"
           b:value="rival">
    <b:references rdf:resource="file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#rival"/>
    <b:inLanguage rdf:resource="http://kaon.semanticweb.org/2001/11/kaon-lexical#EN"/>
  </b:Label>

</rdf:RDF>
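To illustrate how statements could be read out of such a file, here is a minimal Python sketch using only the standard library. It is not the extraction component shown in the demo: the helper names `expand` and `extract_statements` are made up for this example, the embedded XML is a shortened copy of the `rdf:Description` element from the slide, and the code only handles this flat pattern (attribute-valued properties and child elements with `rdf:resource`), not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ONT_NS = "file:Z:/Programmierung/Local/New/AdditionalFiles/airplane.kaon#"

# Shortened copy of the rdf:Description element from the slide.
rdf_xml = f"""<rdf:RDF xmlns:h="{ONT_NS}" xmlns:rdf="{RDF_NS}">
  <rdf:Description rdf:about="urn:rdf:6113af897d1e6457528cf08de108541d-A380"
                   h:engine_type="jet">
    <rdf:type rdf:resource="{ONT_NS}commercial airplane"/>
  </rdf:Description>
</rdf:RDF>"""


def expand(name: str) -> str:
    """Turn ElementTree's '{namespace}local' form into 'namespace' + 'local'."""
    return name[1:].replace("}", "", 1) if name.startswith("{") else name


def extract_statements(xml_text: str):
    """Return (subject, predicate, object) triples for each rdf:Description."""
    about = f"{{{RDF_NS}}}about"
    resource = f"{{{RDF_NS}}}resource"
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(about)
        # Attributes other than rdf:about carry literal-valued properties.
        for name, value in desc.attrib.items():
            if name != about:
                triples.append((subject, expand(name), value))
        # Child elements carry resource-valued properties.
        for child in desc:
            triples.append((subject, expand(child.tag), child.get(resource)))
    return triples


statements = extract_statements(rdf_xml)
for statement in statements:
    print(statement)
```

For the element above this yields two statements about the A380 resource: its `engine_type` is the literal `jet`, and its `rdf:type` is the ontology concept `commercial airplane`.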


Example: CIIR, general

(Chart; only the axis ticks survive: 0 to 1 in steps of 0.2.)


Example: Prof. Deshmukh

(Chart; only the axis ticks survive: 0 to 1 in steps of 0.2.)


Example: CIIR

(Chart; vertical axis 0 to 0.6 in steps of 0.1, horizontal axis 0 to 1400 in steps of 200. Series: keyword, taxonomic, relational, total.)


Example: CIIR, relevant pages

• http://ciir.cs.umass.edu/index.html
• http://ciir.cs.umass.edu
• http://ciir.cs.umass.edu/publications/
• http://www-ciir.cs.umass.edu/~allan/
• http://ciir.cs.umass.edu/personnel/croft.html
• http://www.cs.umass.edu/csinfo/techrep.html
• http://www-aml.cs.umass.edu/criccs/level2-4.html
• http://www-nlp.cs.umass.edu/ciir-pubs/tepubs.html
• http://www.cs.umass.edu/csinfo/groups.html
• http://www.umass.edu/research/center.html
• http://www.cs.umass.edu/autogen/faculty.html
• http://www.umass.edu/pride/


Example: Boeing 747

(Chart; only the axis ticks survive: 0 to 1 in steps of 0.2.)


Example: Boeing 747

(Chart; axis ticks from 0 to 0.05 in steps of 0.005. Series: keyword, taxonomic, relational, total.)


Example: Breakwater Hotel

(Chart; only the axis ticks survive: 0 to 1 in steps of 0.2.)
