BY PHILIPP CIMIANO PRESENTED BY JOSEPH PARK CONCEPT HIERARCHY INDUCTION.

Page 1:

BY PHILIPP CIMIANO

PRESENTED BY JOSEPH PARK

CONCEPT HIERARCHY INDUCTION

Page 2:

CONCEPT HIERARCHIES

• Structure information into categories

• Provide a level of generalization

• Form the backbone of any ontology

Page 3:

COMMON APPROACHES

• Machine readable dictionaries

• Lexico-syntactic patterns

• Distributional similarity

• Co-occurrence analysis

Page 4:

MACHINE READABLE DICTIONARIES

• Exploit the regularity of dictionary definitions
• Find a hypernym for the defined word
  • Head of the first NP (genus or kernel term)

• spring: "the season between winter and summer and in which leaves and flowers appear"
• hornbeam: "a type of tree with a hard wood, sometimes used in hedges"
• launch: "a large usu. motor-driven boat used for carrying people on rivers, lakes, harbors, etc."
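
The genus-term idea can be sketched in a few lines. This is a toy heuristic, not the method used in the work being presented: the word lists `DETERMINERS`, `NP_BOUNDARY` and `EMPTY_HEADS` and the function `genus_term` are assumptions made for illustration, and a real system would use a POS tagger or NP chunker instead.

```python
import re

# Hypothetical word lists, chosen only to cover the three glosses above.
DETERMINERS = {"a", "an", "the"}
# Words assumed to mark the end of the first noun phrase in a gloss.
NP_BOUNDARY = {"between", "with", "of", "in", "on", "for",
               "used", "that", "which", "and", "or"}
# "Empty heads" as in "a type of X", where X is the real genus term.
EMPTY_HEADS = {"type", "kind", "sort", "form"}


def genus_term(definition):
    """Guess the head of the first NP of a dictionary definition."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", definition.lower())
    np_tokens = []
    for tok in tokens:
        if tok in NP_BOUNDARY:
            # "type of tree": drop the empty head and start the NP again.
            if tok == "of" and np_tokens and np_tokens[-1] in EMPTY_HEADS:
                np_tokens = []
                continue
            break
        if tok not in DETERMINERS:
            np_tokens.append(tok)
    # In English the head noun is usually the last word of the NP.
    return np_tokens[-1] if np_tokens else None


print(genus_term("a type of tree with a hard wood, sometimes used in hedges"))
```

The returned word is then read as a hypernym candidate for the defined word, e.g. is-a(hornbeam, tree).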

Page 5:

LEXICO-SYNTACTIC PATTERNS

• Hearst patterns
  • Hearst1: NP such as {NP,}* {(and | or)} NP
  • Hearst2: such NP as {NP,}* {(and | or)} NP
  • Hearst3: NP {,NP}* {,} or other NP
  • Hearst4: NP {,NP}* {,} and other NP
  • Hearst5: NP including {NP,}* NP {(and | or)} NP
  • Hearst6: NP especially {NP,}* {(and | or)} NP

• Patterns should occur frequently and in many text genres
• They should accurately indicate the relation of interest
• They should be recognizable with little or no pre-encoded knowledge

Page 6:

EXAMPLE OF USING HEARST PATTERN

• 'Such injuries as bruises, wounds and broken bones...'

  • hyponym(bruise, injury)
  • hyponym(wound, injury)
  • hyponym(broken bone, injury)
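
A minimal sketch of the "such NP as ..." pattern applied to this sentence, using a plain regular expression. It assumes noun phrases are simple word sequences and uses a crude plural stripper; `extract_such_as` and `singular` are hypothetical helper names, and a real extractor would run over NP-chunked text.

```python
import re

# "such NP as NP, NP and NP", with NPs approximated as word sequences.
SUCH_AS = re.compile(r"\bsuch\s+(?P<hyper>\w+)\s+as\s+(?P<hypos>[\w\s,]+)",
                     re.IGNORECASE)
# Separators between the listed hyponyms: commas and "and"/"or".
SEPARATOR = re.compile(r"\s*,\s*(?:(?:and|or)\s+)?|\s+(?:and|or)\s+")


def singular(noun):
    """Crude plural stripping (assumption: regular English plurals only)."""
    if noun.endswith("ies"):
        return noun[:-3] + "y"
    if noun.endswith("s") and not noun.endswith("ss"):
        return noun[:-1]
    return noun


def extract_such_as(sentence):
    """Return (hyponym, hypernym) pairs found by the pattern."""
    match = SUCH_AS.search(sentence)
    if match is None:
        return []
    hypernym = singular(match.group("hyper").lower())
    parts = SEPARATOR.split(match.group("hypos").strip())
    return [(singular(p.strip().lower()), hypernym) for p in parts if p.strip()]


pairs = extract_such_as("Such injuries as bruises, wounds and broken bones...")
```

Because the hyponym list is matched greedily, trailing words in a longer sentence would be swallowed into the last hyponym; that is the price of skipping the chunking step.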

Page 7:

DISTRIBUTIONAL SIMILARITY

• Distributional hypothesis
  • Words are similar to the extent that they share the same context
  • ‘You shall know a word by the company it keeps’ (Firth)
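
One common way to make the hypothesis concrete is to count the words around each term and compare the resulting vectors with the cosine measure. The sketch below assumes a symmetric word window as the context and a tiny made-up corpus; other context definitions (for example syntactic dependencies) and other similarity measures are equally possible.

```python
import math
from collections import Counter


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus, for illustration only.
corpus = ["we rent a car today", "we rent a bike today", "we eat an apple today"]
vecs = context_vectors(corpus)
print(cosine(vecs["car"], vecs["bike"]), cosine(vecs["car"], vecs["apple"]))
```

Here "car" and "bike" share all of their contexts, while "car" and "apple" share only one context word, so the first similarity is higher.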

Page 8:

EXAMPLE

Page 9:

CO-OCCURRENCE ANALYSIS

• Collocation

• Document-based subsumption
  • A term t1 is more special than a term t2 if t2 also appears in all the documents in which t1 appears
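
The subsumption test reduces to a subset check on document sets. A minimal sketch, assuming whitespace tokenization and a made-up three-document collection; `document_index` and `is_more_special` are hypothetical names.

```python
def document_index(documents):
    """Map each term to the set of ids of the documents it occurs in."""
    index = {}
    for doc_id, text in enumerate(documents):
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(doc_id)
    return index


def is_more_special(t1, t2, index):
    """True if t2 appears in every document in which t1 appears."""
    docs1, docs2 = index.get(t1, set()), index.get(t2, set())
    # An unseen term has no documents, so no claim is made about it.
    return bool(docs1) and docs1 <= docs2


# Hypothetical toy collection, for illustration only.
docs = ["hotel room balcony", "hotel lobby", "hotel room"]
index = document_index(docs)
print(is_more_special("room", "hotel", index))
```

Note that two terms with identical document sets subsume each other under this test, so a strict variant (`docs1 < docs2`) or a relaxed one based on a proportion of shared documents may be preferable in practice.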

Page 10:

THREE MORE APPROACHES

• Formal Concept Analysis (FCA)

• Guided Clustering

• Learning from heterogeneous sources of evidence

Page 11:

FORMAL CONCEPT ANALYSIS

• Set-theoretical approach
• Parse corpus (extract dependencies)
  • Verb-PP-complement
  • Verb-object
  • Verb-subject

• Extract surface dependencies (section 4.1.4)
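
Once the dependencies are extracted, each noun (object) is described by the set of verb-derived attributes it occurs with, and FCA turns this table into a lattice of formal concepts. The sketch below is a naive enumeration by closing the object intents under intersection; it is not the pseudocode from the slides, and the tourism-style context is toy data assumed for illustration.

```python
def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) of a formal context.

    `context` maps each object to the set of attributes it has.
    """
    all_attributes = frozenset().union(*context.values())
    # Every concept intent is an intersection of object intents
    # (the full attribute set is the intent of the empty extent).
    intents = {all_attributes}
    for attributes in context.values():
        intents |= {frozenset(attributes) & intent for intent in intents}
    concepts = []
    for intent in intents:
        extent = frozenset(obj for obj, attrs in context.items() if intent <= attrs)
        concepts.append((extent, intent))
    # Largest extents (most general concepts) first.
    return sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1])))


# Hypothetical context: nouns and the verb-derived attributes they occur with.
context = {
    "hotel": {"bookable"},
    "apartment": {"bookable", "rentable"},
    "car": {"bookable", "rentable", "driveable"},
    "bike": {"bookable", "rentable", "driveable", "rideable"},
    "excursion": {"bookable", "joinable"},
    "trip": {"bookable", "joinable"},
}
for extent, intent in formal_concepts(context):
    print(sorted(extent), sorted(intent))
```

Ordering the concepts by inclusion of their extents gives the lattice, which can then be read as a concept hierarchy: the concept with intent {bookable, rentable} sits above the one with intent {bookable, rentable, driveable}, and so on.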

Page 12:

PSEUDOCODE

Page 13:

EXAMPLE

Page 14:

RESULTS

Page 15:

GUIDED CLUSTERING

• Uses hypernyms from WordNet and Hearst patterns

Page 16:

EXAMPLE

Page 17:

RESULTS

Page 18:

MORE RESULTS

Page 19:

HETEROGENEOUS SOURCES OF EVIDENCE

• Naïve threshold classifier
• Uses Hearst patterns for corpus patterns
• Uses Google API for web patterns
• Uses Hearst patterns over downloaded pages
• Uses WordNet senses
• Uses ‘head’-heuristic (r-match)
• Uses corpus based subsumption
• Uses document based subsumption
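
One plausible reading of a naïve threshold classifier is sketched below: a term pair is labelled is-a when the score from a single evidence source reaches a threshold. The slides do not spell out the details, so the source names, the scores and the functions `classify_isa` and `accuracy` are all assumptions made for illustration.

```python
def classify_isa(evidence, source, threshold):
    """Predict is-a for one term pair from a single evidence source."""
    # A source that gives no score for the pair counts as 0.0.
    return evidence.get(source, 0.0) >= threshold


def accuracy(examples, source, threshold):
    """Fraction of labelled pairs classified correctly at this threshold."""
    correct = sum(classify_isa(ev, source, threshold) == label
                  for ev, label in examples)
    return correct / len(examples)


# Hypothetical scores in [0, 1] per evidence source, with gold labels.
examples = [
    ({"hearst_corpus": 0.8, "wordnet": 1.0}, True),   # e.g. (hotel, accommodation)
    ({"hearst_corpus": 0.1, "wordnet": 0.0}, False),  # e.g. (hotel, car)
    ({"hearst_corpus": 0.4, "wordnet": 0.0}, True),   # e.g. (excursion, activity)
]
for t in (0.3, 0.5):
    print(t, accuracy(examples, "hearst_corpus", t))
```

Sweeping the threshold per source, as in the loop above, is a simple way to compare how informative each source is on its own before the sources are combined.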

Page 20:

RESULTS

Page 21:

MORE RESULTS