Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construction from Corpus
Subhabrata Mukherjee, Jitendra Ajmera, Sachindra Joshi
Max Planck Institute for Informatics, IBM India Research Lab
CIKM 2014
November 17, 2014
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Motivation: Domain Term Discovery
Usefulness for parsing. Consider the examples:
- "use sprint zone"
  - Parse without domain knowledge: use/noun sprint/verb zone/noun
  - Parse with domain knowledge: use/verb {sprint zone}/noun
- "transfer files via usb cable"
The parser generates a noisy or incomplete parse without the domain knowledge:
- 'sprint' and 'files' are not verbs
- "sprint zone" and "usb cable" are multi-word concepts
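The effect of a multi-word term lexicon on the two example queries can be sketched as below. This is only an illustration, not the ESG parser's mechanism: the function name `merge_domain_terms`, the `max_len` parameter and the greedy longest-match strategy are assumptions made for the example.

```python
def merge_domain_terms(tokens, lexicon, max_len=4):
    """Greedily merge the longest run of tokens found in `lexicon` into one token.

    Hypothetical helper: a stand-in for consulting a domain term lexicon
    before tagging, so that "sprint zone" is treated as a single noun.
    """
    merged, i = [], 0
    while i < len(tokens):
        span = 1
        # Try the longest candidate first so longer terms win over their prefixes.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            if " ".join(tokens[i:i + n]) in lexicon:
                span = n
                break
        merged.append(" ".join(tokens[i:i + span]))
        i += span
    return merged


lexicon = {"sprint zone", "usb cable"}
print(merge_domain_terms("use sprint zone".split(), lexicon))
print(merge_domain_terms("transfer files via usb cable".split(), lexicon))
```

With the lexicon in place, "sprint zone" and "usb cable" each reach the tagger as one unit, which is the precondition for the use/verb {sprint zone}/noun reading.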
Motivation: Domain Relation Discovery
- Interactive dialogue systems
  - For the user query "battery of my device depletes fast", the knowledge that 'battery' is a Feature-Of 'device' enables the system to ask a clarifying question about the Type-Of device
- Query expansion
  - E.g. consider Synonyms along with the original query: 'battery' is a Feature-Of 'phone' as well as 'tablet', 'device'
- Query re-formulation
  - For the user query "screen freezes E5150", the knowledge that 'E5150' is a Type-Of 'Error' results in the re-formulated query "screen freezes error E5150"
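The query re-formulation case can be sketched in a few lines. The function name `reformulate` and the dictionary representation of Type-Of facts are assumptions for illustration; the slides do not say how the question-answering system applies the relation.

```python
def reformulate(query, type_of):
    """Insert the Type-Of parent in front of every token that has a known type.

    `type_of` maps a term to its parent concept (a hypothetical representation).
    A parent that already occurs in the query is not inserted again.
    """
    tokens = query.split()
    present = {tok.lower() for tok in tokens}
    out = []
    for tok in tokens:
        parent = type_of.get(tok)
        if parent is not None and parent.lower() not in present:
            out.append(parent.lower())
        out.append(tok)
    return " ".join(out)


type_of = {"E5150": "Error"}
print(reformulate("screen freezes E5150", type_of))  # screen freezes error E5150
```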
Unsupervised Framework
- Typically, a domain has many knowledge articles, manuals, tutorials etc. available in a variety of formats
- Most of these documents have little hyperlink or table (info-box, as in Wikipedia) information, or extraction is difficult (e.g. PDF)
- The challenge is to learn a shallow ontology from raw, unannotated plain text
Domain Cartridge as a Graph
[Figure: the domain cartridge drawn as a graph. Nodes: install, insert, device, handset, android, blackberry, operating system, samsung, sim card, Samsung Galaxy Victory, Samsung Array. Legend: domain term, domain process. Edge types highlighted on successive slides: Synonyms, Feature-Of, Type-Of, Action-On.]
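One way to hold such a graph in memory is a map from (node, relation) pairs to target nodes. The sketch below is an assumption about representation, not the authors' data structure: the class name `DomainCartridge` is hypothetical, and the example edges are illustrative guesses, since the edges of the figure are not recoverable from the transcript.

```python
from collections import defaultdict


class DomainCartridge:
    """Typed, directed graph over domain terms and domain processes (hypothetical)."""

    RELATIONS = {"Synonyms", "Feature-Of", "Type-Of", "Action-On"}

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, source, relation, target):
        if relation not in self.RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self._edges[(source, relation)].add(target)
        # Synonymy is symmetric, so store it in both directions.
        if relation == "Synonyms":
            self._edges[(target, relation)].add(source)

    def related(self, source, relation):
        return sorted(self._edges.get((source, relation), set()))


# Illustrative edges only: these are guesses, not read from the figure.
graph = DomainCartridge()
graph.add("device", "Synonyms", "handset")
graph.add("android", "Type-Of", "operating system")
graph.add("sim card", "Feature-Of", "device")
graph.add("insert", "Action-On", "sim card")
print(graph.related("handset", "Synonyms"))
```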
Roadmap
- Unsupervised framework for shallow domain ontology construction:
  - Domain Term Discovery (DTD)
  - Improvement of parser performance by DTD
  - Domain Relation Discovery (DRD)
- Use case: improvement of an in-house Question-Answering system
- Experiments: manual evaluation, comparison with BabelNet, WordNet, Yago
- Conclusions
Corpus: Knowledge articles, manuals, tutorials etc.
Domain Cartridge: Framework
Parsing
Example sentence: "Turn the wi-fi radio on or off"
The English Slot Grammar (ESG) parser is used. It is 50 to 100 times faster than the Charniak parser
Prismatic Relations
Shallow semantic relationship (SSR) annotation over the ESG parser output generates normalized parser relations
E.g., "Samsung has a battery" and "Samsung's battery died" both generate the same relation 'nnMod:samsung_battery'
Domain Cartridge: Framework
Lucene Index: for efficient retrieval of relations, documents, positional information, proximity-based queries etc.
Domain Term Discovery
The ESG parser maintains a domain term lexicon of multi-word concepts, e.g. "touch screen", "sprint navigation"
Noun phrase chunking is run on document titles to extract frequently occurring concepts as domain words
Domain Term Discovery
- Enrich the lexicon and bootstrap the parser
- The parser generates refined output
NP chunking on titles gives high precision but low recall, as titles are precise and clean but short
To extract more fine-grained domain terms, HITS is applied to the parser output
HITS
- Any Shallow Semantic Relation (SSR) from the ESG parser is a hub generating domain terms
- Any domain term is an authority influenced by incoming features from hubs
- Good authorities are incorporated into the parser's domain term lexicon
- The parser is re-run, refined relations are generated, and the previous steps are iterated until convergence
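The hub and authority updates can be sketched as plain power iteration over (relation, term) edges. This is a generic HITS sketch under assumptions: the slides do not give the edge weighting, the convergence test or the threshold for a "good" authority, and the toy edges below are invented for illustration.

```python
import math


def _normalize(scores):
    """Scale scores to unit Euclidean length (no-op for an all-zero vector)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {key: value / norm for key, value in scores.items()}


def hits(edges, iterations=50):
    """Run HITS on (hub, authority) pairs and return (hub_scores, authority_scores)."""
    edges = list(edges)
    hubs = {hub: 1.0 for hub, _ in edges}
    auths = {auth: 1.0 for _, auth in edges}
    for _ in range(iterations):
        # Authority score: sum of the scores of the hubs pointing at it.
        new_auths = dict.fromkeys(auths, 0.0)
        for hub, auth in edges:
            new_auths[auth] += hubs[hub]
        # Hub score: sum of the scores of the authorities it points at.
        new_hubs = dict.fromkeys(hubs, 0.0)
        for hub, auth in edges:
            new_hubs[hub] += new_auths[auth]
        auths, hubs = _normalize(new_auths), _normalize(new_hubs)
    return hubs, auths


# Toy edges (hypothetical): an SSR relation label points at each candidate term it produced.
edges = [("nnMod", "touch screen"), ("nnMod", "sim card"), ("svo", "touch screen")]
hub_scores, auth_scores = hits(edges)
print(sorted(auth_scores, key=auth_scores.get, reverse=True))
```

In the loop described above, the top-ranked authorities would be added to the parser lexicon before the next parsing pass; where to cut the ranking is left open here.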
Feedback
Domain Cartridge: Framework
Parser Performance Improvement
The number of incomplete parses went down by 73% after incorporating domain terms into the parser lexicon
Domain Terms
software-version htc-evo wi-fi memory-card microsoft-exchange lg-optimus samsung-m400 samsung-galaxy-victory software-updates samsung-array text-messaging touch-screen blackberry-bold
Table: Snapshot of multi-word domain terms extracted by NP Chunking.
optimus-g set-up novatel-wireless e-mail sierra-wireless apple-id google-maps play-music mobile-network 10-digit internet-explorer slacker-radio caller-id google-search address-book my-computer software-update blackberry-id as-well-as windows-update terms-of-service drop-down pro-700 add-on scp-2700 mac-os device-manager voice-mail non-camera
Table: Snapshot of multi-word domain terms extracted by HITS (not found by NP Chunking).
Random Indexing (RI)
Used for computing word similarity and for dimensionality reduction
RI considers a "term × term" co-occurrence matrix, as opposed to a "term × document" matrix, allowing incremental learning of context information and scaling with the corpus size
Relational Distributional Similarity: two terms are similar if they appear in similar contexts with similar Shallow Semantic Relations
Random Index Vector Update: the neighborhood consists of the syntactic relations between the target term and its neighboring terms
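A minimal Random Indexing sketch with relation-aware contexts follows. The dimensionality, the number of non-zero entries, the relation names in the toy triples and all function names are assumptions chosen for illustration; the slides do not state the settings used.

```python
import math
import random


def index_vector(key, dim=300, nonzero=10, seed=0):
    """Sparse ternary random vector for `key`, stored as {position: +1 or -1}."""
    rng = random.Random(f"{seed}:{key}")  # deterministic per key
    positions = rng.sample(range(dim), nonzero)
    return {pos: rng.choice((-1, 1)) for pos in positions}


def train(triples, dim=300):
    """Accumulate context vectors from (target, relation, neighbor) triples."""
    context = {}
    for target, relation, neighbor in triples:
        vec = context.setdefault(target, [0.0] * dim)
        # The feature is the neighbor together with its relation, so the same
        # neighbor under a different relation counts as a different context.
        for pos, sign in index_vector(f"{relation}:{neighbor}", dim).items():
            vec[pos] += sign
    return context


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy triples with hypothetical relation names.
triples = [
    ("device", "obj_of", "restart"), ("device", "nnMod", "battery"),
    ("handset", "obj_of", "restart"), ("handset", "nnMod", "battery"),
    ("account", "obj_of", "create"),
]
vectors = train(triples)
print(cosine(vectors["device"], vectors["handset"]))
print(cosine(vectors["device"], vectors["account"]))
```

Because each triple only adds a fixed-size vector to one row, new text can be folded in without rebuilding a co-occurrence matrix, which is the incremental property noted above.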
Synonym Discovery
Random Indexing gives the top N similar terms for a given term
HITS gives the dominant domain terms and domain (SSR) relations
$$\mathrm{Sim}(w_i, w_j) = \frac{\sum_{p} I_{l_i = l_j,\; k_i = k_j}\left(f_{w_{k_i},\,p},\; f_{w_{k_j},\,p'}\right)}{\sum_{p} \sum_{r} I_{l_i = l_r,\; k_i = k_r}\left(f_{w_{k_i},\,p},\; f_{w_{k_r},\,p'}\right)}$$
Numerator: frequency of the common (dominant) words in both neighborhoods with similar dominant SSR relations
Denominator: frequency of the common word in any other neighborhood with a similar SSR relation
Synonym Discovery (RI)
Relation Discovery
ESG SSR relations are exploited to discover the domain relation between two words
Feature-Of is typically marked by noun-noun modifications and subject-object relations
E.g. "rel:nnMod:network_life, rel:nnMod:account_settings, rel:svo:phone_access_internet" etc.
![Page 49: Domain Cartridge: Unsupervised Framework for Shallow ...Unsupervised Framework I Typically for a domain, a lot of knowledge articles, manuals, tutorials etc. are available in a variety](https://reader033.fdocuments.in/reader033/viewer/2022042811/5fa878faec8f9c18875e5ed4/html5/thumbnails/49.jpg)
Relation Discovery
Action-On is marked by “dm” and verb-object relations
E.g. “rel:svo:tap_add_account, rel:dm_obj:activate_device, rel:svo:mobile_sync_phone, rel:svo:account_use_phone” etc.
Type-Of is marked by Hearst patterns like “or, especially” and SSR relations like “svo:include, npo:like, npo:such-as, npo:as”
E.g. “rel:svo:devices_include_HTC, rel:npo:applications_such-as_WhatsApp, rel:npo:features_like_call, rel:npo:contact_such-as_address”.
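The markers above can be read as a small rule table over SSR relation strings. The following is a minimal sketch, not the authors' implementation: the `classify` function, the `DomainRelation` tuple, the argument order within each relation, and the choice to treat every non-`include` `svo` relation as Action-On are all assumptions made for illustration. The slides also list `svo` relations under Feature-Of, which this sketch does not attempt to disambiguate, and it assumes single-token arguments separated by underscores.

```python
from typing import NamedTuple, Optional


class DomainRelation(NamedTuple):
    label: str  # "Feature-Of", "Action-On" or "Type-Of"
    head: str   # the feature, the action, or the more specific term
    tail: str   # the owner, the target, or the more general term


# Markers taken from the slides; the rule layout itself is hypothetical.
TYPE_OF_SVO_VERBS = {"include"}
TYPE_OF_NPO_PREPS = {"like", "such-as", "as"}


def classify(ssr: str) -> Optional[DomainRelation]:
    """Map a string such as 'rel:npo:features_like_call' to a domain relation."""
    parts = ssr.split(":")
    if len(parts) != 3 or parts[0] != "rel":
        return None
    kind, args = parts[1], parts[2].split("_")

    if kind == "nnMod" and len(args) == 2:
        # Assumed reading: network_life -> life Feature-Of network
        return DomainRelation("Feature-Of", args[1], args[0])
    if kind == "dm_obj" and len(args) == 2:
        # activate_device -> activate Action-On device
        return DomainRelation("Action-On", args[0], args[1])
    if kind == "svo" and len(args) == 3:
        subj, verb, obj = args
        if verb in TYPE_OF_SVO_VERBS:
            # devices_include_HTC -> HTC Type-Of devices
            return DomainRelation("Type-Of", obj, subj)
        # Assumption: any other svo relation is read as verb Action-On object
        return DomainRelation("Action-On", verb, obj)
    if kind == "npo" and len(args) == 3 and args[1] in TYPE_OF_NPO_PREPS:
        # applications_such-as_WhatsApp -> WhatsApp Type-Of applications
        return DomainRelation("Type-Of", args[2], args[0])
    return None
```

Calling `classify("rel:dm_obj:activate_device")` yields an Action-On relation between `activate` and `device`; strings with an unknown relation kind return `None`.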
![Page 50: Domain Cartridge: Unsupervised Framework for Shallow ...Unsupervised Framework I Typically for a domain, a lot of knowledge articles, manuals, tutorials etc. are available in a variety](https://reader033.fdocuments.in/reader033/viewer/2022042811/5fa878faec8f9c18875e5ed4/html5/thumbnails/50.jpg)
![Page 51: Domain Cartridge: Unsupervised Framework for Shallow ...Unsupervised Framework I Typically for a domain, a lot of knowledge articles, manuals, tutorials etc. are available in a variety](https://reader033.fdocuments.in/reader033/viewer/2022042811/5fa878faec8f9c18875e5ed4/html5/thumbnails/51.jpg)
Domain Term Evaluation
5000 articles, tutorials and manuals from the smartphone domain
We used the Back-of-the-Book Index (BOI) of manuals to create ground truth for domain term discovery
Baselines:
- WordNet (G. A. Miller. WordNet: A lexical database for English. Communications of the ACM, 38, 1995.)
- BabelNet (R. Navigli and S. P. Ponzetto. BabelNet: Building a very large multilingual semantic network. ACL ’10.)
- Yago (F. M. Suchanek, G. Kasneci, and G. Weikum. Yago: a core of semantic knowledge. WWW ’07.)
![Page 52: Domain Cartridge: Unsupervised Framework for Shallow ...Unsupervised Framework I Typically for a domain, a lot of knowledge articles, manuals, tutorials etc. are available in a variety](https://reader033.fdocuments.in/reader033/viewer/2022042811/5fa878faec8f9c18875e5ed4/html5/thumbnails/52.jpg)
Domain Term Evaluation
| Method | Recall |
|---|---|
| WordNet | 22.62% |
| NP Chunking on Titles | 32.45% |
| HITS | 40.87% |
| Yago | 43.77% |
| BabelNet | 53.74% |

Table: Domain term evaluation.
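Recall against a Back-of-the-Book Index can be computed as the share of index terms that a method recovers. The sketch below assumes exact matching after lowercasing and whitespace trimming; the slides do not state the matching rule, and the function name `term_recall` and the toy term lists are invented for illustration.

```python
def term_recall(discovered, gold_index):
    """Fraction of gold index terms that appear among the discovered terms."""
    gold = {term.strip().lower() for term in gold_index}
    if not gold:
        raise ValueError("gold index must contain at least one term")
    found = {term.strip().lower() for term in discovered}
    return len(found & gold) / len(gold)


# Toy data, not taken from the evaluation corpus.
gold_index = ["usb cable", "sprint zone", "voicemail", "hotspot"]
discovered = ["Sprint Zone", "usb cable", "battery"]
score = term_recall(discovered, gold_index)  # 2 of 4 index terms recovered
```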
![Page 53: Domain Cartridge: Unsupervised Framework for Shallow ...Unsupervised Framework I Typically for a domain, a lot of knowledge articles, manuals, tutorials etc. are available in a variety](https://reader033.fdocuments.in/reader033/viewer/2022042811/5fa878faec8f9c18875e5ed4/html5/thumbnails/53.jpg)
Recall of a Question-Answering System
| Recall@N | With domain term lexicon | Without domain term lexicon |
|---|---|---|
| recall@1 | 0.40 | 0.33 |
| recall@2 | 0.49 | 0.45 |

Table: Performance of a QA system with and without domain term lexicon.
Incorporation of domain terms in the parser lexicon improves QA system performance
1 D. Gondek et al. A framework for merging and ranking of answers in DeepQA. IBM Journal of Research and Development, 56(3), 2012.
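For reference, recall@N is commonly computed as the fraction of questions whose correct answer appears among the top N ranked answers. The sketch below follows that common definition, which the slides do not spell out; the function name `recall_at_n` and the toy data are assumptions.

```python
def recall_at_n(ranked_answers, correct_answers, n):
    """Share of questions whose correct answer is within the top-n ranked answers."""
    if len(ranked_answers) != len(correct_answers) or not correct_answers:
        raise ValueError("need one non-empty ranking per question")
    hits = sum(
        1
        for ranking, gold in zip(ranked_answers, correct_answers)
        if gold in ranking[:n]
    )
    return hits / len(correct_answers)


# Toy rankings for three questions, invented for illustration.
rankings = [["a", "b"], ["c", "d"], ["e", "f"]]
gold = ["a", "d", "x"]
r1 = recall_at_n(rankings, gold, 1)  # only the first question is answered at rank 1
r2 = recall_at_n(rankings, gold, 2)  # the second question is answered at rank 2
```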
![Page 54: Domain Cartridge: Unsupervised Framework for Shallow ...Unsupervised Framework I Typically for a domain, a lot of knowledge articles, manuals, tutorials etc. are available in a variety](https://reader033.fdocuments.in/reader033/viewer/2022042811/5fa878faec8f9c18875e5ed4/html5/thumbnails/54.jpg)
Domain Relation Evaluation
2000 word pairs (500 for each of four categories) are manually annotated by two annotators

| System | Type-Of | Feature-Of | Action-On |
|---|---|---|---|
| BabelNet, WordNet | 19.27% | - | - |
| Yago | 25.12% | - | - |
| Domain Cartridge | 77% | 85.7% | 68% |

Table: Recall comparison of systems for 3 relations.
![Page 55: Domain Cartridge: Unsupervised Framework for Shallow ...Unsupervised Framework I Typically for a domain, a lot of knowledge articles, manuals, tutorials etc. are available in a variety](https://reader033.fdocuments.in/reader033/viewer/2022042811/5fa878faec8f9c18875e5ed4/html5/thumbnails/55.jpg)
Synonym Discovery: Distributional Similarity Comparison

| System | Precision | Recall | F-Score |
|---|---|---|---|
| Yago | 38% | 32% | 34.37% |
| BabelNet, WordNet | 83% | 31% | 45.14% |
| Domain Cartridge (DC) | 58% | 41% | 47.60% |
| DC + WordNet | 62% | 40% | 49.00% |
| DC + ESG Parser Features | 65% | 39% | 49.14% |

Table: Precision-Recall comparison of Domain Cartridge (random-indexing, HITS and sim. eqn.) with other systems.
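The F-Score column is presumably the harmonic mean of precision and recall (F1); the slides do not name the variant, so that is an assumption here. A minimal sketch with a hypothetical helper `f_score`:

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# BabelNet, WordNet row: precision 83%, recall 31%.
babelnet_f = f_score(0.83, 0.31)
```

With the rounded precision and recall of the BabelNet, WordNet row this gives about 0.4514, matching the reported 45.14%. Other rows may differ slightly when recomputed this way, most likely because the table shows rounded precision and recall values.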
![Page 56: Domain Cartridge: Unsupervised Framework for Shallow ...Unsupervised Framework I Typically for a domain, a lot of knowledge articles, manuals, tutorials etc. are available in a variety](https://reader033.fdocuments.in/reader033/viewer/2022042811/5fa878faec8f9c18875e5ed4/html5/thumbnails/56.jpg)
Synonym Discovery: Comparison with Distributional Similarity Measures in WordNet

| WordNet measure | F-Score |
|---|---|
| LCH | 0.22 |
| RES | 0.31 |
| JCN | 0.42 |
| PATH | 0.42 |
| LIN | 0.43 |
| WUP | 0.43 |
| LESK | 0.45 |
| Domain Cartridge | 0.49 |

Table: F-Score comparison of WordNet similarity measures with Domain Cartridge.
![Page 57: Domain Cartridge: Unsupervised Framework for Shallow ...Unsupervised Framework I Typically for a domain, a lot of knowledge articles, manuals, tutorials etc. are available in a variety](https://reader033.fdocuments.in/reader033/viewer/2022042811/5fa878faec8f9c18875e5ed4/html5/thumbnails/57.jpg)
Domain Cartridge Ontology Snapshot
![Page 58: Domain Cartridge: Unsupervised Framework for Shallow ...Unsupervised Framework I Typically for a domain, a lot of knowledge articles, manuals, tutorials etc. are available in a variety](https://reader033.fdocuments.in/reader033/viewer/2022042811/5fa878faec8f9c18875e5ed4/html5/thumbnails/58.jpg)
Conclusions
- Unsupervised framework for shallow domain ontology construction, without using manually annotated resources
- Multi-words form an important component of Domain Term Discovery
- Incorporation of domain terms in the parser lexicon results in a 73% reduction in incomplete parses, improving the performance of an in-house QA system by up to 7%
- The synonym discovery approach, using Relational Distributional Similarity, RI, HITS etc., performs better than other existing approaches