Outline Super-quick review of previous talk More on NER by token-tagging –Limitations of HMMs...
Outline
• Super-quick review of previous talk
• More on NER by token-tagging
  – Limitations of HMMs
  – MEMMs for sequential classification
• Review of relation extraction techniques
  – Decomposition one: NER + segmentation + classifying segments and entities
  – Decomposition two: NER + segmentation + classifying pairs of entities
• Some case studies
  – ACE
  – Webmaster
What is “Information Extraction”
As a family of techniques: Information Extraction = segmentation + classification + association + clustering
October 14, 2002, 4:00 a.m. PT
For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation.
Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers.
"We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access."
Richard Stallman, founder of the Free Software Foundation, countered saying…
Extracted mentions: Microsoft Corporation, CEO, Bill Gates, Microsoft, Gates, Microsoft, Bill Veghte, Microsoft, VP, Richard Stallman, founder, Free Software Foundation

NAME              TITLE    ORGANIZATION
Bill Gates        CEO      Microsoft
Bill Veghte       VP       Microsoft
Richard Stallman  founder  Free Soft..
via token tagging
NER by tagging tokens
Given a sentence:
Yesterday Pedro Domingos flew to New York.
1) Break the sentence into tokens, and classify each token with a label indicating what sort of entity it’s part of:
Yesterday/background Pedro/person name Domingos/person name flew/background to/background New/location name York/location name
2) Identify names based on the entity labels:
Person name: Pedro Domingos
Location name: New York
3) To learn an NER system, use YFCL.
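Step 2 above can be sketched in a few lines of Python. This is a minimal illustration, not the talk's implementation: the function name `labels_to_entities` is made up here, and it assumes one label per token with "background" marking non-entity tokens. Adjacent tokens with the same label are merged into one name, so two back-to-back entities of the same type would be fused; richer tag sets (for example begin/inside tags) avoid that.

```python
from typing import List, Tuple

def labels_to_entities(tokens: List[str], labels: List[str],
                       background: str = "background") -> List[Tuple[str, str]]:
    """Merge maximal runs of identically labeled, non-background tokens."""
    entities = []
    run_tokens: List[str] = []
    run_label = background
    # A trailing sentinel token flushes the final run.
    for token, label in list(zip(tokens, labels)) + [("", background)]:
        if label != run_label:
            if run_label != background:
                entities.append((run_label, " ".join(run_tokens)))
            run_tokens = []
            run_label = label
        if label != background:
            run_tokens.append(token)
    return entities

tokens = ["Yesterday", "Pedro", "Domingos", "flew", "to", "New", "York"]
labels = ["background", "person name", "person name", "background",
          "background", "location name", "location name"]
entities = labels_to_entities(tokens, labels)
print(entities)
```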
HMM for Segmentation of Addresses
• Simplest HMM Architecture: One state per entity type
[Figure: HMM states with emission tables, e.g. one state emitting CA 0.15, NY 0.11, PA 0.08, … and another emitting Hall 0.15, Wean 0.03, N-S 0.02, …]
[Pilfered from Sunita Sarawagi, IIT/Bombay]
HMMs for Information Extraction
1. The HMM consists of two probability tables
  • Pr(currentState=s|previousState=t) for s=background, location, speaker, …
  • Pr(currentWord=w|currentState=s) for s=background, location, …
2. Estimate these tables with a (smoothed) CPT
  • Prob(location|location) = #(loc->loc)/#(loc->*) transitions
3. Given a new sentence, find the most likely sequence of hidden states using the Viterbi method:
\mathrm{MaxProb}(\mathrm{curr}=s \mid \mathrm{position}=k) = \max_{t} \; \mathrm{MaxProb}(\mathrm{curr}=t \mid \mathrm{position}=k-1) \cdot \mathrm{Prob}(\mathrm{word}=w_{k-1} \mid t) \cdot \mathrm{Prob}(\mathrm{curr}=s \mid \mathrm{prev}=t)
00 : pm Place : Wean Hall Rm 5409 Speaker : Sebastian Thrun… …
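The three steps above can be sketched as follows. This is an illustrative toy, not the system from the talk: the function names, the `<start>` pseudo-state, the add-alpha smoothing, and the two training sentences are all assumptions made for the example. The recurrence scores each word when its own state is entered, which is a common arrangement of the same Viterbi computation, and it works in log space to avoid underflow.

```python
import math
from collections import Counter
from typing import List, Tuple

def estimate_tables(tagged: List[List[Tuple[str, str]]], states: List[str],
                    vocab: List[str], alpha: float = 1.0):
    """Add-alpha smoothed transition and emission tables from tagged sentences."""
    trans, emit = Counter(), Counter()
    for sentence in tagged:
        prev = "<start>"
        for word, state in sentence:
            trans[(prev, state)] += 1
            emit[(state, word)] += 1
            prev = state

    def p_trans(prev: str, cur: str) -> float:
        total = sum(trans[(prev, s)] for s in states)
        return (trans[(prev, cur)] + alpha) / (total + alpha * len(states))

    def p_emit(state: str, word: str) -> float:
        total = sum(emit[(state, w)] for w in vocab)
        # Unknown words share one extra smoothed bucket.
        return (emit[(state, word)] + alpha) / (total + alpha * (len(vocab) + 1))

    return p_trans, p_emit

def viterbi(words: List[str], states: List[str], p_trans, p_emit) -> List[str]:
    """Most likely state sequence for a non-empty list of words."""
    # best[s]: log-probability of the best path that ends in state s.
    best = {s: math.log(p_trans("<start>", s)) + math.log(p_emit(s, words[0]))
            for s in states}
    back = []
    for word in words[1:]:
        pointers, scores = {}, {}
        for s in states:
            prev = max(states, key=lambda t: best[t] + math.log(p_trans(t, s)))
            pointers[s] = prev
            scores[s] = (best[prev] + math.log(p_trans(prev, s))
                         + math.log(p_emit(s, word)))
        back.append(pointers)
        best = scores
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return list(reversed(path))

# Toy training data (hypothetical), in the style of the seminar announcements.
tagged = [
    [("Place", "background"), (":", "background"),
     ("Wean", "location"), ("Hall", "location")],
    [("Speaker", "background"), (":", "background"),
     ("Sebastian", "speaker"), ("Thrun", "speaker")],
]
states = ["background", "location", "speaker"]
vocab = sorted({w for sentence in tagged for w, _ in sentence})
p_trans, p_emit = estimate_tables(tagged, states, vocab)
decoded = viterbi(["Place", ":", "Wean", "Hall"], states, p_trans, p_emit)
print(decoded)
```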
“Naïve Bayes” Sliding Window vs HMMs
GRAND CHALLENGES FOR MACHINE LEARNING
Jaime Carbonell School of Computer Science Carnegie Mellon University
3:30 pm 7500 Wean Hall
Machine learning has evolved from obscurity in the 1970s into a vibrant and popular discipline in artificial intelligence during the 1980s and 1990s. As a result of its success and growth, machine learning is evolving into a collection of related disciplines: inductive concept acquisition, analytic learning in problem solving (e.g. analogy, explanation-based learning), learning theory (e.g. PAC learning), genetic algorithms, connectionist learning, hybrid systems, and so on.
Domain: CMU UseNet Seminar Announcements
Field F1: Speaker 30%, Location 61%, Start Time 98%
Field F1: Speaker 77%, Location 79%, Start Time 98%
Design decisions: What are the output symbols (states) ?
What are the input symbols ?
Cohen => “Cohen”, “cohen”, “Xxxxx”, “Xx”, … ?
8217 => “8217”, “9999”, “9+”, “number”, … ?
[Figure: hierarchy of symbol abstractions rooted at “All”: Numbers (3-digits: 000..999; 5-digits: 00000..99999; Others: 0..99, 0000..9999, 000000..), Words (Chars: A..z; Multi-letter: aa..), and Delimiters (. , / - + ? #)]
Sarawagi et al: choose best abstraction level using holdout set
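The candidate input symbols listed above for “Cohen” and “8217” can be generated mechanically. The sketch below is one hypothetical way to do it; the function name `abstractions` and the exact set of levels are assumptions for illustration, and it writes a collapsed digit run as "9" rather than the slide's "9+". Choosing among the levels (for example with a holdout set, as in Sarawagi et al.) is not shown.

```python
import re
from typing import List

def abstractions(token: str) -> List[str]:
    """Return progressively coarser symbols for a token, most specific first."""
    # Map each character to a class: X (upper case), x (lower case), 9 (digit).
    shape = re.sub(r"[A-Z]", "X", token)
    shape = re.sub(r"[a-z]", "x", shape)
    shape = re.sub(r"[0-9]", "9", shape)
    # Collapse runs of the same class: "Xxxxx" -> "Xx", "9999" -> "9".
    collapsed = re.sub(r"(.)\1+", r"\1", shape)
    return [token, token.lower(), shape, collapsed]

print(abstractions("Cohen"))
print(abstractions("8217"))
```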
What is a symbol?
Ideally we would like to use many, arbitrary, overlapping features of words:
• identity of word
• ends in “-ski”
• is capitalized
• is part of a noun phrase
• is in a list of city names
• is under node X in WordNet
• is in bold font
• is indented
• is in hyperlink anchor
• …
[Figure: chain of states S_{t-1}, S_t, S_{t+1} with observations O_{t-1}, O_t, O_{t+1}; one observation is annotated with the overlapping features: is “Wisniewski”, ends in “-ski”, part of noun phrase]
Lots of learning systems are not confounded by multiple, non-independent features: decision trees, neural nets, SVMs, …
What is a symbol?
Idea: replace generative model in HMM with a maxent model, where state depends on observations
\Pr(s_t \mid x_t, \ldots)
What is a symbol?
Idea: replace generative model in HMM with a maxent model, where state depends on observations and previous state
\Pr(s_t \mid x_t, s_{t-1}, \ldots)
What is a symbol?
Idea: replace generative model in HMM with a maxent model, where state depends on observations and previous state history
\Pr(s_t \mid x_t, s_{t-1}, s_{t-2}, \ldots)
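A minimal sketch of such a local maxent model, conditioning on the current word and the previous state, is given below. Everything concrete in it is an assumption for illustration: the two-label state set, the feature names, and the hand-set weights (a real tagger would learn the weights from data). It shows how overlapping features such as “is capitalized” and “ends in -ski” can simply be added together inside a softmax.

```python
import math
from typing import Dict, List, Tuple

STATES = ["background", "person"]  # hypothetical label set

def features(word: str, prev_state: str) -> List[str]:
    """Overlapping binary features of the current word and the previous state."""
    feats = ["word=" + word.lower(), "prev=" + prev_state]
    if word[:1].isupper():
        feats.append("is_capitalized")
    if word.lower().endswith("ski"):
        feats.append("ends_in_ski")
    return feats

def local_distribution(word: str, prev_state: str,
                       weights: Dict[Tuple[str, str], float]) -> Dict[str, float]:
    """Pr(s_t | x_t, s_{t-1}) as a softmax over summed feature weights."""
    active = features(word, prev_state)
    scores = {s: sum(weights.get((f, s), 0.0) for f in active) for s in STATES}
    z = sum(math.exp(v) for v in scores.values())
    return {s: math.exp(v) / z for s, v in scores.items()}

# Hand-set illustrative weights, not learned values.
weights = {
    ("is_capitalized", "person"): 1.5,
    ("ends_in_ski", "person"): 2.0,
    ("prev=person", "person"): 1.0,
    ("word=flew", "background"): 2.0,
}
dist = local_distribution("Wisniewski", "person", weights)
print(dist)
```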
Ratnaparkhi’s MXPOST
• Sequential learning problem: predict POS tags of words.
• Uses MaxEnt model described above.
• Rich feature set.
• To smooth, discard features occurring < 10 times.
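The count cutoff in the last bullet can be sketched as a simple filter over training examples. This is only an illustration of the idea, not MXPOST's code; the function name `prune_rare_features` and the feature strings are hypothetical.

```python
from collections import Counter
from typing import List

def prune_rare_features(examples: List[List[str]],
                        min_count: int = 10) -> List[List[str]]:
    """Drop every feature that occurs fewer than min_count times overall."""
    counts = Counter(f for feats in examples for f in feats)
    return [[f for f in feats if counts[f] >= min_count] for feats in examples]

# Ten examples share two features; one example has a feature seen only once.
examples = [["is_capitalized", "word=the"]] * 10 + [["ends_in_ski"]]
pruned = prune_rare_features(examples)
print(pruned[0], pruned[-1])
```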
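Conditional Markov Models (CMMs) aka MEMMs aka Maxent Taggers vs HMMs
HMM (generative: each state S_t depends on S_{t-1} and emits the observation O_t):
\Pr(s, o) = \prod_i \Pr(s_i \mid s_{i-1}) \, \Pr(o_i \mid s_i)
CMM/MEMM (conditional: each state S_t depends on S_{t-1} and on the observation O_t):
\Pr(s \mid o) = \prod_i \Pr(s_i \mid s_{i-1}, o_i)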
What is “Information Extraction”
As a task: Filling slots in a database from sub-segments of text.
23rd July 2009 05:51 GMT
Microsoft was in violation of the GPL (General Public License) on the Hyper-V code it released to open source this week.
After Redmond covered itself in glory by opening up the code, it now looks like it may have acted simply to head off any potentially embarrassing legal dispute over violation of the GPL. The rest was theater.
As revealed by Stephen Hemminger - a principal engineer with open-source network vendor Vyatta - a network driver in Microsoft's Hyper-V used open-source components licensed under the GPL and statically linked to binary parts. The GPL does not permit the mixing of closed and open-source elements. …
Hemminger said he uncovered the apparent violation and contacted Linux Driver Project lead Greg Kroah-Hartman, a Novell programmer, to resolve the problem quietly with Microsoft. Hemminger apparently hoped to leverage Novell's interoperability relationship with Microsoft.
NAME                TITLE               ORGANIZATION
Stephen Hemminger   principal engineer  Vyatta
Greg Kroah-Hartman  programmer          Novell
Greg Kroah-Hartman  lead                Linux Driver Proj.
What is “Information Extraction”
Technique 1: NER + Segment + Classify Segments and Entities
23rd July 2009 05:51 GMT
Microsoft was in violation of the GPL (General Public License) on the Hyper-V code it released to open source this week.
After Redmond covered itself in glory by opening up the code, it now looks like it may have acted simply to head off any potentially embarrassing legal dispute over violation of the GPL. The rest was theater.
As revealed by Stephen Hemminger - a principal engineer with open-source network vendor Vyatta - a network driver in Microsoft's Hyper-V used open-source components licensed under the GPL and statically linked to binary parts. The GPL does not permit the mixing of closed and open-source elements. …
Hemminger said he uncovered the apparent violation and contacted Linux Driver Project lead Greg Kroah-Hartman, a Novell programmer, to resolve the problem quietly with Microsoft. Hemminger apparently hoped to leverage Novell's interoperability relationship with Microsoft.
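Segment classification, one label per paragraph of the article:
• “Microsoft was in violation of the GPL …”: does not contain worksAt fact
• “After Redmond covered itself in glory …”: does not contain worksAt fact
• “As revealed by Stephen Hemminger …”: does contain worksAt fact
• “Hemminger said he uncovered …”: does contain worksAt fact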
What is “Information Extraction”
Technique 1: NER + Segment + Classify Segments and Entities
As revealed by Stephen Hemminger - a principal engineer with open-source network vendor Vyatta - a network driver in Microsoft's Hyper-V used open-source components licensed under the GPL and statically linked to binary parts. The GPL does not permit the mixing of closed and open-source elements. …
Does contain worksAt fact
• Stephen Hemminger: is in the worksAt fact
• principal engineer: is in the worksAt fact
• Vyatta: is in the worksAt fact
• Microsoft: is not in the worksAt fact

NAME               TITLE               ORGANIZATION
Stephen Hemminger  principal engineer  Vyatta
What is “Information Extraction”
Technique 1: NER + Segment + Classify Segments and Entities
Because of Stephen Hemminger’s discovery, Vyatta was soon purchased by Microsoft for $1.5 billion…
Does contain an acquired fact
• Microsoft: is in the acquired fact, role=acquirer
• Vyatta: is in the acquired fact, role=acquiree
• Stephen Hemminger: is not in the acquired fact
• $1.5 billion: is in the acquired fact, role=price
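The last step of this decomposition, turning per-entity role decisions into a filled record, might look like the sketch below. It assumes the segment classifier and the entity role classifier have already run; the function name `build_fact` and the record layout are invented for the example. The error branch makes explicit that the scheme presumes at most one fact of a given type per segment.

```python
from typing import Dict, List, Optional, Tuple

def build_fact(relation: str, segment_has_fact: bool,
               entity_roles: List[Tuple[str, Optional[str]]]
               ) -> Optional[Dict[str, str]]:
    """Fill a relation's slots from per-entity role decisions in one segment."""
    if not segment_has_fact:
        return None
    fact = {"relation": relation}
    for entity, role in entity_roles:
        if role is None:  # the entity is not part of the fact
            continue
        if role in fact:
            # Two entities compete for one slot: more than one fact is present.
            raise ValueError("two entities claim role " + role)
        fact[role] = entity
    return fact

fact = build_fact("acquired", True, [
    ("Microsoft", "acquirer"),
    ("Vyatta", "acquiree"),
    ("Stephen Hemminger", None),
    ("$1.5 billion", "price"),
])
print(fact)
```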
What is “Information Extraction”
Technique 1: NER + Segment + Classify Segments and Entities
23rd July 2009 05:51 GMT
Hemminger said he uncovered the apparent violation and contacted Linux Driver Project lead Greg Kroah-Hartman, a Novell programmer, to resolve the problem quietly with Microsoft. Hemminger apparently hoped to leverage Novell's interoperability relationship with Microsoft.
Does contain worksAt fact (actually two of them), and that’s a problem: per-entity “is in the fact” labels do not say which title and organization go with which fact.
What is “Information Extraction”
Technique 2: NER + Segment + Classify EntityPairs from same segment
23rd July 2009 05:51 GMT
Hemminger said he uncovered the apparent violation and contacted Linux Driver Project lead Greg Kroah-Hartman, a Novell programmer, to resolve the problem quietly with Microsoft. Hemminger apparently hoped to leverage Novell's interoperability relationship with Microsoft.
Entities found in the segment: Hemminger, programmer, Microsoft, Novell, Greg Kroah-Hartman, Linux Driver Project, lead
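Generating the candidate pairs for this segment is straightforward; deciding which pairs actually stand in a relation is the learned part and is not shown. In the sketch below the entity types attached to each mention and the function name `typed_pairs` are assumptions for illustration.

```python
from itertools import combinations
from typing import List, Tuple

Entity = Tuple[str, str]  # (mention text, entity type)

def typed_pairs(entities: List[Entity], type_a: str, type_b: str
                ) -> List[Tuple[Entity, Entity]]:
    """Candidate pairs whose first member has type_a and second has type_b."""
    return [(a, b) for a in entities if a[1] == type_a
            for b in entities if b[1] == type_b]

entities = [
    ("Hemminger", "NAME"), ("programmer", "TITLE"),
    ("Microsoft", "ORGANIZATION"), ("Novell", "ORGANIZATION"),
    ("Greg Kroah-Hartman", "NAME"), ("Linux Driver Project", "ORGANIZATION"),
    ("lead", "TITLE"),
]
all_pairs = list(combinations(entities, 2))            # every unordered pair
works_at_candidates = typed_pairs(entities, "NAME", "ORGANIZATION")
print(len(all_pairs), len(works_at_candidates))
```

Each candidate pair would then be passed to a binary classifier (for example, "is this NAME employed by this ORGANIZATION?"), which allows a segment to yield several facts.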
About ACE
• http://www.nist.gov/speech/tests/ace/ and http://projects.ldc.upenn.edu/ace/
• The five year mission: “develop technology to extract and characterize meaning in human language”… in newswire text, speech, and images
  – EDT: Develop NER for: people, organizations, geo-political entities (GPE), location, facility, vehicle, weapon, time, value … plus subtypes (e.g., educational organizations)
  – RDC: identify relation between entities: located, near, part-whole, membership, citizenship, …
  – EDC: identify events like interaction, movement, transfer, creation, destruction and their arguments
Events, entities and mentions
• In ACE there is a distinction between an entity—a thing that exists in the Real World—and an entity mention—which is something that exists in the text (a substring).
• Likewise, an event is something that (will, might, or did) happen in the Real World, and an event mention is some text that refers to that event.
  – An event mention lives inside a sentence (the “extent”)
    • with a “trigger” (or anchor)
  – An event mention is defined by its type and subtype (e.g., Life:Marry, Transaction:TransferMoney) and its arguments
  – Every argument is an entity mention that has been assigned a role.
  – Arguments belong to the same event if they are associated with the same trigger.
• The entity-mention, trigger, extent, and argument are markup, and they also define a possible decomposition of the event-extraction task into subtasks.
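One way to picture this markup is as a few small record types, with arguments grouped into events by their trigger. The sketch below is not the ACE annotation format: the class names, the character-offset convention, the role label, and the example sentence “Ann married Bob” are all assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class EntityMention:
    text: str
    start: int  # character offsets in the sentence (hypothetical convention)
    end: int

@dataclass(frozen=True)
class Argument:
    mention: EntityMention
    role: str
    trigger: str  # the trigger this argument is associated with

def group_by_trigger(arguments: List[Argument]) -> Dict[str, List[Tuple[str, str]]]:
    """Arguments that share a trigger belong to the same event mention."""
    events = defaultdict(list)
    for arg in arguments:
        events[arg.trigger].append((arg.role, arg.mention.text))
    return dict(events)

# Extent: "Ann married Bob"; trigger: "married"; type and subtype: Life:Marry.
arguments = [
    Argument(EntityMention("Ann", 0, 3), "Person", "married"),
    Argument(EntityMention("Bob", 12, 15), "Person", "married"),
]
events = group_by_trigger(arguments)
print(events)
```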
The Webmaster Project:A Case Study
with Einat Minkov (LTI, now Haifa U), Anthony Tomasic (ISRI)
See IJCAI-2005 paper
Overview and Motivations
• What’s new:
  – Adaptive NLP components
  – Learn to adapt to changes in domain of discourse
  – Deep analysis in limited but evolving domain
• Compared to past NLP systems:
  – Deep analysis in narrow domain (Chat-80, SHRDLU, ...)
  – Shallow analysis in broad domain (POS taggers, NE recognizers, NP-chunkers, ...)
  – Learning used as tool to develop non-adaptive NLP components
• Details:
  – Assume DB-backed website, where schema changes over time
    • No other changes allowed (yet)
  – Interaction:
    • User requests (via NL email) changes in factual content of website (assume update of one tuple)
    • System analyzes request
    • System presents preview page and editable form version of request
• Key points:
  – partial correctness is useful
  – user can verify correctness (vs case for DB queries, q/a, ...) => source of training data
...something in between...
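[Figure: Webmaster system architecture. An email msg passes through Shallow NLP and Feature Building (features: POS tags, NP chunks, words, ...). NER produces entity1, entity2, ....; Classification (classifiers marked C) produces requestType, targetRelation, targetAttrib, and the entity roles newEntity1,..., oldEntity1,..., keyEntity1,..., otherEntity1,.... Update Request Construction uses the database and web page templates to produce a preview page and a user-editable form version of the request. The User is asked to confirm, which yields training data for an offline LEARNER.]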
Outline
• Training data/corpus
  – look at feasibility of learning the components that need to be adaptive, using a static corpus
• Analysis steps:
  – request type
  – entity recognition
  – role-based entity classification
  – target relation finding
  – target attribute finding
  – [request building]
• Conclusions/summary
Training data
Each user (User1, User2, User3, ....) writes a message for the same request:
• Mike Roborts should be Micheal Roberts in the staff listing, pls fix it. Thanks - W
• On the staff page, change Mike to Michael in the listing for “Mike Roberts”.
Training data
• Add this as Greg Johnson’s phone number: 412 281 2000
• Please add “412-281-2000” to greg johnson’s listing on the staff page.
Training data – entity names are made distinct
• Add this as Greg Johnson’s phone number: 412 281 2000
• Please add “543-341-8999” to fred flintstone’s listing on the staff page.
Modification: to make entity extraction reasonable, remove duplicate entities by replacing them with alternatives (preserving case, typos, etc.)
Training data
Users: User1, User2, User3, ....
Request1: message(user 1,req 1), message(user 2,req 1), ....
Request2: message(user 1,req 2), message(user 2,req 2), ....
Request3: message(user 1,req 3), message(user 2,req 3), ....
Training data – always test on a novel user?
Users: User1, User2, User3, ....
Request1: message(user 1,req 1), message(user 2,req 1), ....
Request2: message(user 1,req 2), message(user 2,req 2), ....
Request3: message(user 1,req 3), message(user 2,req 3), ....
Split by user: all messages from one user form the test set, and the messages from the other users form the training set.
Simulate a distribution of many users (harder to learn)
Training data – always test on a novel request?
Users: User1, User2, User3
Request1: message(user 1,req 1), message(user 2,req 1), ....
Request2: message(user 1,req 2), message(user 2,req 2), ....
Request3: message(user 1,req 3), message(user 2,req 3), ....
Split by request: all messages for one request form the test set, and the messages for the other requests form the training set.
Simulate a distribution of many requests (much harder to learn)
617 emails total + 96 similar ones
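Both evaluation protocols amount to holding out a whole group (one user, or one request) rather than random messages. A small sketch follows; the tuple layout, the function name `hold_out`, and the 3 x 3 toy grid are assumptions for illustration and do not reflect the actual corpus.

```python
from typing import List, Tuple

Message = Tuple[str, str, str]  # (user, request, text)

def hold_out(messages: List[Message], field: int, value: str
             ) -> Tuple[List[Message], List[Message]]:
    """Split so that every message whose `field` equals `value` is test data."""
    test = [m for m in messages if m[field] == value]
    train = [m for m in messages if m[field] != value]
    return train, test

users = ["user 1", "user 2", "user 3"]
requests = ["req 1", "req 2", "req 3"]
messages = [(u, r, "message(%s,%s)" % (u, r)) for u in users for r in requests]

train_u, test_u = hold_out(messages, 0, "user 3")  # test on a novel user
train_r, test_r = hold_out(messages, 1, "req 2")   # test on a novel request
print(len(train_u), len(test_u), len(train_r), len(test_r))
```

Repeating the split once per user (or per request) and averaging the scores gives a leave-one-group-out estimate.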
Training data – limitations
• One DB schema, one off-line dataset
  – May differ from data collected on-line
  – So, no claims made for tasks where data will be substantially different (i.e., entity recognition)
  – No claims made about incremental learning/transfer
• All learning problems considered separate
• One step of request-building is trivial for the schema considered:
  – Given entity E and relation R, to which attribute of R does E correspond?
  – So, we assume this mapping is trivial (general case requires another entity classifier)
Entity Extraction Results
• We assume a fixed set of entity types
  – no adaptivity needed (unclear if data can be collected)
• Evaluated:
  – hand-coded rules (approx cascaded FST in “Mixup” language)
  – learned classifiers with standard feature set and also a “tuned” feature set, which Einat tweaked
  – results are in F1 (harmonic avg of recall and precision)
  – two learning methods, both based on “token tagging”
    • Conditional Random Fields (CRF)
    • Voted-perceptron discriminative training for an HMM (VP-HMM)
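For reference, F1 over extracted entities can be computed as below. The sketch assumes exact-match scoring of (start, end, type) spans, which is one common convention and may differ from the scoring used in the experiments; the spans in the example are made up.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start token, end token, entity type)

def f1_score(predicted: Set[Span], gold: Set[Span]) -> float:
    """Harmonic mean of precision and recall under exact span matching."""
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)

gold = {(1, 3, "person"), (5, 7, "location")}
predicted = {(1, 3, "person"), (4, 7, "location"), (0, 1, "person")}
score = f1_score(predicted, gold)  # precision 1/3, recall 1/2
print(round(score, 3))
```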
Entity Classification Results
• Entity “roles”:
  – keyEntity: value used to retrieve a tuple that will be updated (“delete greg’s phone number”)
  – newEntity: value to be added to database (“William’s new office # is 5307 WH”).
  – oldEntity: value to be overwritten or deleted (“change mike to Michael in the listing for ...”)
  – irrelevantEntity: not needed to build the request (“please add .... – thanks, William”)
Features:
• closest preceding preposition
• closest preceding “action verb” (add, change, delete, remove, ...)
• closest preceding word which is a preposition, action verb, or determiner (in “determined” NP)
• is entity followed by ’s
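The four features listed above can be computed from a tokenized message and an entity's token span. The sketch below is illustrative only: the word lists are small hypothetical stand-ins, the function name `role_features` is invented, and it assumes the tokenizer splits the possessive into its own token.

```python
from typing import Dict, List, Optional, Set

PREPOSITIONS = {"to", "for", "in", "on", "from", "of", "as"}  # hypothetical list
ACTION_VERBS = {"add", "change", "delete", "remove"}
DETERMINERS = {"the", "a", "an", "this"}

def role_features(tokens: List[str], start: int, end: int) -> Dict[str, object]:
    """Features for the entity that occupies tokens[start:end]."""
    before = [t.lower() for t in tokens[:start]]

    def closest(vocab: Set[str]) -> Optional[str]:
        # Scan leftwards from the entity for the nearest word in vocab.
        return next((t for t in reversed(before) if t in vocab), None)

    return {
        "prev_prep": closest(PREPOSITIONS),
        "prev_action_verb": closest(ACTION_VERBS),
        "prev_prep_verb_or_det": closest(PREPOSITIONS | ACTION_VERBS | DETERMINERS),
        "followed_by_possessive": end < len(tokens) and tokens[end] in {"'s", "’s"},
    }

tokens = ["Please", "add", "412-281-2000", "to", "greg", "johnson", "'s",
          "listing", "on", "the", "staff", "page", "."]
name_feats = role_features(tokens, 4, 6)   # "greg johnson"
phone_feats = role_features(tokens, 2, 3)  # "412-281-2000"
print(name_feats)
print(phone_feats)
```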
Reasonable results with “bag of words” features.
Request type classification: addTuple, alterValue, deleteTuple, or deleteValue?
• Can be determined from entity roles, except for deleteTuple and deleteValue.
  – “Delete the phone # for Scott” vs “Delete the row for Scott”
• Features:
  – counts of each entity role
– action verbs
– nouns in NPs which are (probably) objects of action verb
– (optionally) same nouns, tagged with a dictionary
• Target attributes are similar
Comments:
• Very little data is available
• Twelve words of schema-specific knowledge: dictionary of terms like phone, extension, room, office, ...
Conclusions?
• System architecture allows all schema-dependent knowledge to be learned
  – Potential to adapt to changes in schema
  – Data needed for learning can be collected from user
• Learning appears to be possible on reasonable time-scales
  – 10s or 100s of relevant examples, not thousands
  – Schema-independent linguistic knowledge is useful
• F1 in the eighties is possible on almost all subtasks.
  – Counter-examples are rarely changed relations (budget) and distinctions for which little data is available
• There is substantial redundancy in different subtasks
  – Opportunity for learning suites of probabilistic classifiers, etc
• Even an imperfect IE system can be useful….
  – With the right interface…
Webmaster: the Epilog (VIO)
• Faster for request-submitter
• Zero time for webmaster
• Zero latency
• More reliable (!)
Tomasic et al, IUI 2006
• Entity F1 ~= 84
• Micro-form selection accuracy ~= 80
• Used UI for experiments on real people (human-human, human-VIO)
Conclusions and comments
• Two case studies of non-trivial IE pipelines illustrate:
  – In any pipeline, errors propagate
  – What’s the right way of training components in a pipeline? Independently? How can (and when should) we make decisions using some flavor of joint inference?
• Some practical questions for pipeline components:
  – What’s downstream? What do errors cost?
  – Often we can’t see the end of the pipeline…
• How robust is the method?
  – new users, new newswire sources, new upstream components…
  – Do different learning methods/feature sets differ in robustness?
• Some concrete questions for learning relations between entities:
  – (When) is classifying pairs of things the right approach? How do you represent pairs of objects? How do you represent structure, like dependency parses? Kernels? Special features?