Syntax
Syntax
• The study of how words are ordered and grouped together
• Key concept: constituent = a sequence of words that acts as a unit
{ he | the man | the short man | the short man with the large hat } went { home | to his house | out of the car | with her }
Phrase Structure

Parse tree for "She saw a tall man with a telescope" (bracketed form):
[S [NP [PN She]] [VP [VBD saw] [NP a tall man] [PP [PRP with] [NP a telescope]]]]
Noun Phrases
• Contains a noun plus descriptors, including:
  – Determiner: the, a, this, that
  – Adjective phrases: green, very tall
  – Head: the main noun in the phrase
  – Post-modifiers: prepositional phrases or relative clauses
That old green couch of yours that I want to throw out
det adj adj head PP relative clause
Verb Phrases
• Contains a verb (the head) with modifiers and other elements that depend on the verb
want to throw out
head PP
previously saw the man in the park with her telescope
adv head direct object PP
might have showed his boss the code yesterday
modal aux head indirect object direct object adverb
Prepositional Phrases
• Preposition as head and NP as complement
with her grey poodle
head complement
Adjective Phrases
• Adjective as head with modifiers
extremely sure that he would win
adv head relative clause
Shallow Parsing
• Extract phrases from text as ‘chunks’
• Flat, no tree structures
• Usually based on patterns of POS tags
• Full parsing conceived of as two steps:
  – Chunking / shallow parsing
  – Attachment of chunks to each other
Noun Phrases
• Base Noun Phrase: a noun phrase that does not contain other noun phrases as a component
• Or, no modification to the right of the head
  – a large green cow
  – The United States Government
  – every poor shop-owner’s dream ?
  – other methods and techniques ?
Manual Methodology
• Build a regular expression over POS tags
• E.g.:
  DT? (ADJ | VBG)* (NN)+
• Very hard to do accurately
• Lots of manual labor
• Cannot be easily tuned to a specific corpus
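A minimal sketch of this approach in Python (the function name, the tag-string encoding, and the example tags are illustrative, not part of the original method): the POS sequence is flattened into a delimited string so an ordinary regex engine can match the pattern above over whole tags.

```python
import re

# Pattern from the slide: DT? (ADJ | VBG)* (NN)+
# Each tag is written as "/TAG " so matches cannot start or end mid-tag.
NP_PATTERN = re.compile(r"(?:/DT )?(?:/(?:ADJ|VBG) )*(?:/NN )+")

def chunk_base_nps(tagged):
    """tagged: list of (word, pos) pairs -> list of (start, end) token spans."""
    tag_string = "".join("/" + pos + " " for _, pos in tagged)
    spans = []
    for m in NP_PATTERN.finditer(tag_string):
        start = tag_string[:m.start()].count("/")   # tokens before the match
        length = m.group(0).count("/")              # tokens inside the match
        spans.append((start, start + length))
    return spans

tagged = [("the", "DT"), ("tall", "ADJ"), ("man", "NN"), ("ran", "VBD"),
          ("with", "PRP"), ("blinding", "VBG"), ("speed", "NN")]
print(chunk_base_nps(tagged))   # -> [(0, 3), (5, 7)]
```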
Chunk Tags
• Represent NPs by tags:
  [the tall man] ran with [blinding speed]
   DT  ADJ  NN1  VBD PRP   VBG      NN0
   I   I    I    O   O     I        I
• Need B tag for adjacent NPs:
  On [Tuesday] [the company] went bankrupt
  O    I         B    I       O    O
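The scheme can be made concrete with a small helper (an assumed sketch, not from the slides) that converts bracketed chunk spans into per-token I/O/B tags:

```python
# Tokens inside a chunk get "I", except that the first token of a chunk that
# immediately follows another chunk gets "B"; tokens outside any chunk get "O".
def chunks_to_iob(tokens, chunks):
    """tokens: list of words; chunks: sorted, non-overlapping (start, end) spans."""
    tags = ["O"] * len(tokens)
    prev_end = None
    for start, end in chunks:
        for i in range(start, end):
            tags[i] = "I"
        if prev_end == start:          # chunk starts right after another chunk
            tags[start] = "B"
        prev_end = end
    return tags

tokens = "On Tuesday the company went bankrupt".split()
print(chunks_to_iob(tokens, [(1, 2), (2, 4)]))
# -> ['O', 'I', 'B', 'I', 'O', 'O']
```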
Transformational Learning
• Baseline tagger:
  – Most frequent chunk tag for POS or word
• Rule templates (100 total), conditioning on:
  – current word/POS
  – word/POS 1 on left/right
  – current and left/right word/POS
  – word/POS on left and on right
  – in two words/POSs on left/right
  – in three words/POSs on left/right
  – current ctag
  – current and left ctag
  – current and right ctag
  – in two ctags to left
  – in two ctags to right
Some Rules Learned
1. (T1=O, P0=JJ): I → O
2. (T-2=I, T-1=I, P0=DT): → B
3. (T-2=O, T-1=I, P-1=DT): → I
4. (T-1=I, P0=WDT): I → B
5. (T-1=I, P0=PRP): I → B
6. (T-1=I, W0=who): I → B
7. (T-1=I, P0=CC, P1=NN): O → I
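A sketch of how rules like these could be applied, in the order learned, to a chunk-tag sequence (the rule encoding and helper names are assumptions made for illustration; feature names follow the slide: T = chunk tag, P = POS, W = word, with relative offsets):

```python
# Each rule is (conditions, from_tag, to_tag); conditions map feature names
# such as "T-1" (chunk tag one position to the left) or "P0" (current POS) to
# required values.  from_tag=None means the rule fires regardless of the
# current chunk tag.
def apply_rules(words, pos, ctags, rules):
    ctags = list(ctags)
    for conds, from_tag, to_tag in rules:
        for i in range(len(ctags)):
            if from_tag is not None and ctags[i] != from_tag:
                continue
            if all(_feature(words, pos, ctags, i, name) == val
                   for name, val in conds.items()):
                ctags[i] = to_tag
    return ctags

def _feature(words, pos, ctags, i, name):
    kind, offset = name[0], int(name[1:])        # e.g. "T-1" -> ("T", -1)
    seq = {"T": ctags, "P": pos, "W": words}[kind]
    j = i + offset
    return seq[j] if 0 <= j < len(seq) else None

# Rule 5 above: (T-1=I, P0=PRP) I -> B, i.e. start a new NP after another NP.
rules = [({"T-1": "I", "P0": "PRP"}, "I", "B")]
print(apply_rules(["Tuesday", "he"], ["NNP", "PRP"], ["I", "I"], rules))
# -> ['I', 'B']
```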
Results

| Training   | Prec. | Recall | Tag Acc. |
|------------|-------|--------|----------|
| Baseline   | 78.2  | 81.9   | 94.5     |
| 50K        | 89.8  | 90.4   | 96.9     |
| 100K       | 91.3  | 91.8   | 97.2     |
| 200K       | 91.8  | 92.3   | 97.4     |
| 200K nolex | 90.5  | 90.7   | 97.0     |
| 950K       | 93.1  | 93.5   | 97.8     |

• Precision = fraction of NPs predicted that are correct
• Recall = fraction of actual NPs that are found
Memory-Based Learning
• Match test data to previously seen data and classify based on the most similar previously seen instances
• E.g.:
  the saw was
  she saw the
  boy saw three
  boy saw the
  boy ate the
k-Nearest Neighbor (kNN)
• Find k most similar training examples
• Let them ‘vote’ on the correct class for the test example
  – Weight neighbors by distance from test
• Main problem: defining ‘similar’
  – Shallow parsing: overlap of words and POS
  – Use feature weighting...
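A minimal weighted-overlap kNN sketch along these lines (illustrative only: the instance encoding, the feature weights, and the class labels are invented, not taken from the slides):

```python
from collections import Counter

# Each instance is a tuple of feature values (e.g. surrounding words or POS
# tags); similarity is the weighted number of positions on which two
# instances agree, and neighbors vote with their similarity as weight.
def knn_classify(test, memory, weights, k=3):
    """memory: list of (features, label); weights: one weight per feature."""
    def similarity(a, b):
        return sum(w for w, x, y in zip(weights, a, b) if x == y)
    neighbors = sorted(memory, key=lambda inst: similarity(test, inst[0]),
                       reverse=True)[:k]
    votes = Counter()
    for feats, label in neighbors:
        votes[label] += similarity(test, feats)
    return votes.most_common(1)[0][0]

memory = [(("the", "saw", "was"), "a"),      # labels invented for illustration
          (("she", "saw", "the"), "b"),
          (("boy", "saw", "three"), "b"),
          (("boy", "ate", "the"), "a")]
weights = [0.5, 2.0, 0.5]                    # middle word weighted most heavily
print(knn_classify(("boy", "saw", "the"), memory, weights, k=3))   # -> 'b'
```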
Information Gain
• Not all features are created equal (e.g. saw in the previous example is more important)
• Weight the features by information gain = how much does f distinguish different classes

$$H(X) = -\sum_{x} P(x)\,\log_2 P(x)$$

$$w(f) = \frac{H(C) - \sum_{v_i \in V(f)} P(f = v_i)\, H(C \mid f = v_i)}{H(V(f))}$$
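A small sketch of computing such weights from labelled instances (the instance format and the toy labels are assumptions): for each feature position it measures the drop in class entropy when that feature's value is known, normalised by H(V(f)) as in the formula above.

```python
import math
from collections import Counter, defaultdict

def entropy(counts):
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values() if c)

def feature_weights(instances):
    """instances: list of (feature_tuple, label); returns one weight per position."""
    h_c = entropy(Counter(label for _, label in instances))
    n = len(instances)
    weights = []
    for f in range(len(instances[0][0])):
        by_value = defaultdict(Counter)              # value of f -> class counts
        for feats, label in instances:
            by_value[feats[f]][label] += 1
        remainder = sum(sum(c.values()) / n * entropy(c)
                        for c in by_value.values())
        split_info = entropy(Counter(feats[f] for feats, _ in instances))
        weights.append((h_c - remainder) / split_info if split_info else 0.0)
    return weights

data = [(("the", "saw", "was"), "a"), (("she", "saw", "the"), "a"),
        (("boy", "saw", "three"), "a"), (("boy", "ate", "the"), "b")]
print(feature_weights(data))   # the middle-word position gets the largest weight
```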
[Figure: class distributions over a feature’s values for classes C1–C4, contrasting a feature with high information gain against one with low information gain]
Base Verb Phrase
• Verb phrase not including NPs or PPs
[NP Pierre Vinken NP] , [NP 61 years NP] old , [VP will soon be joining VP] [NP the board NP] as [NP a nonexecutive director NP] .
Results
• Context: 2 words and POS on left and 1 word and POS on right

| Task | Context    | Prec. | Recall | Acc. |
|------|------------|-------|--------|------|
| bNP  | curr. word | 76    | 80     | 93   |
| bNP  | curr. POS  | 80    | 82     | 95   |
| bNP  | 2 – 1      | 94    | 94     | 98   |
| bVP  | curr. word | 68    | 73     | 96   |
| bVP  | curr. POS  | 75    | 89     | 97   |
| bVP  | 2 – 1      | 94    | 96     | 99   |
Efficiency of MBL
• Finding the neighbors can be costly
• Possibility: build a decision tree based on the information gain of the features to index the data = approximate kNN

[Figure: decision-tree index whose root tests W0 (e.g. saw, the, boy), with lower levels testing features such as W-1, P-1, P-2]
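One way to realise such an index is an IGTree-style sketch (details assumed, not taken from the slides): split the training instances on features in decreasing order of weight, store the most frequent class at every node, and at lookup time follow matching feature values as deep as possible, falling back to the default class of the last node reached.

```python
from collections import Counter

def build_index(instances, feature_order):
    """instances: list of (feature_tuple, label); feature_order: feature
    indices sorted by decreasing weight (most informative first)."""
    node = {"default": Counter(lbl for _, lbl in instances).most_common(1)[0][0],
            "children": {}}
    if feature_order and len(instances) > 1:
        f, rest = feature_order[0], feature_order[1:]
        groups = {}
        for feats, lbl in instances:
            groups.setdefault(feats[f], []).append((feats, lbl))
        node["feature"] = f
        node["children"] = {v: build_index(g, rest) for v, g in groups.items()}
    return node

def lookup(node, feats):
    while node["children"] and feats[node["feature"]] in node["children"]:
        node = node["children"][feats[node["feature"]]]
    return node["default"]

data = [(("the", "saw", "was"), "a"), (("she", "saw", "the"), "a"),
        (("boy", "ate", "the"), "b")]
tree = build_index(data, feature_order=[1, 0, 2])   # middle word first
print(lookup(tree, ("boy", "saw", "three")))        # -> 'a'
```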
MBSL
• Memory-based technique relying on the sequential nature of the data
  – Use “tiles” of phrases in memory to “cover” a new candidate (and its context), and compute a tiling score

went to the white house for dinner
VBD PRP [[ DT ADJ NN1 ]] PRP NN1
PRP [NP DT
[NP DT ADJ NN1
NN1 NP] PRP
PRP [NP DT ADJ
ADJ NN1 NP]
Tile Evidence
• Memory:
  [NP DT NN1 NP] VBD [NP DT NN1 NN1 NP] [NP NN2 NP] .
  [NP ADJ NN2 NP] AUX VBG PRP [NP DT ADJ NN1 NP] .
• Some tiles:
  [NP DT          pos=3  neg=0
  [NP DT NN1      pos=2  neg=0
  DT NN1 NP]      pos=1  neg=1
  NN1 NP]         pos=3  neg=1
  NN1 NP] VBD     pos=1  neg=0
• Score tile t by ft(t) = pos / total; only keep tiles that pass a threshold ft(t) > θ
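A sketch of gathering this tile evidence, under one simple reading of the method (assumptions: memory sentences are stored as token sequences mixing POS tags and NP brackets; a tile's positive count is its verbatim frequency in memory, and its negative count is the frequency of the same POS sequence with different bracketing). On the two memory sentences above this reproduces the counts shown on the slide:

```python
def occurrences(seq, sub):
    return sum(1 for i in range(len(seq) - len(sub) + 1)
               if seq[i:i + len(sub)] == sub)

def strip_brackets(seq):
    return [s for s in seq if s not in ("[NP", "NP]")]

def tile_score(tile, memory):
    """Return (pos, neg, ft) for a tile given bracketed memory sentences."""
    pos = sum(occurrences(sent, tile) for sent in memory)
    skeleton = strip_brackets(tile)
    total = sum(occurrences(strip_brackets(sent), skeleton) for sent in memory)
    return pos, total - pos, (pos / total if total else 0.0)

memory = [
    "[NP DT NN1 NP] VBD [NP DT NN1 NN1 NP] [NP NN2 NP] .".split(),
    "[NP ADJ NN2 NP] AUX VBG PRP [NP DT ADJ NN1 NP] .".split(),
]
print(tile_score("[NP DT".split(), memory))       # -> (3, 0, 1.0)
print(tile_score("NN1 NP]".split(), memory))      # -> (3, 1, 0.75)
```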
Covers
• Tile t1 connects to t2 in a candidate if:
  – t2 starts after t1
  – there is no gap between them (there may be overlap)
  – t2 ends after t1

  VBD PRP [[ DT ADJ NN1 ]] PRP NN1
  PRP [NP DT
      [NP DT ADJ
          NN1 NP] PRP

• A sequence of tiles covers a candidate if
  – each tile connects to the next
  – the tiles collectively match the entire candidate, including the brackets and possibly some context
Cover Graph

[Figure: cover graph for “VBD PRP [[ DT ADJ NN1 ]] PRP NN1” — virtual START and END nodes, with the tiles PRP [NP DT, [NP DT ADJ NN1, NN1 NP] PRP, PRP [NP DT ADJ, and ADJ NN1 NP] as intermediate nodes; edges follow the connects relation, so every START→END path is a cover]
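A sketch of building such a cover graph, with tiles represented as (start, end) token spans over the candidate-plus-context sequence (the span representation and function names are assumptions):

```python
def connects(t1, t2):
    """t2 starts after t1, with no gap (overlap allowed), and ends after t1."""
    return t2[0] > t1[0] and t2[0] <= t1[1] and t2[1] > t1[1]

def cover_graph(tiles, cand_start, cand_end):
    """tiles: (start, end) spans (end exclusive); cand_start/cand_end delimit
    the bracketed candidate that every cover must span."""
    graph = {"START": [t for t in tiles if t[0] <= cand_start], "END": []}
    for t1 in tiles:
        graph[t1] = [t2 for t2 in tiles if connects(t1, t2)]
        if t1[1] >= cand_end:
            graph[t1].append("END")
    return graph

# Three overlapping tiles covering a candidate spanning positions 2..6.
print(cover_graph([(0, 3), (2, 6), (5, 8)], cand_start=2, cand_end=7))
```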
Measures of ‘Goodness’
• Number of different covers
• Size of smallest cover (fewest tiles)
• Maximum context in any cover (left + right)
• Maximum overlap of tiles in any cover
• Grand total positive evidence divided by grand total positive+negative evidence
Combine these measures by linear weighting
Scoring a Candidate

CandidateScore(candidate, T):
• G ← CoverGraph(candidate, T)
• Compute statistics by DFS on G
• Compute candidate score as a linear function of the statistics

Complexity (O(l) tiles in a candidate of length l):
– Creating the cover graph is O(l²)
– DFS is O(V+E) = O(l²)
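Two of these statistics can be read off the graph with a plain DFS, as in this sketch (graph representation as in the earlier cover-graph sketch; memoisation on nodes would keep the work polynomial, and is omitted here for brevity):

```python
def num_covers(graph, node="START"):
    """Number of distinct START -> END paths, i.e. different covers."""
    if node == "END":
        return 1
    return sum(num_covers(graph, nxt) for nxt in graph.get(node, []))

def smallest_cover(graph, node="START"):
    """Fewest tiles in any cover from `node`, or None if END is unreachable."""
    if node == "END":
        return 0
    best = None
    for nxt in graph.get(node, []):
        sub = smallest_cover(graph, nxt)
        if sub is None:
            continue
        cost = sub + (0 if nxt == "END" else 1)   # count tile nodes, not END
        if best is None or cost < best:
            best = cost
    return best

# Toy graph with a single cover made of two tiles.
graph = {"START": [(0, 3)], (0, 3): [(2, 6)], (2, 6): ["END"]}
print(num_covers(graph), smallest_cover(graph))   # -> 1 2
```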
Full Algorithm

MBSL(sent, θC, T)
1. For each subsequence of sent, do:
   1. Construct a candidate s by adding brackets [[ and ]] before and after the subsequence
   2. fC(s) ← CandidateScore(s, T)
   3. If fC(s) > θC, then add s to candidate-set
2. For each c in candidate-set in decreasing order of fC(c), do:
1. Remove all candidates overlapping with c from candidate-set
3. Return candidate-set as target instances
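A sketch of this driver loop (the candidate_score argument stands in for CandidateScore; the span handling and names are assumptions for illustration):

```python
def mbsl(sent, theta_c, tiles, candidate_score):
    """sent: list of POS tags; returns non-overlapping (start, end) target spans."""
    candidates = []
    for i in range(len(sent)):
        for j in range(i + 1, len(sent) + 1):
            score = candidate_score(sent, i, j, tiles)   # score of [[ sent[i:j] ]]
            if score > theta_c:
                candidates.append((score, i, j))
    selected = []
    for score, i, j in sorted(candidates, reverse=True):        # best first
        if all(j <= si or i >= sj for _, si, sj in selected):   # no overlap
            selected.append((score, i, j))
    return [(i, j) for _, i, j in selected]
```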
Results

| Target type | Context size | θ   | Prec. | Recall |
|-------------|--------------|-----|-------|--------|
| NP          | 3            | 0.6 | 92    | 92     |
| SV          | 3            | 0.6 | 89    | 85     |
| VO          | 2            | 0.5 | 77    | 90     |