Parsing with Context Free Grammars Reading: Chap 13, Jurafsky & Martin This slide set was adapted...

  • Slide 1
  • Parsing with Context Free Grammars. Reading: Chap 13, Jurafsky & Martin. This slide set was adapted from J. Martin, U. Colorado. Instructor: Paul Tarau, based on Rada Mihalcea's original slides
  • Slide 2
  • Parsing
      • Parsing with CFGs refers to the task of assigning correct trees to input strings
      • "Correct" here means a tree that covers all and only the elements of the input and has an S at the top
      • It doesn't actually mean that the system can select the correct tree from among the possible trees
      • As with everything of interest, parsing involves a search that involves the making of choices
  • Slide 3
  • Some assumptions
      • Assume:
          • You have all the words already in some buffer
          • The input is/isn't POS tagged
          • All the words are known
      • These are all (quite) feasible
          • State of the art in POS tagging?
          • All words are known?
  • Slide 4
  • Top-Down Parsing
      • Since we are trying to find trees rooted with an S (Sentence), start with the rules that give us an S
      • Then work your way down from there to the words
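The top-down strategy above can be sketched as a small backtracking recognizer. This is a minimal illustration, not the slides' own code: the toy `GRAMMAR`, the `LEXICON`, and the function names `expand`, `expand_sequence`, and `recognize` are all assumptions made for the example, and the grammar is deliberately free of left recursion.

```python
# Toy grammar and lexicon (hypothetical, chosen only for illustration).
GRAMMAR = {
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
    "VP": [["Verb", "NP"], ["Verb"]],
}
LEXICON = {
    "book": {"Verb", "Noun"},
    "that": {"Det"},
    "flight": {"Noun"},
}


def expand(symbol, words, start):
    """Yield every position where `symbol` can end if it starts at `start`."""
    if symbol not in GRAMMAR:
        # A part-of-speech tag: it must match the next input word.
        if start < len(words) and symbol in LEXICON.get(words[start], set()):
            yield start + 1
        return
    # A non-terminal: try its rules in order, depth-first.
    for rhs in GRAMMAR[symbol]:
        yield from expand_sequence(rhs, words, start)


def expand_sequence(symbols, words, start):
    """Yield every end position for the symbol sequence, left to right."""
    if not symbols:
        yield start
        return
    for mid in expand(symbols[0], words, start):
        yield from expand_sequence(symbols[1:], words, mid)


def recognize(words):
    """Start from S and accept if some expansion covers all the words."""
    return any(end == len(words) for end in expand("S", words, 0))


print(recognize(["book", "that", "flight"]))
print(recognize(["that", "flight"]))
```

The search begins at S and only consults the words once it reaches part-of-speech tags, so it may explore expansions (such as S -> NP VP for an input starting with a verb) that the input rules out immediately.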
  • Slide 5
  • Top-Down Space
  • Slide 6
  • Bottom-Up Parsing
      • Of course, we also want trees that cover the input words
      • So start with trees that link up with the words in the right way
      • Then work your way up from there
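A bottom-up search can be sketched in the same spirit: start from the part-of-speech tags of the words and keep replacing a rule's right-hand side with its left-hand side until only S remains. The grammar, lexicon, and function names below are hypothetical, and the exhaustive search is meant to show the idea rather than to be efficient.

```python
from itertools import product

# Toy grammar as (left-hand side, right-hand side) pairs; hypothetical.
GRAMMAR = [
    ("S", ("VP",)),
    ("S", ("NP", "VP")),
    ("NP", ("Det", "Nominal")),
    ("Nominal", ("Noun",)),
    ("VP", ("Verb", "NP")),
    ("VP", ("Verb",)),
]
LEXICON = {"book": ["Verb", "Noun"], "that": ["Det"], "flight": ["Noun"]}


def reductions(form):
    """Yield each form obtained by reducing one right-hand side to its left-hand side."""
    for lhs, rhs in GRAMMAR:
        n = len(rhs)
        for i in range(len(form) - n + 1):
            if form[i:i + n] == rhs:
                yield form[:i] + (lhs,) + form[i + n:]


def recognize(words):
    """Search upward from the words; accept if some reduction sequence reaches S."""
    # One starting form per way of tagging the words.
    agenda = [tuple(tags) for tags in product(*(LEXICON.get(w, []) for w in words))]
    seen = set(agenda)
    while agenda:
        form = agenda.pop()
        if form == ("S",):
            return True
        for nxt in reductions(form):
            if nxt not in seen:  # do not revisit a form already explored
                seen.add(nxt)
                agenda.append(nxt)
    return False


print(recognize(["book", "that", "flight"]))
print(recognize(["that", "flight"]))
```

Every form the search builds is consistent with the words, but some are dead ends: reducing Verb to VP and then to S early on leaves `("S", "Det", "Noun")`, which can never become a single S.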
  • Slide 7
  • Bottom-Up Space
  • Slide 8
  • Top-Down vs. Bottom-Up
      • Top-down
          • Only searches for trees that can be answers
          • But suggests trees that are not consistent with the words
          • Guarantees that the tree starts with S as root
          • Does not guarantee that the tree will match the input words
      • Bottom-up
          • Only forms trees consistent with the words
          • Suggests trees that make no sense globally
          • Guarantees that the tree matches the input words
          • Does not guarantee that the parse tree will lead to S as a root
      • Combine the advantages of the two by doing a search constrained from both sides (top and bottom)
  • Slide 9
  • Top-Down, Depth-First, Left-to-Right Search + Bottom-up Filtering
  • Slide 10
  • Example (cont'd)
  • Slide 11
  • Example (cont'd)
  • Slide 12
  • Example (cont'd)
  • Slide 13
  • Possible Problem: Left-Recursion
      • What happens in the following situation?
          • S -> NP VP
          • S -> Aux NP VP
          • NP -> NP PP
          • NP -> Det Nominal
      • With the sentence starting with "Did the flight..."
  • Slide 14
  • Solution: Rule Ordering
      • S -> Aux NP VP
      • S -> NP VP
      • NP -> Det Nominal
      • NP -> NP PP
      • The key for the NP is that you want the recursive option after any base case
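The effect of rule ordering can be reproduced with a naive top-down, depth-first recognizer that tries rules strictly in the order listed. The sketch below is an illustration under assumptions: the extra rules for Nominal, VP, and PP, the lexicon, the test sentence "did the flight leave", and all function names are invented for the example.

```python
def ends(symbol, words, start, grammar, lexicon):
    """Yield each position where `symbol` can end when it starts at `start`."""
    if symbol not in grammar:  # part-of-speech tag: match one word
        if start < len(words) and symbol in lexicon.get(words[start], ()):
            yield start + 1
        return
    for rhs in grammar[symbol]:  # rules are tried strictly in listed order
        yield from seq_ends(rhs, words, start, grammar, lexicon)


def seq_ends(symbols, words, start, grammar, lexicon):
    """Yield each end position for a sequence of symbols."""
    if not symbols:
        yield start
        return
    for mid in ends(symbols[0], words, start, grammar, lexicon):
        yield from seq_ends(symbols[1:], words, mid, grammar, lexicon)


def outcome(words, grammar, lexicon):
    """Report what the depth-first search does on this input."""
    try:
        found = any(e == len(words) for e in ends("S", words, 0, grammar, lexicon))
    except RecursionError:
        return "infinite recursion"
    return "parsed" if found else "no parse"


# Rules shared by both orderings (hypothetical additions to the slide's grammar).
REST = {"Nominal": [("Noun",)], "VP": [("Verb",)], "PP": [("Prep", "NP")]}
LEXICON = {"did": {"Aux"}, "the": {"Det"}, "flight": {"Noun"}, "leave": {"Verb"}}

RECURSIVE_FIRST = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP")],
    "NP": [("NP", "PP"), ("Det", "Nominal")],
    **REST,
}
BASE_CASE_FIRST = {
    "S": [("Aux", "NP", "VP"), ("NP", "VP")],
    "NP": [("Det", "Nominal"), ("NP", "PP")],
    **REST,
}

sentence = ["did", "the", "flight", "leave"]
print(outcome(sentence, RECURSIVE_FIRST, LEXICON))
print(outcome(sentence, BASE_CASE_FIRST, LEXICON))
```

With the recursive rule first, expanding NP immediately asks for another NP at the same position and never consumes a word. With the base case first, the first parse is found before the recursive rule is ever tried. Ordering only postpones the problem in this sketch: an input that forces backtracking into NP -> NP PP, such as "did the flight" with no verb, still recurses without bound.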
  • Slide 15
  • Avoiding Repeated Work
      • Parsing is hard, and slow. It's wasteful to redo stuff over and over and over.
      • Consider an attempt to top-down parse the following as an NP:
          • A flight from Indianapolis to Houston on TWA
  • Slide 16
  • Slide 17
  • Slide 18
  • Slide 19
  • Slide 20
  • Dynamic Programming
      • We need a method that fills a table with partial results and that:
          • Does not do (avoidable) repeated work
          • Does not fall prey to left-recursion
          • Solves an exponential problem in (approximately) polynomial time
  • Slide 21
  • Earley Parsing
      • Fills a table in a single sweep over the input words
          • Table is length N+1; N is the number of words
          • Table entries represent:
              • Completed constituents and their locations
              • In-progress constituents
              • Predicted constituents
  • Slide 22
  • States
      • The table entries are called states and are represented with dotted rules
          • S -> • VP: a VP is predicted
          • NP -> Det • Nominal: an NP is in progress
          • VP -> V NP •: a VP has been found
  • Slide 23
  • States/Locations
      • It would be nice to know where these things are in the input, so:
          • S -> • VP [0,0] (Predictor): a VP is predicted at the start of the sentence
          • NP -> Det • Nominal [1,2] (Scanner): an NP is in progress; the Det goes from 1 to 2
          • VP -> V NP • [0,3] (Completer): a VP has been found starting at 0 and ending at 3
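One way to hold such a state in code is a small immutable record with the rule, the dot position, and the span. The class name `State` and its field names are assumptions for this sketch, not something the slides define.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """A dotted rule plus the input span it covers (names are hypothetical)."""
    lhs: str     # left-hand side of the rule
    rhs: tuple   # right-hand side symbols
    dot: int     # how many right-hand side symbols have been found so far
    start: int   # where the constituent begins
    end: int     # how far the dot has progressed in the input

    def is_complete(self):
        return self.dot == len(self.rhs)

    def next_symbol(self):
        """The symbol right after the dot, or None if the rule is complete."""
        return None if self.is_complete() else self.rhs[self.dot]

    def __str__(self):
        parts = list(self.rhs)
        parts.insert(self.dot, "•")
        return f"{self.lhs} -> {' '.join(parts)} [{self.start},{self.end}]"


predicted = State("S", ("VP",), 0, 0, 0)
in_progress = State("NP", ("Det", "Nominal"), 1, 1, 2)
found = State("VP", ("V", "NP"), 2, 0, 3)

for state in (predicted, in_progress, found):
    print(state, "| complete:", state.is_complete())
```

Making the record frozen means a state is never changed in place; advancing the dot has to create a new `State`, and equal states compare equal, which makes duplicate checks straightforward.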
  • Slide 24
  • Graphically
  • Slide 25
  • Earley
      • As with most dynamic programming approaches, the answer is found by looking in the table in the right place
      • In this case, there should be an S state in the final column that spans from 0 to n and is complete (the table has n+1 entries, numbered 0 to n). If that's the case, you're done.
          • S -> α • [0,n]
      • So sweep through the table from 0 to n
          • Predictor: new predicted states are created by states in the current chart
          • Scanner: new incomplete states are created by advancing existing states as new constituents are discovered
          • Completer: new complete states are created in the same way
  • Slide 26
  • Earley, more specifically
      • 1. Predict all the states you can upfront
      • 2. Read a word
          • Extend states based on matches
          • Add new predictions
          • Go to 2
      • 3. Look at the last table entry (the (N+1)th, at position N) to see if you have a winner
  • Slide 27
  • Earley and Left Recursion
      • So Earley solves the left-recursion problem without having to alter the grammar or artificially limit the search
          • Never place a state into the chart that's already there
          • Copy states before advancing them
      • Example grammar:
          • S -> NP VP
          • NP -> NP PP
      • The first rule predicts S -> • NP VP [0,0], which adds NP -> • NP PP [0,0], and prediction stops there since adding any subsequent prediction would be fruitless
      • When a state gets advanced, make a copy and leave the original alone
          • Say we have NP -> • NP PP [0,0]
          • We find an NP from 0 to 2, so we create NP -> NP • PP [0,2]
          • But we leave the original state as is
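The sweep and the two rules above (never add a duplicate state, never modify a state in place) can be put together in a compact recognizer. This is a sketch under stated assumptions rather than the textbook algorithm verbatim: states are plain tuples `(lhs, rhs, dot, origin)` with the end position implied by the chart column, the scanner matches part-of-speech tags directly against a lexicon, the grammar has no empty rules, and the grammar, lexicon, and function name are invented for the example.

```python
# Toy grammar with a left-recursive NP rule listed first on purpose (hypothetical).
GRAMMAR = {
    "S": [("NP", "VP"), ("VP",)],
    "NP": [("NP", "PP"), ("Det", "Nominal"), ("ProperNoun",)],
    "Nominal": [("Noun",)],
    "VP": [("Verb", "NP"), ("Verb",)],
    "PP": [("Prep", "NP")],
}
LEXICON = {
    "book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"},
    "from": {"Prep"}, "houston": {"ProperNoun"},
}


def earley_recognize(words, grammar=GRAMMAR, lexicon=LEXICON, root="S"):
    n = len(words)
    chart = [[] for _ in range(n + 1)]  # N+1 entries for N words

    def add(state, k):
        if state not in chart[k]:  # never place a state that is already there
            chart[k].append(state)

    for rhs in grammar[root]:  # initial predictions for the root symbol
        add((root, rhs, 0, 0), 0)

    for k in range(n + 1):
        i = 0
        while i < len(chart[k]):  # chart[k] may grow while it is processed
            lhs, rhs, dot, origin = chart[k][i]
            i += 1
            if dot == len(rhs):
                # Completer: advance copies of the states waiting for `lhs`.
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add((l2, r2, d2 + 1, o2), k)
            elif rhs[dot] in grammar:
                # Predictor: expand the non-terminal after the dot.
                for expansion in grammar[rhs[dot]]:
                    add((rhs[dot], expansion, 0, k), k)
            elif k < n and rhs[dot] in lexicon.get(words[k], ()):
                # Scanner: the next word has the expected tag; advance a copy.
                add((lhs, rhs, dot + 1, origin), k + 1)

    # Success: a complete root state in the last column that started at 0.
    return any(
        lhs == root and dot == len(rhs) and origin == 0
        for lhs, rhs, dot, origin in chart[n]
    )


print(earley_recognize(["book", "that", "flight"]))
print(earley_recognize(["book", "that", "flight", "from", "houston"]))
print(earley_recognize(["that", "flight"]))
```

Because tuples are immutable, advancing the dot always builds a new tuple and the original state stays in its column. The duplicate check in `add` is what stops the left-recursive rule NP -> NP PP from predicting itself forever, even though it is listed before the base cases.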
  • Slide 28
  • Example
      • Book that flight
      • We should find an S from 0 to 3 that is a completed state
  • Slide 29
  • Example (cont'd)
  • Slide 30
  • Example (cont'd)
  • Slide 31
  • Example (cont'd)