13-DependencyParsing.pptx

RAtna29 · 60 slides · Oct 13, 2024



Slide Content

CSC 594 Topics in AI – Natural Language Processing, Spring 2016/17. 13. Dependency Parsing (some slides adapted from Jurafsky & Martin)

Dependency Parsing. A different way of capturing syntax, with the emphasis on dependencies and grammatical relations: subject, direct object, case, etc. Constituency does not play a major role. More suited to languages with complex morpho-syntax or free word order (e.g. Czech, Japanese). Significant computational advantages: linear time vs. O(N^5) for probabilistic (lexicalized) CFGs. Speech and Language Processing - Jurafsky and Martin

Dependency Parse. "I booked a morning flight." Arcs: (booked, I), (booked, flight), (flight, a), (flight, morning), plus a ROOT arc to booked.

Tree Constraints. Words can only have one head (one incoming arc). Every word has to have a head. The structure is connected and rooted; the connections are dependency relations. The result is a tree: there is a path from the root to each word, and only one path from the root to any word.
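These constraints lend themselves to a mechanical check. A minimal sketch (not from the slides; the function name and the word-based arc encoding are illustrative) that validates a set of (head, dependent) arcs against all three constraints:

```python
def is_valid_tree(words, arcs):
    """Check single-head, every-word-headed, and rooted/acyclic for (head, dep) arcs."""
    heads = {}
    for head, dep in arcs:
        if dep in heads:          # two incoming arcs violates the single-head constraint
            return False
        heads[dep] = head
    if set(heads) != set(words):  # every word needs exactly one head
        return False
    # Follow head pointers: each word must reach ROOT without revisiting a word.
    for w in words:
        seen = set()
        while w != "ROOT":
            if w in seen:
                return False      # cycle: no path back to the root
            seen.add(w)
            w = heads[w]
    return True

words = ["I", "booked", "a", "morning", "flight"]
arcs = [("ROOT", "booked"), ("booked", "I"), ("booked", "flight"),
        ("flight", "a"), ("flight", "morning")]
print(is_valid_tree(words, arcs))  # True
```

A cycle such as [("a", "b"), ("b", "a")] fails the check even though each word has exactly one head, which is why the path-to-root test is needed in addition to the single-head test.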

Dependency Relations (figure: table of dependency relation labels).

Relation to Head-Finding. "Workers dumped sacks into a bin" (figure).

Dependency Relations (figure, continued).

Transition-Based Parsing. Transition-based parsing is a greedy word-by-word approach: a single dependency tree is built up an arc at a time as we move left to right through a sentence, with no backtracking. ML-based classifiers are used to make decisions as we move through the sentence.

Transition-Based Parsing. We can (again) view this as a search through a set of states for a state that contains what we want. In the standard notation a state consists of three elements: a stack representing partially processed words, a list containing the remaining words to be processed, and a set containing the relations discovered so far.

States. The start state looks like this: [[root], [word list], ()]. A valid final state looks like [[root], [], (R)], where R is the set of relations that we've discovered. The [] (an empty list) represents the fact that all the words in the sentence are accounted for.

Example. Start: [[root], [I booked a morning flight], ()]. End: [[root], [], ((booked, I) (booked, flight) (flight, a) (flight, morning))].

Parsing. The parsing problem is how to get from the start state to the final state. To begin, we'll define a set of three basic operators that take a state and produce a new state: Left, Right, and Shift.

Parsing (figure: the transition-based parsing algorithm).

Shift. The Shift operator takes the next word to be processed, pushes it onto the stack, and removes it from the list. So a Shift for our example at the start looks like this: [[root], [I booked a morning flight], ()] → [[root, I], [booked a morning flight], ()].

Left. The Left operator adds the relation (a, b) to the set of relations, where a is the first word on the word list and b is the word at the top of the stack, and then pops the stack. So for our current state: [[root, I], [booked a morning flight], ()] → [[root], [booked a morning flight], ((booked, I))].

Right. The Right operator adds (b, a) to the set of relations, where b and a are the same as before: a is the first word in the word list and b is the top of the stack. It removes the first word from the word list, pops the stack, and places the popped item back at the front of the remaining word list.
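Taken together, Shift, Left, and Right form a small state machine over (stack, word list, relations). The following is an illustrative reconstruction of the operators as described above (function and variable names are my own, not from the slides), with the choice of operator deferred to a caller-supplied decision function:

```python
def parse(words, next_op):
    """Greedy transition parser over the state (stack, word list, relations)."""
    stack, buffer, rels = ["root"], list(words), []
    while buffer:
        op = next_op(stack, buffer, rels)
        if op == "shift":                        # push the next word onto the stack
            stack.append(buffer.pop(0))
        elif op == "left":                       # add (front of list, stack top), pop stack
            rels.append((buffer[0], stack.pop()))
        elif op == "right":                      # add (stack top, front of list)
            head = stack.pop()
            rels.append((head, buffer.pop(0)))
            buffer.insert(0, head)               # popped head returns to the list front
    return rels

# Replay the operator sequence from the worked example in the slides.
ops = iter(["shift", "left", "shift", "shift", "shift",
            "left", "left", "right", "right", "shift"])
rels = parse("I booked a morning flight".split(), lambda *state: next(ops))
print(rels)
# [('booked', 'I'), ('flight', 'morning'), ('flight', 'a'),
#  ('booked', 'flight'), ('root', 'booked')]
```

Note that Right's "put the popped head back on the word list" step is what eventually lets the final Right attach booked to root, leaving the [[root], [], R] final state.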

Example (operation → new state):

(start)  {[root], [I booked a morning flight], []}
Shift    {[root, I], [booked a morning flight], []}
Left     {[root], [booked a morning flight], [(booked, I)]}
Shift    {[root, booked], [a morning flight], [(booked, I)]}
Shift    {[root, booked, a], [morning flight], [(booked, I)]}
Shift    {[root, booked, a, morning], [flight], [(booked, I)]}
Left     {[root, booked, a], [flight], [(booked, I), (flight, morning)]}
Left     {[root, booked], [flight], [(booked, I), (flight, morning), (flight, a)]}
Right    {[root], [booked], [(booked, I), (flight, morning), (flight, a), (booked, flight)]}
Right    {[], [root], [(booked, I), (flight, morning), (flight, a), (booked, flight), (root, booked)]}
Shift    {[root], [], [(booked, I), (flight, morning), (flight, a), (booked, flight), (root, booked)]}

Two Issues. First, we really want labeled relations: subject, direct object, indirect object, etc. Second, how did we know which operator (Left, Right, Shift) to invoke at each step along the way? Since we're not backtracking, one wrong step and we won't get the tree we want. And how do we even know what tree we want to begin with?

Grammatical Relations. To handle the first issue we can just add new transitions: essentially replace Left and Right with {Left, Right} × {all the relations of interest}. Note this isn't going to make the second problem any easier to deal with: standard approaches have roughly 40 kinds of dependency relations.
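The cross product {Left, Right} × relations can be enumerated directly. A small sketch (the four-label relation subset here is illustrative; real inventories have roughly 40 labels):

```python
from itertools import product

relations = ["subj", "obj", "det", "nmod"]       # small illustrative subset
operators = ["shift"] + [f"{direction}_{rel}"
                         for direction, rel in product(["left", "right"], relations)]
print(operators)
# ['shift', 'left_subj', 'left_obj', 'left_det', 'left_nmod',
#  'right_subj', 'right_obj', 'right_det', 'right_nmod']
```

With ~40 relations the inventory grows to roughly 81 operators, which is exactly why the choice problem gets harder rather than easier.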

Example with labeled operators (operation → new state):

(start)    {[root], [I booked a morning flight], []}
Shift      {[root, I], [booked a morning flight], []}
Left_Subj  {[root], [booked a morning flight], [(subj, booked, I)]}

Parsing (figure: the parsing algorithm with labeled operators).

Making Choices, Method 1. Use a set of rules that choose an operator based on features of the current state. As in: if the word at the top of the stack is "I", the rest of the stack is just "root", and the word at the front of the word list is "booked", then invoke Left_Subj.

(start)    {[root], [I booked a morning flight], []}
Shift      {[root, I], [booked a morning flight], []}
Left_Subj  {[root], [booked a morning flight], [(subj, booked, I)]}

Making Choices, Method 1 (generalized). Better: if there's a pronoun at the top of the stack, the rest of the stack is just root, and there's a verb at the front of the word list, then invoke Left_Subj.

Making Choices, Method 2. Use supervised machine learning (ML) to train a classifier to choose among the available operators (Left, Right, Shift), based on features derived from the states (configurations). Then use that classifier to make the right choices during parsing.

Three Problems. To apply ML in situations like this we have three problems: discovering features that are useful indicators of what to do in any situation (characteristics of the state we're in), acquiring the necessary training data (treebanks), and training.

Three Problems: Features. Features are typically described along two dimensions in this style of parsing. First, position in the state (aka configuration): position in the stack, position in the word list, location in the partial tree. Second, attributes of particular locations, or of tuples of locations: the part of speech of the top of the stack, the POS of the third word in the remainder list, lemmas at particular positions, the last three letters, the head word of a word, the number of relations already attached to a word, whether the word already has a SUBJ relation, etc.

Feature Templates. If we think that a feature like "the word at the top of the stack AND the POS tag at the front of the buffer" is useful, we don't want to have to generate features like "if the word at S_1 is 'booked' and the B_1 tag is NN and OP=RIGHT" by hand. Instead we'll use feature templates that generate millions of features from the training data.

Feature Templates. Given a template like <Stack_1.word & Buffer_1.POS, op>, run through the training data and instantiate an instance of this feature for every training instance, with the word, POS, and operator filled in: <Stack_1.booked & Buffer_1.NN, Left>.
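Template instantiation can be sketched as a lookup over slots of the configuration. The slot names below mirror the template notation above; the helper function itself and the (word, POS) state encoding are hypothetical:

```python
def instantiate(template, state, op):
    """Fill a template like ('Stack_1.word', 'Buffer_1.POS') from a configuration."""
    stack, buffer = state  # each element is a (word, POS) pair
    slots = {
        "Stack_1.word": stack[-1][0] if stack else None,
        "Stack_1.POS":  stack[-1][1] if stack else None,
        "Buffer_1.word": buffer[0][0] if buffer else None,
        "Buffer_1.POS":  buffer[0][1] if buffer else None,
    }
    # The instantiated feature pairs the filled-in slot values with the operator.
    return tuple(slots[t] for t in template) + (op,)

state = ([("root", "ROOT"), ("booked", "VBD")], [("flight", "NN")])
feat = instantiate(("Stack_1.word", "Buffer_1.POS"), state, "left")
print(feat)  # ('booked', 'NN', 'left')
```

Running one template over every configuration in a treebank is what produces the millions of concrete features mentioned above.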

Three Problems: Data. Training data: first get a treebank, either directly as a dependency treebank from human annotators, or derived from a phrase-structure treebank. CFG-style treebanks can be automatically stripped down to equivalent dependency trees (since all that's needed are the dependency/head relations).

Three Problems: Training. During training we'll parse with our standard algorithm, asking an oracle which operator to use at any given time. The oracle has access to the correct tree for this sentence. At each stage it chooses, as a case statement: Left if the resulting relation is in the correct tree; Right if the resulting relation is in the correct tree AND all the other outgoing relations associated with the dependent are already in the relation list; otherwise Shift.
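The oracle's case statement can be written down directly. A sketch (the function name is mine), assuming unlabeled gold arcs of the form (head, dependent):

```python
def oracle(stack, buffer, rels, gold):
    """Pick left/right/shift by consulting the correct (gold) tree."""
    if stack and buffer:
        top, front = stack[-1], buffer[0]
        if (front, top) in gold:                 # Left would build a gold arc
            return "left"
        if (top, front) in gold:
            # Right is allowed only once the dependent has already collected
            # all of its own outgoing relations.
            pending = [a for a in gold if a[0] == front and a not in rels]
            if not pending:
                return "right"
    return "shift"

gold = {("root", "booked"), ("booked", "I"), ("booked", "flight"),
        ("flight", "a"), ("flight", "morning")}
print(oracle(["root", "I"], ["booked", "a", "morning", "flight"], [], gold))  # left
```

The pending-dependents check is the crucial subtlety: taking Right too early would pop a word off the list before its own dependents have been attached, and with no backtracking those arcs would be lost.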


Example. Stack: [root, canceled]; word list: [flights, to, Houston]; relations so far: [(canceled, United), (flights, morning), (flights, the)].

Left? No: it would add (flights, canceled), which is not in the correct tree.

Right? No: (canceled, flights) is in the correct tree, but flights still has an unattached dependent ("to Houston").

Shift.

Evaluation. If we have a test set from a treebank and we represent parses as a list of relations, then unlabeled attachment score (UAS) is just the fraction of words that were assigned the right head, and labeled attachment score (LAS) is the fraction of words that were assigned the right head with the right relation. (booked, I) (booked, flight) (flight, a) (flight, morning)

Other Parsing Methods. Of course, there are other ways: parsing approaches that use some form of backtracking, grammar-based approaches using CKY-type algorithms, and graph-based approaches based on weighted spanning trees.

Graph-Based Intuition. "United canceled the flight." Start with a fully connected graph over the words.

Find all the spanning trees.

Score them and take the argmax; the highest-scoring spanning tree is the parse.

Graph-Based Approaches. Use a weighted spanning tree algorithm, where the weights are based on features of the arcs, trained from a (treebank) training set: features of the words and arcs that occur in the training data. Slower than transition-based approaches, but higher accuracy on "longer" dependencies.
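For a toy sentence the find-all-spanning-trees-and-argmax intuition can be implemented by brute force: enumerate every head assignment, keep the ones that form a tree, and score each. The arc weights below are invented for illustration; real graph-based parsers use the Chu-Liu/Edmonds maximum spanning tree algorithm with learned weights rather than enumeration.

```python
from itertools import product

def reaches_root(word, head_of):
    """Follow head pointers; True iff we reach ROOT without looping."""
    seen = set()
    while word != "ROOT":
        if word in seen:
            return False
        seen.add(word)
        word = head_of[word]
    return True

def best_tree(words, score):
    """Brute-force maximum spanning tree: score every valid head assignment."""
    candidates = ["ROOT"] + words
    best, best_arcs = float("-inf"), None
    for heads in product(candidates, repeat=len(words)):
        arcs = [(h, d) for h, d in zip(heads, words) if h != d]
        if len(arcs) < len(words):
            continue                              # some word chose itself as head
        head_of = {d: h for h, d in arcs}
        if any(not reaches_root(w, head_of) for w in words):
            continue                              # contains a cycle: not a tree
        total = sum(score(h, d) for h, d in arcs)
        if total > best:
            best, best_arcs = total, arcs
    return best_arcs

# Illustrative arc weights for "United canceled the flight".
weights = {("ROOT", "canceled"): 10, ("canceled", "United"): 8,
           ("canceled", "flight"): 7, ("flight", "the"): 6}
tree = best_tree(["United", "canceled", "the", "flight"],
                 lambda h, d: weights.get((h, d), 0))
print(sorted(tree))
```

Enumeration is exponential in sentence length, which is why the intuition is only a starting point and efficient spanning-tree algorithms do the real work.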

Transition. First we did words (morphology), then simple sequences of words, then we looked at syntax. Now we're moving on to meaning, where some would say we should have started to begin with.