NLP (DSECL ZG565)
Prof. Vijayalakshmi
BITS Pilani, Pilani Campus
Session 2
Date – 6th Sep 2020
Time – 9 am to 11 am
These slides are prepared by the instructor, with grateful acknowledgement of James
Allen and many others who made their course materials freely available online.
Session #2 – N-gram Language Model
1. Machine Translation:
A machine translation system translates text from one language into another,
for example Chinese to English or German to English.
2. Spell correction
• More variables:
P(A,B,C,D) = P(A)P(B|A)P(C|A,B)P(D|A,B,C)
• The Chain Rule in General
P(x1,x2,x3,…,xn) = P(x1)P(x2|x1)P(x3|x1,x2)…P(xn|x1,x2,…,xn-1)
In a factory there are 100 units of a certain product, 5 of which are defective. We
pick three units from the 100 units at random. What is the probability that none of
them are defective?
Let Ai be the event that the i-th unit picked is not defective, i = 1, 2, 3.
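Writing out the chain rule for this example:
P(A1, A2, A3) = P(A1) P(A2|A1) P(A3|A1,A2) = (95/100) × (94/99) × (93/98) ≈ 0.856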
• Simplifying assumption:
limit the history to a fixed number of words (the last N−1 words)
Andrei Markov
or
P(the | its water is so transparent that) ≈ P(the | transparent that)
Example (unigram model):
P(I want to eat Chinese food) ≈ P(I) P(want) P(to) P(eat) P(Chinese) P(food)
Example (bigram model):
P(I want to eat Chinese food) ≈ P(I|<start>) P(want|I) P(to|want) P(eat|to)
P(Chinese|eat) P(food|Chinese) P(<end>|food)
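A minimal sketch (not the course code) of estimating such bigram probabilities by maximum likelihood from a toy corpus; the second sentence in the corpus is an illustrative assumption.

from collections import Counter

corpus = [["I", "want", "to", "eat", "Chinese", "food"],
          ["I", "want", "Chinese", "food"]]   # toy corpus (second sentence assumed)

unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    tokens = ["<start>"] + sent + ["<end>"]
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def p_bigram(w, prev):
    """MLE estimate: P(w | prev) = count(prev, w) / count(prev)."""
    return bigrams[(prev, w)] / unigrams[prev]

# score a sentence under the bigram model
sent = ["<start>", "I", "want", "Chinese", "food", "<end>"]
prob = 1.0
for prev, w in zip(sent, sent[1:]):
    prob *= p_bigram(w, prev)
print(prob)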
Evaluation
• Extrinsic evaluation
– Time-consuming; can take days or weeks
• So sometimes use intrinsic evaluation: perplexity
– Bad approximation unless the test data looks just like the training data
– So generally only useful in pilot experiments
Perplexity: PP(W) = P(w1 w2 … wN)^(-1/N)
Chain rule: PP(W) = ( ∏i 1 / P(wi | w1 … wi-1) )^(1/N)
For bigrams: PP(W) = ( ∏i 1 / P(wi | wi-1) )^(1/N)
Suppose we have a vocabulary of k words, and our model assigns probability 1/k
to each word. For a sentence consisting of N random words: PP(W) = ((1/k)^N)^(-1/N) = k
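A quick numerical check of this (k and N below are just illustrative values):

k, N = 10, 25
sentence_prob = (1.0 / k) ** N            # P(w1..wN) for N words, each with probability 1/k
perplexity = sentence_prob ** (-1.0 / N)  # PP(W) = P(w1..wN)^(-1/N)
print(perplexity)                         # -> 10.0, i.e. equal to k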
Simple interpolation
P(wn | wn-2 wn-1) ≈ λ1 P(wn | wn-2 wn-1) + λ2 P(wn | wn-1) + λ3 P(wn), with λ1 + λ2 + λ3 = 1
• Efficiency
– Efficient data structures like tries
– Bloom filters: approximate language models
– Store words as indexes, not strings
• Use Huffman coding to fit large numbers of words into two bytes
Stupid backoff:
S(wi | wi-k+1 … wi-1) = count(wi-k+1 … wi) / count(wi-k+1 … wi-1)   if count(wi-k+1 … wi) > 0
                      = 0.4 · S(wi | wi-k+2 … wi-1)                  otherwise
S(wi) = count(wi) / N
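A minimal sketch of this score for the bigram/unigram case only; bigram_counts, unigram_counts and total_tokens are assumed to be precomputed from a training corpus.

def stupid_backoff(word, prev, bigram_counts, unigram_counts, total_tokens, alpha=0.4):
    """S(w | prev): relative frequency if the bigram was seen, else alpha times the unigram score."""
    if bigram_counts.get((prev, word), 0) > 0:
        return bigram_counts[(prev, word)] / unigram_counts[prev]
    return alpha * unigram_counts.get(word, 0) / total_tokens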
• Q&A
• Suggestions / Feedback
Parsing
Recovering syntactic structures requires correct
POS tags
Eg:
Word Disambiguation
– the ability to determine which meaning of a word is activated by the use of the
word in a particular context.
Eg:
• I can hear bass sound.
• He likes to eat grilled bass.
Ambiguity
Eg:
• He will race/VB the car.
• When will the race/NOUN end?
• The boat floated/VBD … down the river sank
Average of ~2 parts of speech for each word
Examples:
– prepositions: on, under, over, …
– particles: up, down, on, off, …
– determiners: a, an, the, …
– pronouns: she, who, I, ..
– conjunctions: and, but, or, …
– auxiliary verbs: can, may, should, …
– numerals: one, two, three, third, …
Background
• From the early 90s
• Developed at the University of Pennsylvania
• (Marcus, Santorini and Marcinkiewicz, 1993)
Size
• 40000 training sentences
• 2400 test sentences
• Genre
• Mostly Wall Street Journal news stories and some spoken conversations
Importance
• Helped launch modern automatic parsing methods.
Statistical approaches
• Use training corpus to compute the probability of a tag
in a context – HMM tagger
Markov Property
• Second-order HMM
– Current state only depends on previous 2 states
• Example
– Trigram model over POS tags
– P(t) = ∏i=1..n P(ti | ti−1, ti−2)
– P(w, t) = ∏i=1..n P(ti | ti−1, ti−2) · P(wi | ti)
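A small sketch of scoring a (word, tag) sequence with this factorisation; trans and emit are assumed, already-estimated probability tables.

def score(words, tags, trans, emit, start=("<s>", "<s>")):
    # P(w, t) = prod_i P(t_i | t_{i-2}, t_{i-1}) * P(w_i | t_i)
    t2, t1 = start
    prob = 1.0
    for w, t in zip(words, tags):
        prob *= trans[(t2, t1, t)] * emit[(t, w)]
        t2, t1 = t1, t
    return prob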
• Sequence is
warm-warm-warm-warm
• And state sequence is
3-3-3-3
• P(3,3,3,3) =
– 0.2 × 0.6 × 0.6 × 0.6 = 0.2 × (0.6)³ = 0.0432
Goal: choose the tag sequence that maximizes PROB(T1,…Tn | w1,…wn)
Bayes Rule:
PROB(A | B) = PROB(B | A) * PROB(A) / PROB(B)
Rewriting:
PROB(T1,…Tn | w1,…wn) = PROB(w1,…wn | T1,…Tn) * PROB(T1,…Tn) / PROB(w1,…wn)
Bigram Probabilities
Tag frequencies:
φ 300   ART 633   N 1102   V 358   P 366

Bigram(Ti, Tj)   Count(i, i+1)   Prob(Tj | Ti)
φ, ART      213      .71 (213/300)
φ, N         87      .29 (87/300)
φ, V         10      .03 (10/300)
ART, N      633      1
N, V        358      .32
N, N        108      .10
N, P        366      .33
V, N        134      .37
V, P        150      .42
V, ART      194      .54
P, ART      226      .62
P, N        140      .38
V, V         30      .08
Sample Lexical Generation Probabilities
Disambiguating "to race tomorrow"
Look Up the Probabilities
• P(NN|TO) = .00047
• P(VB|TO) = .83
• P(race|NN) = .00057
• P(race|VB) = .00012
• P(NR|VB) = .0027
• P(NR|NN) = .0012
• P(VB|TO)P(NR|VB)P(race|VB) = .00000027
• P(NN|TO)P(NR|NN)P(race|NN)=.00000000032
• So we (correctly) choose the verb reading
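Reproducing the two products with the numbers above:

p_verb = 0.83 * 0.0027 * 0.00012      # P(VB|TO) * P(NR|VB) * P(race|VB)
p_noun = 0.00047 * 0.0012 * 0.00057   # P(NN|TO) * P(NR|NN) * P(race|NN)
print(p_verb, p_noun, p_verb > p_noun)   # ~2.7e-07  ~3.2e-10  True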
Example 2: Statistical POS tagging – whole tag sequence
P(DT JJ NN | a smart dog)
= P(DT JJ NN, a smart dog) / P(a smart dog)
= P(DT JJ NN) P(a smart dog | DT JJ NN) / P(a smart dog)
Tag Transition Probability
Initial probability: P(t1)
• Q&A
• Suggestions / Feedback
1. Consider all possible 3-day weather sequences: [H, H, H], [H, H, C], [H, C, H], …
2. For each 3-day weather sequence, compute the probability of the ice-cream
consumption sequence [1, 2, 1].
3. Pick the sequence with the highest probability from step 2.
Not efficient
Viterbi algorithm
Problem3
Find :
The start probabilities
The transition probabilities
Emission probabilities
Forward–backward algorithm
Known: the transition probabilities and the emission probabilities

Transition probabilities:
      H     C
H    0.7   0.3
C    0.4   0.6

Emission probabilities:
      1     2     3
H    0.2   0.4   0.4
C    0.5   0.4   0.1
Find the best state sequence for the observation sequence 1 3 1
P(a, b) = P(a | b) P(b)
P(1 3 1, H C H) = P(1|H) P(3|C) P(1|H) × P(H|start) P(C|H) P(H|C)
There are N^T = 2^3 = 8 possible state sequences. Instead of scoring all of them, the
Viterbi trellis keeps, for each state at each time step, only the probability of the
best path into that state, e.g.
v1(C) = P(1|C) × P(C|start)
v2(C) = max[ v1(H) × P(C|H), v1(C) × P(C|C) ] × P(3|C)
(In the slide's trellis: V2 = 0.048 and
V3 = max(0.048 × 0.06 = 0.00288, 0.0448 × 0.04 = 0.001792) = 0.00288.)
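A minimal Viterbi sketch for this ice-cream HMM; the transition and emission tables are those given above, while the start distribution P(H) = 0.8, P(C) = 0.2 is an assumption chosen to match the worked values.

trans = {("H", "H"): 0.7, ("H", "C"): 0.3, ("C", "H"): 0.4, ("C", "C"): 0.6}
emit  = {("H", 1): 0.2, ("H", 2): 0.4, ("H", 3): 0.4,
         ("C", 1): 0.5, ("C", 2): 0.4, ("C", 3): 0.1}
start = {"H": 0.8, "C": 0.2}          # assumed start distribution
states = ["H", "C"]

def viterbi(obs):
    # v[s]: probability of the best path ending in state s; back[t][s]: best predecessor
    v = {s: start[s] * emit[(s, obs[0])] for s in states}
    back = []
    for o in obs[1:]:
        prev_v, v, ptr = v, {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: prev_v[p] * trans[(p, s)])
            v[s] = prev_v[best_prev] * trans[(best_prev, s)] * emit[(s, o)]
            ptr[s] = best_prev
        back.append(ptr)
    last = max(states, key=lambda s: v[s])     # best final state
    path = [last]
    for ptr in reversed(back):                 # follow the back-pointers
        path.append(ptr[path[-1]])
    return list(reversed(path)), v[last]

print(viterbi([1, 3, 1]))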
Example: Rainy and Sunny Days
Your colleague in another city either walks to work or drives every
day and his decision is usually based on the weather
Given daily emails that include whether he has walked or driven to
work, you want to guess the most likely sequence of whether the
days were rainy or sunny
Two hidden states: rainy and sunny
Two observables: walking and driving
Assume equal likelihood of the first day being rainy or sunny
Transitional probabilities
rainy given yesterday was (rainy = .7, sunny = .3)
sunny given yesterday was (rainy = .4, sunny = .6)
Output (emission) probabilities
sunny given walking = .1, driving = .9
rainy given walking = .8, driving = .2
Given that your colleague walked, drove, walked, what is the most
likely sequence of days?
(Viterbi trellis for the observation sequence walk, drive, walk:
v1(rainy) = 0.8 × 0.5 = 0.40, v1(sunny) = 0.1 × 0.5 = 0.05;
at each later step, v(state) = P(observation | state) × max over previous states of
[ v(previous) × P(state | previous) ];
the trellis gives V2 = 0.144 and
V3 = max(0.056 × 0.24 = 0.0134, 0.144 × 0.06 = 0.008) = 0.0134.)
P(Verb|Det) = 0.00001, P(Noun|Det) = 0.5, P(Adj|Det) = 0.3,
P(Noun|Noun) = 0.2, P(Adj|Noun) = 0.002, P(Noun|Adj) = 0.2,
P(Noun|Verb) = 0.3, P(Verb|Noun) = 0.3, P(Verb|Adj) = 0.001, P(Verb|Verb) = 0.1
Assume that all the tags have the same probability at the beginning of the sentence.
Using the Viterbi algorithm, find the best tag sequence.
Forward-Backward
Baum-Welch = Forward-Backward Algorithm (Baum
1972)
Is a special case of the EM or Expectation-Maximization
algorithm
The algorithm will let us learn the transition probabilities
A= {aij} and the emission probabilities B={bi(ot)} of the
HMM
Bidirectionality
• Unknown words
• First order HMM
Eg:
Is clearly marked
He clearly marked
• In HMM
• In MEMM
Formula
Session 5-Parsing
Date – 9 May 2021
Time – 9 to 11 am
These slides are prepared by the instructor, with grateful acknowledgement of James
Allen and many others who made their course materials freely available online.
POS tagging
Approaches used in POS tagging
HMM
HMM tagger
Viterbi algorithm
Forward algorithm
Maximum entropy model
Session 5: Parsing
• Grammars and Sentence Structure
• What Makes a Good Grammar
• Parsing
• A Top-Down Parser
• A Bottom-Up Chart Parser
• Top-Down Chart Parsing
• Finite State Models and Morphological Parsing.
Sentiment Analysis:
"I like Frozen"
"I do not like Frozen“
"I like frozen yogurt"
Relation Extraction:
"Rome is the capital of Italy and the region of Lazio".
• Not only are there a large number of words, but each word may
combine with affixes to produce additional related words.
• One way to address this problem is to preprocess the input
sentence into a sequence of morphemes.
• A word may consist of a single morpheme, but often a word consists
of a root form plus an affix.
Morphological analysis
Stemming in IR
• over-stemming
• under-stemming
Porter
No lexicon needed
Basically a set of staged sets of rewrite
rules that strip suffixes
Handles both inflectional and
derivational suffixes
Doesn’t guarantee that the resulting stem
is really a stem (see first bullet)
Lack of guarantee doesn’t matter for IR
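A quick illustration with NLTK's Porter stemmer (requires the nltk package); note that the output need not be a real word:

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for w in ["caresses", "ponies", "relational", "university"]:
    print(w, "->", stemmer.stem(w))   # e.g. "ponies" -> "poni"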
• CKY Parsing
• Probabilistic Context-Free Grammars
• PCFG for disambiguation
• Probabilistic CKY Parsing of PCFGs
• Ways to Learn PCFG Rule Probabilities
• Probabilistic Lexicalized CFGs
• Evaluating Parsers
• Problems with PCFGs
Parsing algorithm
– Solution:
• Introduce a dummy non-terminal to
cover the original terminal
– E.g. Det → the
• Re-write the original rule
– Solution:
• Find all rules that have the form Nominal → ...
– Nominal → Noun PP
– Nominal → Det Noun
• Re-write the above rule several times to eliminate
the intermediate non-terminal:
– NP → Noun PP
– NP → Det Noun
– Note that this makes our grammar “flatter”
Converting a CFG to CNF
Solution:
– Introduce new non-terminals to spread the sequence on the
RHS over more than 1 rule.
• Nominal → Noun PP
• NP → Det Nominal
(CKY chart for "The flight includes a meal." being filled cell by cell:
first [0,1] = Det, then [1,2] = N, then [0,2] = NP.)
Lexical lookup
• Matches Det → the
The flight includes a meal.
Lexical lookup
• Matches N → flight. The flight includes a meal.
Syntactic lookup:
• look backwards and see if there
is any rule that will cover what
we’ve done so far. The flight includes a meal.
Lexical lookup
• Matches V → includes
The flight includes a meal.
Syntactic lookup
• There are no rules in our
grammar that will cover
Det, NP, V
The flight includes a meal.
(chart so far: [2,3] = V, [3,4] = Det)
Lexical lookup
• Matches Det → a
The flight includes a meal.
(chart so far: [2,3] = V, [3,4] = Det, [4,5] = N)
Lexical lookup
• Matches N → meal. The flight includes a meal.
(chart so far: [2,3] = V, [3,4] = Det, [3,5] = NP, [4,5] = N)
Syntactic lookup
• We find that we have
NP → Det N
The flight includes a meal.
(chart so far: [2,3] = V, [2,5] = VP, [3,4] = Det, [3,5] = NP, [4,5] = N)
Syntactic lookup
• We find that we have
VP → V NP
Syntactic lookup
• We find that we have
S → NP VP
The flight includes a meal.
Final chart: [0,1] = Det, [0,2] = NP, [0,5] = S; [1,2] = N; [2,3] = V, [2,5] = VP;
[3,4] = Det, [3,5] = NP; [4,5] = N.
NB: This algorithm always fills the top “triangle” of the table!
Example: Probability of a Derivation Tree
An example (Penn Treebank-style) parse tree for the sentence:
"Canadian Utilities had 1988 revenue of C$1.16 billion, mainly from its natural gas and
electric utility businesses in Alberta, where the company serves about 800,000 customers."
P(the | ART) = .54      P(a | ART) = .360
P(flies | N) = .025     P(a | N) = .001
P(flies | V) = .076     P(flower | N) = .063
P(like | V) = .1        P(flower | V) = .05
P(like | P) = .068      P(birds | N) = .076
P(like | N) = .012
The PCFG
• Below is a probabilistic CFG (PCFG) with
probabilities derived from analyzing a
parsed version of Allen's corpus.
Rule              Count for LHS   Count for Rule   PROB
1. S → NP VP          300             300           1
2. VP → V             300             116          .386
3. VP → V NP          300             118          .393
4. VP → V NP PP       300              66          .22
5. NP → NP PP        1032             241          .23
6. NP → N N          1032              92          .09
7. NP → N            1032             141          .14
8. NP → ART N        1032             558          .54
9. PP → P NP          307             307           1
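A one-line sketch of how these probabilities are estimated, P(rule) = count(rule) / count(LHS); the dictionaries below just reproduce a few rows of the table:

rule_counts = {("S", ("NP", "VP")): 300, ("VP", ("V",)): 116, ("VP", ("V", "NP")): 118}
lhs_counts = {"S": 300, "VP": 300}
probs = {rule: c / lhs_counts[rule[0]] for rule, c in rule_counts.items()}
print(probs)   # VP -> V gets 116/300 = 0.386..., matching the table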
Parsing with a PCFG
• Using the lexical probabilities, we can
derive probabilities that the constituent NP
generates a sequence like a flower. Two
rules could generate the string of words:
NP → ART N (rule 8): a flower        NP → N N (rule 6): a flower
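Filling in the numbers from the lexical table and the PCFG above:
P(a flower | NP) = P(Rule 8) × P(a|ART) × P(flower|N) + P(Rule 6) × P(a|N) × P(flower|N)
                 = .54 × .360 × .063 + .09 × .001 × .063 ≈ .0122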
Three Possible Trees for an S
(Three parse trees for "a flower wilted", using the rule numbers above: one with
NP → ART N (8) and VP → V (2); one with NP → N (7) and VP → V NP (3); and one with
NP → N N (6) and VP → V (2).)
Parsing with a PCFG
• The probability of the grammar generating the sentence "a flower
wilted":
P(a flower wilted|S) = P(R1|S) × P(a flower|NP) ×
P(wilted|VP) + P(R1|S) × P(a|NP) × P(flower wilted|VP)
Using this approach, the probability that a given
sentence will be generated by the grammar can
be efficiently computed.
• It only requires some way of recording the value
of each constituent between each two possible
positions. The requirement can be filled by a
packed chart structure.
Uses of probabilities in parsing
Disambiguation: given n legal parses of a string, which is the most
likely?
– e.g. PP-attachment ambiguity can be resolved this way
Notation:
– G = a PCFG
– s = a sentence
– t = a particular tree under our grammar
• t consists of several nodes n
• each node is generated by applying some rule r
Note that:
– A tree can have multiple derivations
• (different sequences of rule applications could give rise
to the same tree)
– But the probability of the tree remains the same
• (it’s the same probabilities being multiplied)
– We usually speak as if a tree has only one derivation, called the canonical derivation
Picking the best parse in a PCFG
A sentence will usually have several parses
– we usually want them ranked, or only want the n best
parses
– we need to focus on P(t|s,G)
• probability of a parse, given our sentence and our
grammar
Probability of a sentence
Given a probabilistic context-free grammar G, we can compute the
probability of a sentence (as opposed to a tree).
Observe that:
– As far as our grammar is concerned, a sentence is only a
sentence if it can be recognised by the grammar (it is “legal”)
– There can be multiple parse trees for a sentence.
• Many trees whose yield is the sentence
– The probability of the sentence is the sum of all the probabilities
of the various trees that yield the sentence.
S → NP VP [.80]
NP → Det N [.30]
VP → V NP [.20]
V → includes [.05]
Det → the [.4]
Det → a [.4]
N → meal [.01]
N → flight [.02]
The flight includes a meal.
Probabilistic CYK: syntactic step
Chart after the lexical step and the first syntactic step:
[0,1] = Det (.4), [1,2] = N (.02), [0,2] = NP (.0024)
The flight includes a meal.
Note: probability of NP in [0,2] = P(Det → the) × P(N → flight) × P(NP → Det N)
= .4 × .02 × .30 = .0024
Continuing over the rest of the sentence:
[2,3] = V (.05), [3,4] = Det (.4), [4,5] = N (.01), [3,5] = NP (.001)
Final syntactic steps:
[2,5] = VP (.00001 = .20 × .05 × .001), [0,5] = S (.0000000192 = .80 × .0024 × .00001)
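A minimal probabilistic CKY sketch (not the course's implementation) for the toy grammar above; it reproduces the chart values shown, up to the slide's rounding of the NP over "a meal":

lexicon = {("Det", "the"): 0.4, ("Det", "a"): 0.4, ("N", "flight"): 0.02,
           ("N", "meal"): 0.01, ("V", "includes"): 0.05}
rules = {("NP", ("Det", "N")): 0.30, ("VP", ("V", "NP")): 0.20, ("S", ("NP", "VP")): 0.80}

def pcky(words):
    n = len(words)
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]   # chart[i][j]: {label: best prob}
    for i, w in enumerate(words):                                # lexical step
        for (tag, word), p in lexicon.items():
            if word == w:
                chart[i][i + 1][tag] = p
    for span in range(2, n + 1):                                 # syntactic steps
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (lhs, (b, c)), p in rules.items():
                    if b in chart[i][k] and c in chart[k][j]:
                        prob = p * chart[i][k][b] * chart[k][j][c]
                        if prob > chart[i][j].get(lhs, 0.0):
                            chart[i][j][lhs] = prob   # a full parser would also keep back-pointers
    return chart

chart = pcky("the flight includes a meal".split())
print(chart[0][2])   # {'NP': ~0.0024}
print(chart[0][5])   # {'S': ~2.3e-08} (the slide shows 1.92e-08 because it rounds NP over "a meal" to .001)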
Example:
– P(NP → Pro) is independent of where the NP is in the
sentence
– but we know that NP → Pro is much more likely in
subject position
– Francis et al. (1999), using the Switchboard corpus:
• 91% of subjects are pronouns;
• only 34% of objects are pronouns
Lexicalised PCFGs
Attempt to weaken the lexical independence
assumption.
Lexicalised PCFGs: Matt walks
Makes probabilities partly dependent on lexical content:
P(VP → VBD | VP) becomes P(VP → VBD | VP, h(VP) = walks)
Example tree: S(walks) → NP(Matt) VP(walks); Matt walks
• https://www.youtube.com/watch?v=Z6GsoBA-09k&list=PLQiyVNMpDLKnZYBTUOlSI9mi9wAErFtFm&index=62
• https://lost-contact.mit.edu/afs/cs.pitt.edu/projects/nltk/docs/tutorial/pcfg/nochunks.html
• http://www.nltk.org/howto/grammar.html
Q&A
Suggestions / Feedback
• CKY parsing
• PCFG parsing
• Problems with PCFG
• Lexicalized PCFGS
Outline
Motivation
Two types of parsing
Dependency parsing
Phrase structure parsing
Dependency structure and Dependency grammar
Dependency Relation
Universal Dependencies
Method of Dependency Parsing
Dynamic programming
Graph algorithms
Constraint satisfaction
Deterministic Parsing
Transition based dependency parsing
Graph based dependency Parsing
Evaluation
Interpreting Language is Hard!
I saw a girl with a telescope
Two Types of Parsing
● Dependency: focuses on relations between words
Dependency Grammar and
Dependency Structure
(Dependency graph for the example sentence "Bills on ports and immigration were
submitted by Senator Brownback, Republican of Kansas", with relations such as prep,
pobj, nn, appos, cc and conj.)
Relation between phrase structure and
dependency structure
Dependency graph
Formal conditions on dependency graph
Universal dependencies
http://universaldependencies.org/
• Annotated treebanks in many languages
• Uniform annotation scheme across all languages:
• Universal POS tags
• Universal dependency relations
Dependency Relations
Example Dependency
Parse
Method of Dependency Parsing
Methods of Dependency Parsing
Deterministic parsing
Transition based systems for
Dependency parsing
Arc-eager parsing (MaltParser)
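The figures on these slides are not reproduced here; as a rough illustration only (not the course's parser), here is a minimal sketch of the four arc-eager transitions in the MaltParser style, with a hypothetical toy sentence:

class Config:
    def __init__(self, words):
        self.stack = [0]                                  # 0 is the artificial ROOT
        self.buffer = list(range(1, len(words) + 1))      # token indices
        self.arcs = set()                                 # (head, dependent) pairs
        self.has_head = set()

def shift(c):
    c.stack.append(c.buffer.pop(0))

def left_arc(c):       # front of buffer becomes head of stack top, which is popped
    dep = c.stack.pop()
    c.arcs.add((c.buffer[0], dep)); c.has_head.add(dep)

def right_arc(c):      # stack top becomes head of buffer front, which is pushed
    dep = c.buffer.pop(0)
    c.arcs.add((c.stack[-1], dep)); c.has_head.add(dep)
    c.stack.append(dep)

def reduce(c):         # only legal if the stack top already has a head
    assert c.stack[-1] in c.has_head
    c.stack.pop()

# "He sees her": SHIFT, LEFT-ARC (sees -> He), RIGHT-ARC (ROOT -> sees), RIGHT-ARC (sees -> her)
c = Config(["He", "sees", "her"])
shift(c); left_arc(c); right_arc(c); right_arc(c)
print(sorted(c.arcs))     # [(0, 2), (2, 1), (2, 3)] -- "sees" is the root
# In a trained parser, a classifier (guided by an oracle during training)
# chooses the transition at each step from features of the configuration.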
Example1
Example2
Creating an Oracle
How does the classifier learn?
Feature Models
Feature template:
Feature examples
Classifier at runtime
Training Data
Generating training data example
Standard Oracle for Arc Eager
parsing
Online learning with an oracle
Example
Graph-based
parsing
Graph concepts
refresher
Multi Digraph
Directed
Spanning Trees
Weighted Spanning tree
MST
Finding MST
Chu-Liu-Edmonds algorithm
Chu-Liu-Edmonds Algorithm (2/12)
(Score graph Gx over {root, John, saw, Mary}, with weighted candidate arcs between the
words and from the root; the arc weights shown in the figure are 9, 10, 30, 9, 0, 20, 30, 11 and 3.)
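As a sketch of finding the highest-scoring tree with Chu-Liu/Edmonds, assuming networkx's maximum_spanning_arborescence is available; the edge scores follow the classic root/John/saw/Mary example and are only assumed to match the figure above:

import networkx as nx

G = nx.DiGraph()
scores = {("root", "John"): 9, ("root", "saw"): 10, ("root", "Mary"): 9,
          ("saw", "John"): 30, ("saw", "Mary"): 30, ("John", "saw"): 20,
          ("Mary", "saw"): 0, ("Mary", "John"): 11, ("John", "Mary"): 3}
for (head, dep), s in scores.items():
    G.add_edge(head, dep, weight=s)

tree = nx.maximum_spanning_arborescence(G)       # Chu-Liu/Edmonds
print(sorted(tree.edges()))   # expected: [('root', 'saw'), ('saw', 'John'), ('saw', 'Mary')]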
Chu-Liu-Edmonds Example
Arc features
Learning the parameters
Inference based learning
EXAMPLE
Evaluation
Gold                               Parsed
1  2  She      nsubj               1  2  She      nsubj
2  0  saw      root                2  0  saw      root
3  5  the      det                 3  4  the      det
4  5  video    nn                  4  5  video    nsubj
5  2  lecture  dobj                5  2  lecture  ccomp
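Computing attachment scores for this example (a minimal sketch; each token maps to its (head, label) pair):

gold   = {1: (2, "nsubj"), 2: (0, "root"), 3: (5, "det"), 4: (5, "nn"),    5: (2, "dobj")}
parsed = {1: (2, "nsubj"), 2: (0, "root"), 3: (4, "det"), 4: (5, "nsubj"), 5: (2, "ccomp")}

uas = sum(gold[i][0] == parsed[i][0] for i in gold) / len(gold)   # head correct
las = sum(gold[i] == parsed[i] for i in gold) / len(gold)         # head and label correct
print(f"UAS = {uas:.0%}, LAS = {las:.0%}")                        # UAS = 80%, LAS = 40%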
References
Any Questions?
Thank you