Any-Language Frame-Semantic Parsing
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 2062–2066,
Lisbon, Portugal, 17–21 September 2015. © 2015 Association for Computational Linguistics.
Figure 1: Frame semantic annotation from the German Wikipedia data (Women’s Rights)
when tagged with one of the top k part-of-speech tags most commonly seen as targets in the training set. The k parameter is optimized to maximize F1 on our development language, Spanish, where we found k = 7.[3] Surviving candidates are then translated into English by mapping the words into multi-lingual BabelNet synsets, which represent sets of words with similar meaning across languages. All English words in the BabelNet synsets are considered possible translations. If any of the translations are potential targets in FrameNet 1.5, the current word is identified as a frame-evoking word.

3.2 Frame identification

A target word is, on average, ambiguous between three frames. We use a multinomial log-linear classifier[4] (with default parameters) to decide which of the possible frames evoked by the target word fits the context best. Our feature representation replicates that of Das et al. (2014) as far as possible, considering the multilingual setting, where lexical features cannot be used directly. To compensate for the lack of lexical features, we introduce two groups of language-independent features that rely on multilingual word embeddings. One feature group uses the embedding of the target word directly, while the other is based on distance measures between the target word and the set of English words used as targets for a possible frame. We measure the minimum and mean distance (in embedding space) from the target word to the set of English target words, as well as the distances to each word individually.

Several of the features in the original representation are built on top of automatic POS annotation and syntactic parses. We use the Universal Dependencies v1.1 treebanks for the languages in our data to train part-of-speech taggers (TreeTagger[5]) and a dependency parser (TurboParser[6]) to generate the syntactic features. In contrast to Das et al. (2014), we use dependency subtrees instead of spans.

3.3 Argument identification

A frame contains a number of named arguments that may or may not be expressed in a given sentence. Argument identification is concerned with assigning frame arguments to spans of words in the sentence. While this task can benefit from information on the joint assignment of arguments, Das et al. (2014) report an improvement of less than 1% in F1 from using beam search to approximate a globally optimal configuration for argument identification. To simplify our system, we take all argument-identification decisions independently. We use a single classifier for argument identification, computing the most probable argument for each frame element. Each word index is associated with a span by the transitive closure of its syntactic dependencies (i.e., its subtree). Our greedy approach to argument identification thus amounts to scoring the n + 1 possible realisations of an argument for an n-length sentence (i.e., the subtrees plus the empty argument), selecting the highest-scoring subtree for each argument type allowed by the frame.

As the training data contains very few examples of each frame or role (e.g., Buyer in the frame Commerce_scenario), we enable sharing of features for frame arguments that have the same name. The assumption is that arguments with identical names have similar semantic properties across frames; that is, the argument Perpetrator, for example, is similar for the frames Arson and Theft.

The scores are the confidences of a binary classifier trained on <frame, argument, subtree> tuples. Positive examples are the observed arguments. We use the remaining n incorrect subtrees for a given <frame, argument> pair to generate negative training examples. A single binary classification model is trained for the whole data set.

As with frame identification, our features are similar to those of Das et al. (2014), with a few exceptions and additions. We use dependency subtrees instead of spans and replace all lexical features (which do not transfer cross-lingually) with features based on the interlingual word embeddings from Søgaard et al. (2015a). We use the embeddings to find the 20 most similar words in the training data and use these words to generate lexical features that match the source-language training data. Each feature is weighted by its cosine similarity with the target-language word.

[3] The white-listed POS are nouns, verbs, adjectives, proper nouns, adverbs, and determiners.
[4] http://hunch.net/~vw/
[5] http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/
[6] http://www.cs.cmu.edu/~ark/TurboParser/
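The greedy argument identification described in Section 3.3 can be sketched as follows. This is a minimal illustrative reconstruction, not the authors' code: the subtree spans, argument types, and the `score` function (standing in for the binary classifier's confidence) are all hypothetical.

```python
# Greedy argument identification: for an n-word sentence there are n
# candidate dependency subtrees plus the empty argument; each argument
# type allowed by the frame independently takes the highest-scoring
# candidate.

def identify_arguments(subtrees, argument_types, score):
    """Pick, for each argument type, the best of the n subtrees + empty (None)."""
    candidates = subtrees + [None]  # None = argument not expressed
    assignment = {}
    for arg in argument_types:
        assignment[arg] = max(candidates, key=lambda span: score(arg, span))
    return assignment

# Toy usage with a made-up scorer.
subtrees = [(0, 1), (2, 4), (5, 5)]  # token spans of dependency subtrees
types = ["Buyer", "Goods"]
toy_scores = {("Buyer", (0, 1)): 0.9, ("Goods", (2, 4)): 0.7}
score = lambda arg, span: toy_scores.get((arg, span), 0.1)
print(identify_arguments(subtrees, types, score))
# {'Buyer': (0, 1), 'Goods': (2, 4)}
```

Because each argument type is decided independently, the sketch mirrors the paper's choice to forgo joint (beam-search) decoding.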
Target identification

                     BG    DA    DE    EL    EN    ES    FR    IT    SV    Avg.
F1         System   85.5  73.6  58.4  52.9  80.2  89.1  66.1  69.0  72.8  72.0
           Baseline 44.0  56.8  27.2  46.1  78.8  45.9  42.8  47.7  41.4  47.9
Precision  System   89.2  70.9  66.2  36.4  96.3  84.9  51.8  53.4  63.4  67.0
           Baseline 56.8  65.0  48.7  43.2  88.0  75.2  55.0  55.3  47.3  59.4
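As a sanity check, the 46% error reduction reported for target identification can be recomputed from the average F1 scores in the table above (72.0 for the system, 47.9 for the baseline), treating 100 - F1 as the error:

```python
# Relative error reduction over the baseline, where the "error" of an
# F1 score (on a 0-100 scale) is 100 - F1.
def error_reduction(system_f1, baseline_f1):
    return (system_f1 - baseline_f1) / (100.0 - baseline_f1)

print(round(100 * error_reduction(72.0, 47.9)))  # -> 46
```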
Baseline  Our approach to multi-lingual frame-semantic parsing extends Das et al. (2014) to cross-lingual learning using the interlingual embeddings from Søgaard et al. (2015a). Our baseline is a more direct application of the SEMAFOR system[7] (Das et al., 2014): we translate target-language text to English using word-to-word translations and project the annotation back. For word-to-word translation we use Wiktionary bilingual dictionaries (Ács, 2014), and we use frequency counts from UkWac[8] to disambiguate words with multiple translations, preferring the most common one. The baseline and our system both use the training data supplied with FrameNet for learning.

4 Results

Consider first the target identification results in Table 2. We observe that using BabelNet and our re-implementation of Das et al. (2014) performs considerably better than running SEMAFOR on Wiktionary word-by-word translations.

Our frame identification results are also presented in Table 2. Our system is better in six out of nine cases, whereas the most-frequent-sense baseline is best in two. It is unsurprising that English fares best in this setup, because it does not undergo the word-to-word translation of the other data sets.

Argument identification is a harder task, and scores are generally lower; see the lower part of Table 2. Also, note that errors percolate: if we do not identify a target, or mislabel a frame, we can no longer retrieve the correct arguments. Nevertheless, we observe that we are better than running SEMAFOR on word-by-word translations in eight out of nine languages, that is, all except English.

Generally, we obtain error reductions over our baseline of 46% for target identification, 37% for frame identification, and 14% for argument identification. For English, we are only 2% (absolute) below inter-annotator agreement (IAA) for target identification, but about 40% below IAA for frame and argument identification. For Danish, the gap is smaller.

If we compare performance on the Wikipedia and Twitter datasets, we see that target identification and frame identification scores are generally higher for Wikipedia, while argument identification scores are higher for Twitter. While Wikipedia is generally more similar to the newswire/balanced corpus in FrameNet 1.5, sentences are shorter in tweets, making it easier to identify the correct arguments.

[7] http://www.ark.cs.cmu.edu/SEMAFOR/
[8] http://wacky.sslmit.unibo.it/
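The baseline's word-to-word translation step described above reduces to a dictionary lookup followed by a frequency-based choice. A minimal sketch, with toy stand-ins for the Wiktionary dictionary and the UkWac frequency counts:

```python
# Word-to-word translation with frequency-based disambiguation: of all
# English translation candidates for a source word, keep the one that is
# most frequent in an English corpus. The dictionaries below are
# hypothetical examples, not real Wiktionary/UkWac data.

def translate(word, bilingual_dict, freq):
    candidates = bilingual_dict.get(word, [])
    if not candidates:
        return None  # untranslatable word: no candidate to project
    return max(candidates, key=lambda w: freq.get(w, 0))

bilingual = {"Bank": ["bank", "bench"]}       # toy German-English entry
counts = {"bank": 120_000, "bench": 30_000}   # toy corpus frequencies
print(translate("Bank", bilingual, counts))   # -> bank
```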
5 Conclusions
We presented a multi-lingual frame-annotated corpus covering nine languages in two domains. With this corpus we performed target, frame, and argument identification experiments, outperforming a baseline that runs SEMAFOR on word-to-word translations. Our approach is a delexicalized version of Das et al. (2014) with a simpler decoding strategy and, crucially, multilingual word embeddings, which together achieve any-language frame-semantic parsing. Over the baseline, we obtain error reductions of 46% for target identification, 37% for frame identification, and 14% for argument identification.
References
Judit Ács. 2014. Pivot-based multilingual dictionary building using Wiktionary. In LREC.

Dipanjan Das, Desai Chen, Andre Martins, Nathan Schneider, and Noah Smith. 2014. Frame-semantic parsing. Computational Linguistics, 40(1):9–56.

Karl Moritz Hermann, Dipanjan Das, Jason Weston, and Kuzman Ganchev. 2014. Semantic frame identification with distributed word representations. In ACL.

Richard Johansson and Pierre Nugues. 2007. Extended constituent-to-dependency conversion for English. In NODALIDA.

Dan Shen and Mirella Lapata. 2007. Using semantic roles to improve question answering. In EMNLP.