Veselin Stoyanov · Tome AI · Verified email at fb.com · Cited by 57,096
RoBERTa: A robustly optimized BERT pretraining approach
Language model pretraining has led to significant performance gains but careful comparison
between different approaches is challenging. Training is computationally expensive, often …
Unsupervised cross-lingual representation learning at scale
This paper shows that pretraining multilingual language models at scale leads to significant
performance gains for a wide range of cross-lingual transfer tasks. We train a Transformer-…
SemEval-2016 task 4: Sentiment analysis in Twitter
This paper discusses the fourth year of the "Sentiment Analysis in Twitter Task". SemEval-2016
Task 4 comprises five subtasks, three of which represent a significant departure from …
XNLI: Evaluating cross-lingual sentence representations
State-of-the-art natural language processing systems rely on supervision in the form of
annotated data to learn competent models. These models are generally trained on data in a …
Lever: Learning to verify language-to-code generation with execution
The advent of large language models trained on code (code LLMs) has led to significant
progress in language-to-code generation. State-of-the-art approaches in this area combine LLM …
Pretrained language models for biomedical and clinical tasks: understanding and extending the state-of-the-art
A large array of pretrained models are available to the biomedical NLP (BioNLP) community.
Finding the best model for a particular task can be difficult and time-consuming. For many …
Emerging cross-lingual structure in pretrained language models
We study the problem of multilingual masked language modeling, i.e., the training of a single
model on concatenated text from multiple languages, and present a detailed study of several …
Conundrums in noun phrase coreference resolution: Making sense of the state-of-the-art
V Stoyanov, N Gilbert, C Cardie… - … of the ACL and the 4th …, 2009 - aclanthology.org
We aim to shed light on the state-of-the-art in NP coreference resolution by teasing apart the
differences in the MUC and ACE task definitions, the assumptions made in evaluation …
Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure
V Stoyanov, A Ropson, J Eisner - Proceedings of the …, 2011 - proceedings.mlr.press
Graphical models are often used “inappropriately,” with approximations in the topology,
inference, and prediction. Yet it is still common to train their parameters to approximately …
Topic identification for fine-grained opinion analysis
V Stoyanov, C Cardie - … of the 22nd International Conference on …, 2008 - aclanthology.org
Within the area of general-purpose fine-grained subjectivity analysis, opinion topic identification
has, to date, received little attention due to both the difficulty of the task and the lack of …