User profiles for Sneha Reddy Kudugunta
Sneha Kudugunta, Google DeepMind. Verified email at google.com. Cited by 3322.
Investigating multilingual NMT representations at scale
Multilingual Neural Machine Translation (NMT) models have yielded large empirical
success in transfer learning settings. However, these black-box representations are poorly …
Beyond distillation: Task-level mixture-of-experts for efficient inference
Sparse Mixture-of-Experts (MoE) has been a successful approach for scaling multilingual
translation models to billions of parameters without a proportional increase in training …
MiTTenS: A Dataset for Evaluating Gender Mistranslation
K Robinson, S Kudugunta, R Stella… - Proceedings of the …, 2024 - aclanthology.org
Translation systems, including foundation models capable of translation, can produce errors
that result in gender mistranslation, and such errors can be especially harmful. To measure …
Exploring routing strategies for multilingual mixture-of-experts models
Sparsely-Gated Mixture-of-Experts (MoE) has been a successful approach for scaling multilingual
translation models to billions of parameters without a proportional increase in training …
Buffet: Benchmarking large language models for few-shot cross-lingual transfer
Despite remarkable advancements in few-shot generalization in natural language processing,
most models are developed and evaluated primarily in English. To facilitate research on …
Gradient vaccine: Investigating and improving multi-task optimization in massively multilingual models
Massively multilingual models subsuming tens or even hundreds of languages pose great
challenges to multi-task optimization. While it is a common practice to apply a language-…
BERT is not an interlingua and the bias of tokenization
Multilingual transfer learning can benefit both high- and low-resource languages, but the
source of these improvements is not well understood. Canonical Correlation Analysis (CCA) of …