User profiles for Sneha Reddy Kudugunta

Sneha Kudugunta

Google DeepMind
Verified email at google.com
Cited by 3322

Investigating multilingual NMT representations at scale

SR Kudugunta, A Bapna, I Caswell… - arXiv preprint arXiv …, 2019 - arxiv.org
Multilingual Neural Machine Translation (NMT) models have yielded large empirical
success in transfer learning settings. However, these black-box representations are poorly …

Beyond distillation: Task-level mixture-of-experts for efficient inference

S Kudugunta, Y Huang, A Bapna, M Krikun… - arXiv preprint arXiv …, 2021 - arxiv.org
Sparse Mixture-of-Experts (MoE) has been a successful approach for scaling multilingual
translation models to billions of parameters without a proportional increase in training …

MiTTenS: A Dataset for Evaluating Gender Mistranslation

K Robinson, S Kudugunta, R Stella… - Proceedings of the …, 2024 - aclanthology.org
Translation systems, including foundation models capable of translation, can produce errors
that result in gender mistranslation, and such errors can be especially harmful. To measure …

Exploring routing strategies for multilingual mixture-of-experts models

S Kudugunta, Y Huang, A Bapna, M Krikun, D Lepikhin… - 2021 - openreview.net
Sparsely-Gated Mixture-of-Experts (MoE) has been a successful approach for scaling multilingual
translation models to billions of parameters without a proportional increase in training …

Buffet: Benchmarking large language models for few-shot cross-lingual transfer

A Asai, S Kudugunta, XV Yu, T Blevins… - arXiv preprint arXiv …, 2023 - arxiv.org
Despite remarkable advancements in few-shot generalization in natural language processing,
most models are developed and evaluated primarily in English. To facilitate research on …

Gradient vaccine: Investigating and improving multi-task optimization in massively multilingual models

Z Wang, Y Tsvetkov, O Firat, Y Cao - arXiv preprint arXiv:2010.05874, 2020 - arxiv.org
Massively multilingual models subsuming tens or even hundreds of languages pose great
challenges to multi-task optimization. While it is a common practice to apply a language-…

BERT is not an interlingua and the bias of tokenization

J Singh, B McCann, R Socher… - Proceedings of the 2nd …, 2019 - aclanthology.org
Multilingual transfer learning can benefit both high- and low-resource languages, but the
source of these improvements is not well understood. Canonical Correlation Analysis (CCA) of …