default search action
Nanxin Chen
Person information
Refine list
refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
2020 – today
- 2024
- [i27]Zhong Meng, Zelin Wu, Rohit Prabhavalkar, Cal Peyser, Weiran Wang, Nanxin Chen, Tara N. Sainath, Bhuvana Ramabhadran:
Text Injection for Neural Contextual Biasing. CoRR abs/2406.02921 (2024) - [i26]Xuan Kan, Yonghui Xiao, Tien-Ju Yang, Nanxin Chen, Rajiv Mathews:
Parameter-Efficient Transfer Learning under Federated Learning for Automatic Speech Recognition. CoRR abs/2408.11873 (2024) - 2023
- [c34]Yuan Gao, Nobuyuki Morioka, Yu Zhang, Nanxin Chen:
E3 TTS: Easy End-to-End Diffusion-Based Text To Speech. ASRU 2023: 1-8 - [c33]Mingqiu Wang, Wei Han, Izhak Shafran, Zelin Wu, Chung-Cheng Chiu, Yuan Cao, Nanxin Chen, Yu Zhang, Hagen Soltau, Paul K. Rubenstein, Lukas Zilka, Dian Yu, Golan Pundak, Nikhil Siddhartha, Johan Schalkwyk, Yonghui Wu:
SLM: Bridge the Thin Gap Between Speech and Text Foundation Models. ASRU 2023: 1-8 - [c32]Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman:
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition. ICASSP 2023: 1-5 - [c31]Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee:
A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition. ICASSP 2023: 1-5 - [c30]Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-yi Lee, Tara N. Sainath:
How to Estimate Model Transferability of Pre-Trained Speech Models? INTERSPEECH 2023: 456-460 - [i25]Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman:
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition. CoRR abs/2301.07851 (2023) - [i24]Qingqing Huang, Daniel S. Park, Tao Wang, Timo I. Denk, Andy Ly, Nanxin Chen, Zhengdong Zhang, Zhishuai Zhang, Jiahui Yu, Christian Havnø Frank, Jesse H. Engel, Quoc V. Le, William Chan, Wei Han:
Noise2Music: Text-conditioned Music Generation with Diffusion Models. CoRR abs/2302.03917 (2023) - [i23]Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara N. Sainath, Pedro J. Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu:
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages. CoRR abs/2303.01037 (2023) - [i22]Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-yi Lee, Tara N. Sainath:
How to Estimate Model Transferability of Pre-Trained Speech Models? CoRR abs/2306.01015 (2023) - [i21]Nanxin Chen, Izhak Shafran, Yu Zhang, Chung-Cheng Chiu, Hagen Soltau, James Qin, Yonghui Wu:
Efficient Adapters for Giant Speech Models. CoRR abs/2306.08131 (2023) - [i20]Mingqiu Wang, Wei Han, Izhak Shafran, Zelin Wu, Chung-Cheng Chiu, Yuan Cao, Yongqiang Wang, Nanxin Chen, Yu Zhang, Hagen Soltau, Paul K. Rubenstein, Lukas Zilka, Dian Yu, Zhong Meng, Golan Pundak, Nikhil Siddhartha, Johan Schalkwyk, Yonghui Wu:
SLM: Bridge the thin gap between speech and text foundation models. CoRR abs/2310.00230 (2023) - [i19]Yuan Gao, Nobuyuki Morioka, Yu Zhang, Nanxin Chen:
E3 TTS: Easy End-to-End Diffusion-based Text to Speech. CoRR abs/2311.00945 (2023) - 2022
- [c29]Yuma Koizumi, Heiga Zen, Kohei Yatabe, Nanxin Chen, Michiel Bacchiani:
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping. INTERSPEECH 2022: 803-807 - [c28]Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging Joint Speech-Text Representation Learning for Zero Supervised Speech ASR. SLT 2022: 68-75 - [i18]Yuma Koizumi, Heiga Zen, Kohei Yatabe, Nanxin Chen, Michiel Bacchiani:
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping. CoRR abs/2203.16749 (2022) - [i17]Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR. CoRR abs/2210.10027 (2022) - [i16]Nobuyuki Morioka, Heiga Zen, Nanxin Chen, Yu Zhang, Yifan Ding:
Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation. CoRR abs/2210.15868 (2022) - [i15]Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee:
A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition. CoRR abs/2211.01263 (2022) - 2021
- [j6]Nanxin Chen, Shinji Watanabe, Jesús Villalba, Piotr Zelasko, Najim Dehak:
Non-Autoregressive Transformer for Speech Recognition. IEEE Signal Process. Lett. 28: 121-125 (2021) - [c27]Yosuke Higuchi, Nanxin Chen, Yuya Fujita, Hirofumi Inaguma, Tatsuya Komatsu, Jaesong Lee, Jumon Nozaki, Tianzi Wang, Shinji Watanabe:
A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation. ASRU 2021: 47-54 - [c26]Nanxin Chen, Piotr Zelasko, Jesús Villalba, Najim Dehak:
Focus on the Present: A Regularization Method for the ASR Source-Target Attention Layer. ICASSP 2021: 5994-5998 - [c25]Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan:
WaveGrad: Estimating Gradients for Waveform Generation. ICLR 2021 - [c24]Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan:
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis. Interspeech 2021: 3765-3769 - [c23]Nanxin Chen, Piotr Zelasko, Laureano Moro-Velázquez, Jesús Villalba, Najim Dehak:
Align-Denoise: Single-Pass Non-Autoregressive Speech Recognition. Interspeech 2021: 3770-3774 - [i14]Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan:
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis. CoRR abs/2106.09660 (2021) - [i13]Yosuke Higuchi, Nanxin Chen, Yuya Fujita, Hirofumi Inaguma, Tatsuya Komatsu, Jaesong Lee, Jumon Nozaki, Tianzi Wang, Shinji Watanabe:
A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation. CoRR abs/2110.05249 (2021) - 2020
- [j5]Jesús Villalba, Nanxin Chen, David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Jonas Borgstrom, Leibny Paola García-Perera, Fred Richardson, Réda Dehak, Pedro A. Torres-Carrasquillo, Najim Dehak:
State-of-the-art speaker recognition with neural network embeddings in NIST SRE18 and Speakers in the Wild evaluations. Comput. Speech Lang. 60 (2020) - [c22]Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Fuming Fang, Xin Wang, Nanxin Chen, Junichi Yamagishi:
Zero-Shot Multi-Speaker Text-To-Speech with State-Of-The-Art Neural Speaker Embeddings. ICASSP 2020: 6184-6188 - [c21]Raghavendra Pappagari, Tianzi Wang, Jesús Villalba, Nanxin Chen, Najim Dehak:
X-Vectors Meet Emotions: A Study On Dependencies Between Emotion and Speaker Recognition. ICASSP 2020: 7169-7173 - [c20]Saurabh Kataria, Phani Sankar Nidadavolu, Jesús Villalba, Nanxin Chen, L. Paola García-Perera, Najim Dehak:
Feature Enhancement with Deep Feature Losses for Speaker Verification. ICASSP 2020: 7584-7588 - [c19]Andrew Titus, Jan Silovský, Nanxin Chen, Roger Hsiao, Mary Young, Arnab Ghoshal:
Improving Language Identification for Multilingual Speakers. ICASSP 2020: 8284-8288 - [c18]Adrian Lancucki, Jan Chorowski, Guillaume Sanchez, Ricard Marxer, Nanxin Chen, Hans J. G. A. Dolfing, Sameer Khurana, Tanel Alumäe, Antoine Laurent:
Robust Training of Vector Quantized Bottleneck Models. IJCNN 2020: 1-7 - [c17]Yosuke Higuchi, Shinji Watanabe, Nanxin Chen, Tetsuji Ogawa, Tetsunori Kobayashi:
Mask CTC: Non-Autoregressive End-to-End ASR with CTC and Mask Predict. INTERSPEECH 2020: 3655-3659 - [c16]Jesús Antonio Villalba López, Daniel Garcia-Romero, Nanxin Chen, Gregory Sell, Jonas Borgstrom, Alan McCree, Leibny Paola García-Perera, Saurabh Kataria, Phani Sankar Nidadavolu, Pedro Torres-Carrasquiilo, Najim Dehak:
Advances in Speaker Recognition for Telephone and Audio-Visual Data: the JHU-MIT Submission for NIST SRE19. Odyssey 2020: 273-280 - [i12]Andrew Titus, Jan Silovský, Nanxin Chen, Roger Hsiao, Mary Young, Arnab Ghoshal:
Improving Language Identification for Multilingual Speakers. CoRR abs/2001.11019 (2020) - [i11]Raghavendra Pappagari, Tianzi Wang, Jesús Villalba, Nanxin Chen, Najim Dehak:
x-vectors meet emotions: A study on dependencies between emotion and speaker recognition. CoRR abs/2002.05039 (2020) - [i10]Adrian Lancucki, Jan Chorowski, Guillaume Sanchez, Ricard Marxer, Nanxin Chen, Hans J. G. A. Dolfing, Sameer Khurana, Tanel Alumäe, Antoine Laurent:
Robust Training of Vector Quantized Bottleneck Models. CoRR abs/2005.08520 (2020) - [i9]Yosuke Higuchi, Shinji Watanabe, Nanxin Chen, Tetsuji Ogawa, Tetsunori Kobayashi:
Mask CTC: Non-Autoregressive End-to-End ASR with CTC and Mask Predict. CoRR abs/2005.08700 (2020) - [i8]Heinrich Dinkel, Nanxin Chen, Yanmin Qian, Kai Yu:
End-to-end spoofing detection with raw waveform CLDNNs. CoRR abs/2007.13060 (2020) - [i7]Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan:
WaveGrad: Estimating Gradients for Waveform Generation. CoRR abs/2009.00713 (2020) - [i6]Nanxin Chen, Piotr Zelasko, Jesús Villalba, Najim Dehak:
Focus on the present: a regularization method for the ASR source-target attention layer. CoRR abs/2011.01210 (2020)
2010 – 2019
- 2019
- [c15]Shigeki Karita, Xiaofei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto:
A Comparative Study on Transformer vs RNN in Speech Applications. ASRU 2019: 449-456 - [c14]Cheng-I Lai, Nanxin Chen, Jesús Villalba, Najim Dehak:
ASSERT: Anti-Spoofing with Squeeze-Excitation and Residual Networks. INTERSPEECH 2019: 1013-1017 - [c13]Jesús Villalba, Nanxin Chen, David Snyder, Daniel Garcia-Romero, Alan McCree, Gregory Sell, Jonas Borgstrom, Fred Richardson, Suwon Shon, François Grondin, Réda Dehak, Leibny Paola García-Perera, Daniel Povey, Pedro A. Torres-Carrasquillo, Sanjeev Khudanpur, Najim Dehak:
State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18. INTERSPEECH 2019: 1488-1492 - [c12]David Snyder, Jesús Villalba, Nanxin Chen, Daniel Povey, Gregory Sell, Najim Dehak, Sanjeev Khudanpur:
The JHU Speaker Recognition System for the VOiCES 2019 Challenge. INTERSPEECH 2019: 2468-2472 - [c11]Nanxin Chen, Jesús Villalba, Najim Dehak:
Tied Mixture of Factor Analyzers Layer to Combine Frame Level Representations in Neural Speaker Embeddings. INTERSPEECH 2019: 2948-2952 - [i5]Cheng-I Lai, Nanxin Chen, Jesús Villalba, Najim Dehak:
ASSERT: Anti-Spoofing with Squeeze-Excitation and Residual neTworks. CoRR abs/1904.01120 (2019) - [i4]Shigeki Karita, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto, Xiaofei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang:
A Comparative Study on Transformer vs RNN in Speech Applications. CoRR abs/1909.06317 (2019) - [i3]Saurabh Kataria, Phani Sankar Nidadavolu, Jesús Villalba, Nanxin Chen, Paola García, Najim Dehak:
Feature Enhancement with Deep Feature Losses for Speaker Verification. CoRR abs/1910.11905 (2019) - [i2]Nanxin Chen, Shinji Watanabe, Jesús Villalba, Najim Dehak:
Listen and Fill in the Missing Letters: Non-Autoregressive Transformer for Speech Recognition. CoRR abs/1911.04908 (2019) - 2018
- [j4]Rubén Zazo-Candil, Phani Sankar Nidadavolu, Nanxin Chen, Joaquin Gonzalez-Rodriguez, Najim Dehak:
Age Estimation in Short Speech Utterances Based on LSTM Recurrent Neural Networks. IEEE Access 6: 22524-22530 (2018) - [c10]Nanxin Chen, Jesús Villalba, Yishay Carmiel, Najim Dehak:
Measuring Uncertainty in Deep Regression Models: The Case of Age Estimation from Speech. ICASSP 2018: 4939-4943 - [c9]Nanxin Chen, Jesús Villalba, Najim Dehak:
An Investigation of Non-linear i-vectors for Speaker Verification. INTERSPEECH 2018: 87-91 - [c8]Pegah Ghahremani, Phani Sankar Nidadavolu, Nanxin Chen, Jesús Villalba, Daniel Povey, Sanjeev Khudanpur, Najim Dehak:
End-to-end Deep Neural Network Age Estimation. INTERSPEECH 2018: 277-281 - [c7]Shinji Watanabe, Takaaki Hori, Shigeki Karita, Tomoki Hayashi, Jiro Nishitoba, Yuya Unno, Nelson Enrique Yalta Soplin, Jahn Heymann, Matthew Wiesner, Nanxin Chen, Adithya Renduchintala, Tsubasa Ochiai:
ESPnet: End-to-End Speech Processing Toolkit. INTERSPEECH 2018: 2207-2211 - [c6]Fred Richardson, Pedro A. Torres-Carrasquillo, Jonas Borgstrom, Douglas E. Sturim, Youngjune Gwon, Jesús Villalba, Jan Trmal, Nanxin Chen, Réda Dehak, Najim Dehak:
The MIT Lincoln Laboratory / JHU / EPITA-LSE LRE17 System. Odyssey 2018: 54-59 - [i1]Shinji Watanabe, Takaaki Hori, Shigeki Karita, Tomoki Hayashi, Jiro Nishitoba, Yuya Unno, Nelson Enrique Yalta Soplin, Jahn Heymann, Matthew Wiesner, Nanxin Chen, Adithya Renduchintala, Tsubasa Ochiai:
ESPnet: End-to-End Speech Processing Toolkit. CoRR abs/1804.00015 (2018) - 2017
- [j3]Yanmin Qian, Nanxin Chen, Heinrich Dinkel, Zhizheng Wu:
Deep Feature Engineering for Noise Robust Spoofing Detection. IEEE ACM Trans. Audio Speech Lang. Process. 25(10): 1942-1955 (2017) - [c5]Heinrich Dinkel, Nanxin Chen, Yanmin Qian, Kai Yu:
End-to-end spoofing detection with raw waveform CLDNNS. ICASSP 2017: 4860-4864 - 2016
- [j2]Yanmin Qian, Nanxin Chen, Kai Yu:
Deep features for automatic spoofing detection. Speech Commun. 85: 43-52 (2016) - [c4]Pavel Korshunov, Sébastien Marcel, Hannah Muckenhirn, André R. Gonçalves, A. G. Souza Mello, Ricardo Paranhos Velloso Violato, Flávio Olmos Simões, Mário Uliani Neto, Marcus de Assis Angeloni, José Augusto Stuchi, Heinrich Dinkel, Nanxin Chen, Yanmin Qian, Dipjyoti Paul, Goutam Saha, Md. Sahidullah:
Overview of BTAS 2016 speaker anti-spoofing competition. BTAS 2016: 1-6 - 2015
- [j1]Yuan Liu, Yanmin Qian, Nanxin Chen, Tianfan Fu, Ya Zhang, Kai Yu:
Deep feature for text-dependent speaker verification. Speech Commun. 73: 1-13 (2015) - [c3]Nanxin Chen, Yanmin Qian, Kai Yu:
Multi-task learning for text-dependent speaker verification. INTERSPEECH 2015: 185-189 - [c2]Nanxin Chen, Yanmin Qian, Heinrich Dinkel, Bo Chen, Kai Yu:
Robust deep feature for spoofing detection - the SJTU system for ASVspoof 2015 challenge. INTERSPEECH 2015: 2097-2101 - 2014
- [c1]Nanxin Chen, Qingling Duan, Jianqin Wang, Ruizhi Sun:
Development of Early-Warning Model for Intensive Pig Breeding. CCTA 2014: 616-626
Coauthor Index
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from , , and to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2024-10-07 22:24 CEST by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint