Yu Zhang 0033
Person information
- affiliation: Google
- affiliation (PhD 2017): Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
Other persons with the same name
- Yu Zhang — disambiguation page
- Yu Zhang 0001 — Loughborough University, Department of Aeronautical and Automotive Engineering, UK (and 2 more)
- Yu Zhang 0002 — Pennsylvania State University, University Park, PA, USA (and 2 more)
- Yu Zhang 0003 — Hainan Normal University, Haikou, China
- Yu Zhang 0004 — Southeast University, School of Computer Science and Engineering, Nanjing, China (and 1 more)
- Yu Zhang 0005 — University of California, Santa Cruz, CA, USA (and 2 more)
- Yu Zhang 0006 — Southern University of Science and Technology, Shenzhen, China (and 3 more)
- Yu Zhang 0007 — Microsoft Research Asia, China
- Yu Zhang 0008 — Zhejiang University, College of Computer Science, Hangzhou, China
- Yu Zhang 0009 — Lehigh University, Department of Bioengineering, Bethlehem, PA, USA (and 2 more)
- Yu Zhang 0010 — Northwestern University, Department of Chemistry & Center for Bio-inspired Energy Science (CBES), Evanston, IL, USA
- Yu Zhang 0011 — South China Normal University, Guangzhou, China
- Yu Zhang 0012 — Southeast University, National Mobile Communications Research Laboratory, Nanjing, China
- Yu Zhang 0013 — Pennsylvania State University, College of Information Sciences and Technology, State College, PA, USA
- Yu Zhang 0014 — Xidian University, State Key Laboratory of Integrated Services Networks, Xi'an, China
- Yu Zhang 0015 — Zhejiang University of Technology, College of Information Engineering, Hangzhou, China
- Yu Zhang 0016 — Jilin University, College of Computer Science and Technology, Changchun, China
- Yu Zhang 0017 — Sichuan University, West China Hospital, Department of Radiology, Chengdu, China
- Yu Zhang 0018 — Zhejiang University, College of Control Science and Engineering, Hangzhou, China (and 3 more)
- Yu Zhang 0019 — Wuhan University, Electronic and Information School, Signal Processing Lab, China
- Yu Zhang 0020 — Tongji University, School of Mathematics, Shanghai, China
- Yu Zhang 0021 — Guangdong Peizheng College, English Department, Guangzhou, China
- Yu Zhang 0022 — University of California Santa Barbara, Department of Education, CA, USA
- Yu Zhang 0023 — Hubei University for Nationalities, School of Information Engineering, Enshi, China
- Yu Zhang 0024 — Northeastern University, Software College, Shenyang, China
- Yu Zhang 0025 — Microsoft, Online Service Division, Sunnyvale, CA, USA (and 1 more)
- Yu Zhang 0026 — Tsinghua University, Department of Electronic Engineering, Beijing, China
- Yu Zhang 0027 — Huazhong University of Science and Technology, School of Computer Science and Technology, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, Wuhan, China
- Yu Zhang 0028 — Data Storage Institute, A-STAR, Singapore (and 1 more)
- Yu Zhang 0029 — Chinese Academy of Sciences, Northwest Institute of Eco-Environment and Resources, Lanzhou, China
- Yu Zhang 0030 — Harbin Institute of Technology, Research Center for Social Computing and Information Retrieval, China
- Yu Zhang 0031 — Xinyang Normal University, Department of Computer Science and Technology, China
- Yu Zhang 0032 — Shanghai Jiao Tong University, School of Electronic Information and Electrical Engineering, China
- Yu Zhang 0034 — RMIT University, Melbourne, VIC, Australia (and 1 more)
- Yu Zhang 0035 — SenseTime Group Limited, Beijing, China (and 1 more)
- Yu Zhang 0036 — Harbin Institute of Technology, School of Computer Science and Technology, China
- Yu Zhang 0037 — Beijing Institute of Technology, School of Mechanical Engineering, China
- Yu Zhang 0038 — Harbin Engineering University, College of Information and Communication Engineering, China
- Yu Zhang 0039 — Huazhong University of Science and Technology, School of Electronic Information and Communications, Wuhan, China
- Yu Zhang 0040 — Shaanxi Normal University, MOE Key Laboratory of Modern Teaching Technology, Xi'an, China
- Yu Zhang 0041 — University of North Carolina at Chapel Hill, NC, USA
- Yu Zhang 0042 — State Grid Energy Research Institute Co., Ltd., Beijing, China (and 2 more)
- Yu Zhang 0043 — University of Oxford, Department of Computer Science, UK (and 1 more)
- Yu Zhang 0044 — Texas A&M University, Department of Computer Science & Engineering, TX, USA (and 1 more)
- Yu Zhang 0045 — Harbin Institute of Technology, State Key Laboratory of Robotics and System, Harbin, China
- Yu Zhang 0046 — Xiamen University, Key Laboratory of Underwater Acoustic Communication and Marine Information Technology, Xiamen, China
- Yu Zhang 0047 — Tsinghua University, Department of Automation, Beijing, China
- Yu Zhang 0048 — Northeast Electric Power University, College of Information Engineering, Jilin, China
- Yu Zhang 0049 — Macquarie University, Australian School of Advanced Medicine, Sydney, Australia
- Yu Zhang 0050 — Tsinghua University, Department of Electronic Engineering, Beijing, China (and 1 more)
- Yu Zhang 0051 — Chemnitz University of Technology, Germany
- Yu Zhang 0052 — École centrale de Lyon, France
- Yu Zhang 0053 — Cardiff University, UK
- Yu Zhang 0054 — Qualcomm Inc., Beijing, China (and 1 more)
- Yu Zhang 0055 — Arizona State University, Computer Science and Engineering Department, Tempe, AZ, USA (and 1 more)
- Yu Zhang 0056 — Anhui University, Information Materials and Intelligent Sensing Laboratory of Anhui Province, Hefei, China (and 1 more)
- Yu Zhang 0057 — Air Force Engineering University, Aeronautics and Astronautics Engineering College, Xi'an, China
- Yu Zhang 0058 — Chongqing University, College of Communication Engineering, China
- Yu Zhang 0059 — University of California, Los Angeles, USA
- Yu Zhang 0060 — National University of Defense Technology, Key Laboratory of Science and Technology on ATR, Changsha, China
- Yu Zhang 0061 — University of Tokyo, Graduate School of Agricultural and Life Sciences, Japan
- Yu Zhang 0062 — Xidian University, Video and Image Processing System Laboratory, China
- Yu Zhang 0063 — Chinese University of Hong Kong, Electronic Engineering Department
- Yu Zhang 0064 — Southern Medical University, School of Biomedical Engineering, Guangzhou, China
- Yu Zhang 0065 — Yanshan University, Institute of Electrical Engineering, Qinhuangdao, China
- Yu Zhang 0066 — China South Industries Group Corporation, Weapon Equipment Research Institute, Beijing, China
- Yu Zhang 0067 — Liaoning Technical University, School of Science, Fuxin, China
- Yu Zhang 0068 — Nanjing University of Aeronautics and Astronautics, Key Laboratory of Radar Imaging and Microwave Photonics, Nanjing, China
- Yu Zhang 0069 — Chinese Academy of Sciences and Ministry of Water Resources, Institute of Soil and Water Conservation, Yangling, China
- Yu Zhang 0070 — Shanghai Ocean University, College of Marine Sciences, China
- Yu Zhang 0071 — Hainan University, State Key Laboratory of Marine Resource Utilization in South China Sea, Haikou, China
- Yu Zhang 0072 — Fudan University, Shanghai Key Laboratory of Intelligent Information Processing, China
- Yu Zhang 0073 — Northeastern University, Department of Systems Engineering, State Key Lab of Synthetic Automation of Process Industries, Shenyang, China
- Yu Zhang 0074 — Pennsylvania State University, Department of Civil and Environmental Engineering, University Park, USA
- Yu Zhang 0075 — Institute of High Performance Computing, Singapore (and 1 more)
- Yu Zhang 0076 — Southwest Jiaotong University, School of Physical Science and Technology, Chengdu, China
- Yu Zhang 0077 — Hangzhou Dianzi University, School of Electronics and Information, China (and 1 more)
- Yu Zhang 0078 — Tianjin University, School of Precision Instrument and Opto-electronics Engineering, State Key Laboratory of Precision Measuring Technology and Instruments, China (and 1 more)
- Yu Zhang 0079 — Beijing Institute of Technology, School of Information and Electronics, China (and 1 more)
- Yu Zhang 0080 — Chinese Academy of Space Technology, Beijing Orient Institute of Measurement and Test, China (and 1 more)
- Yu Zhang 0081 — University of Science and Technology Beijing, Donlinks School of Economics and Management, China (and 1 more)
- Yu Zhang 0082 — Army Engineering University of PLA, College of Communication Engineering, Nanjing, China
- Yu Zhang 0083 — Tsinghua University, Institute of Education, Beijing, China (and 1 more)
- Yu Zhang 0084 — Nanyang Technological University, School of Computer Science and Engineering, Singapore
- Yu Zhang 0085 — Southwest University, College of Computer and Information Science, Chongqing, China (and 1 more)
- Yu Zhang 0086 — University of Science and Technology of China, Lab for Intelligent Networking and Knowledge Engineering, Hefei, China
- Yu Zhang 0087 — University of South Florida, Department of Civil and Environmental Engineering, Tampa, FL, USA (and 1 more)
- Yu Zhang 0088 — Delphi Automotive, Agoura Hills, CA, USA (and 1 more)
- Yu Zhang 0089 — Jiangsu Normal University, School of Geography, Geomatics and Planning, Department of Land Resource Management, Xuzhou, China (and 1 more)
- Yu Zhang 0090 — Beijing Normal University, College of Artificial Intelligence, China
- Yu Zhang 0091 — Harbin Institute of Technology, School of Astronautics, National Key Laboratory of Tunable Laser Technology, China
- Yu Zhang 0092 — Soochow University, China
- Yu Zhang 0093 — Macquarie University, Sydney, Australia (and 1 more)
- Yu Zhang 0094 — University of Kentucky, Department of Computer Science, Lexington, KY, USA
- Yu Zhang 0095 — Nankai University, College of Computer Science, Tianjin Key Laboratory of Network and Data Security Technology, Tianjin, China (and 1 more)
- Yu Zhang 0096 (aka: Fiona Zhang) — Hong Kong Polytechnic University, Department of Mechanical Engineering, Hong Kong (and 2 more)
- Yu Zhang 0097 — City University of Hong Kong, Hong Kong (and 2 more)
- Yu Zhang 0098 — Fujian Jiangxia University, Department of Electronic Information Science, Fuzhou, China
- Yu Zhang 0099 — University of California Davis, Department of Electrical and Computer Engineering, CA, USA
- Yu Zhang 0100 — Chongqing University of Posts and Telecommunications, School of Economics and Management, China
- Yu Zhang 0101 — Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, China
- Yu Zhang 0102 — Chinese Academy of Sciences, Institute of Microelectronics, Beijing, China (and 1 more)
- Yu Zhang 0103 — Nanjing University of Aeronautics and Astronautics, Department of Mathematics, State Key Laboratory of Mechanics and Control of Mechanical Structures, China
- Yu Zhang 0104 — Chinese Academy of Sciences, Key Laboratory of Ecosystem Network Observation and Modeling, Beijing, China (and 1 more)
- Yu Zhang 0105 — Information Engineering University Zhengzhou, College of Cryptographic Engineering, China
- Yu Zhang 0106 — Changchun Institute of Technology, College of Computer Science and Engineering, China
- Yu Zhang 0107 — Hunan University, College of Computer Science and Electronic Engineering, Changsha, China
- Yu Zhang 0108 — East China University of Science and Technology, MOE Key Laboratory of Advanced Control and Optimization for Chemical Process, Shanghai, China (and 1 more)
- Yu Zhang 0109 — Tongji University, MOE Key Laboratory of Road and Traffic Engineering, Shanghai, China
- Yu Zhang 0110 — Jilin University, College of Electronic Science and Engineering, State Key Laboratory of Integrated Optoelectronics, Changchun, China
- Yu Zhang 0111 — Southern University of Science and Technology, Department of Computer Science and Engineering, Guangdong Key Laboratory of Brain-Inspired Intelligent Computation, Shenzhen, China
- Yu Zhang 0112 — East China Normal University, Department of Computer Science and Technology, Shanghai, China
- Yu Zhang 0113 — Dalian Medical University, Second Affiliated Hospital, China
- Yu Zhang 0114 — Catalonia Institute for Energy Research (IREC), Spain
- Yu Zhang 0115 — University of Montreal, Department of Psychology, QC, Canada
- Yu Zhang 0116 — Shenyang University of Technology, Department of Mechanical Engineering, China
- Yu Zhang 0117 — Beijing University of Posts and Telecommunications, State Key Laboratory of Networking and Switching Technology, China (and 1 more)
- Yu Zhang 0118 — Liaoning Technical University, School of Software, Fuxin, China
- Yu Zhang 0119 — Shenzhen University, College of Life Sciences and Oceanography, Guangdong Engineering Research Center for Marine Algal Biotechnology, China
- Yu Zhang 0120 — Shandong Normal University, School of Information Science and Engineering, China
- Yu Zhang 0121 — Zhejiang Normal University, Institute of Precision Machinery and Smart Structure, College of Engineering, Jinhua, China
- Yu Zhang 0122 — China Academy of Information and Communications Technology, Beijing, China
- Yu Zhang 0123 — Arizona State University, School of Electrical, Computer and Energy Engineering, Tempe, AZ, USA
- Yu Zhang 0124 — Southeast University, School of Computer Science and Engineering, Nanjing, China (and 2 more)
- Yu Zhang 0125 — University of Macau, Faculty of Sciences and Technology, China (and 2 more)
- Yu Zhang 0126 — Zhejiang University, Hangzhou, China
- Yu Zhang 0127 — University of Münster, Germany
- Yu Zhang 0128 — University of Sheffield, Department of Computer Science, UK
- Yu Zhang 0129 — Nanjing Agricultural University, National Engineering and Technology Center for Information Agriculture, China
- Yu Zhang 0130 — Chinese Academy of Sciences, Computer Network Information Center, Beijing, China
- Yu Zhang 0131 — China University of Mining and Technology, Internet of Things (Perception Mine) Research Center, Xuzhou, China
- Yu Zhang 0132 — Ohio State University, School of Earth Sciences, Columbus, OH, USA
- Yu Zhang 0133 — Tongji University, Shanghai, China (and 2 more)
- Yu Zhang 0134 — Southeast University, Department of Construction and Real Estate, Nanjing, China
- Yu Zhang 0135 — Southeast University, School of Instrumentation Science and Engineering, Nanjing, China (and 1 more)
- Yu Zhang 0136 — Harbin Institute of Technology, School of Architecture, China
- Yu Zhang 0137 — China University of Petroleum (East China), College of Computer Science and Technology, Qingdao, China
- Yu Zhang 0138 — Zhengzhou Normal University, School of Information Science and Technology, China (and 1 more)
- Yu Zhang 0139 — Guilin University of Technology, College of Mechanical and Control Engineering, China
- Yu Zhang 0140 — Shanghai Jiao Tong University, Institute of Oceanography, China
- Yu Zhang 0141 — Shanghai Maritime University, College of Transport and Communications, China
- Yu Zhang 0142 — State Grid Liaoning Electric Power Co., Ltd, Information and Communication Branch, China
- Yu Zhang 0143 — Beijing University of Technology, Faculty of Information Technology, China
- Yu Zhang 0144 — South China University of Technology, Guangzhou, China
- Yu Zhang 0145 — State Grid Jiangsu Electric Power Co. Ltd., Electric Power Research Institute, China
- Yu Zhang 0146 — Zhejiang University of Technology, College of Information Engineering, Hangzhou, China
- Yu Zhang 0147 — Xidian University, Academy of Advanced Interdisciplinary Research, Xi'an, China
- Yu Zhang 0148 — Iowa State University, Department of Electrical and Computer Engineering, Ames, IA, USA
- Yu Zhang 0149 — Guangdong Academy of Medical Sciences, Department of Orthopaedics, Guangzhou, China (and 1 more)
- Yu Zhang 0150 — Trinity University, Department of Computer Science, San Antonio, TX, USA (and 1 more)
- Yu Zhang 0151 — Southern University of Science and Technology, First Affiliated Hospital, Second Clinical Medicine College of Jinan University, Shenzhen, China (and 1 more)
- Yu Zhang 0152 — Tsinghua University, School of Aerospace Engineering, Beijing, China
- Yu Zhang 0153 — Chongqing Jiaotong University, School of Economics and Management, China
- Yu Zhang 0154 — Tsinghua University, Department of Electronic Engineering, Beijing, China
- Yu Zhang 0155 — Wuhan University, GNSS Research Center, China
- Yu Zhang 0156 — China University of Geosciences, School of Geophysics and Information Technology, Beijing, China
- Yu Zhang 0157 — Xi'an Physical Education University, China
- Yu Zhang 0158 — Zhongnan University of Economics and Law, Institute of Operations Management and System Engineering, School of Business Administration, Wuhan, China
- Yu Zhang 0159 — Southwestern University of Finance and Economics, Research Institute of Economics and Management, Chengdu, China (and 1 more)
- Yu Zhang 0160 — Shanxi Normal University, School of Geographical Sciences, Taiyuan, China
- Yu Zhang 0161 — ETH Zurich, Engineering Design and Computing Lab, Switzerland
- Yu Zhang 0162 — Anhui University of Science and Technology, School of Computer Science and Engineering, Huainan, China
- Yu Zhang 0163 — Guangxi University of Finance and Economics, School of Management Science and Engineering, Nanning, China
- Yu Zhang 0164 — Guangxi Normal University, College of Electronic Engineering, Guilin, China
- Yu Zhang 0165 — Meituan, Shanghai, China (and 1 more)
- Yu Zhang 0166 — Prometheus Vision Technology, Zhuhai, China
- Yu Zhang 0167 — State Grid Gansu Electric Power Company Integrated Service Center, China
- Yu Zhang 0168 — Beijing Institute of Satellite Environmental Engineering, China
- Yu Zhang 0169 — Minzu University of China, China
- Yu Zhang 0170 — Hebei University, School of Cyber Security and Computer, Key Laboratory on High Trusted Information System in Hebei Province, China
- Yu Zhang 0171 — German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany (and 1 more)
- Yu Zhang 0172 — Siemens (China) Co. Ltd., China
- Yu Zhang 0173 — Shanghai Polytechnic University, School of Computer and Electronic Information, China
- Yu Zhang 0174 — Dongfeng Nissan Passenger Vehicle Company, China
- Yu Zhang 0175 — Beijing Institute of Technology, China
- Yu Zhang 0176 — Alibaba Group, Beijing, China
- Yu Zhang 0177 — University of Sydney, Faculty of Engineering, NSW, Australia
- Yu Zhang 0178 — Xi'an University of Posts and Telecommunications, School of Communications and Information Engineering / School of Artificial Intelligence, China
- Yu Zhang 0179 — Kunming University of Science and Technology, China
- Yu Zhang 0180 — National Institute of Metrology, China
- Yu Zhang 0181 — Dalian Maritime University, College of Information Science and Technology, China
- Yu Zhang 0182 — Technical University of Munich, Department of Computation, Information and Technology, Garching, Germany
- Yu Zhang 0183 — National Satellite Meteorological Center of China, Beijing, China (and 1 more)
- Yu Zhang 0184 — Chongqing University, College of Computer Science, China
- Yu Zhang 0185 — Tencent HealthCare, Shenzhen, China (and 1 more)
- Yu Zhang 0186 — Central Conservatory of Music, Department of Music AI and Information Technology, Beijing, China
- Yu Zhang 0187 — California State University, Department of Criminology, Fresno, CA, USA
- Yu Zhang 0188 — Purdue University, West Lafayette, IN, USA
2020 – today
- 2024
- [c111] Chenglin Yang, Siyuan Qiao, Yuan Cao, Yu Zhang, Tao Zhu, Alan L. Yuille, Jiahui Yu: IG Captioner: Information Gain Captioners Are Strong Zero-Shot Classifiers. ECCV (64) 2024: 474-490
- [c110] W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath: Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study. ICASSP 2024: 13306-13310
- [i97] W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath: Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study. CoRR abs/2401.12789 (2024)
- 2023
- [c109] Yuan Gao, Nobuyuki Morioka, Yu Zhang, Nanxin Chen: E3 TTS: Easy End-to-End Diffusion-Based Text To Speech. ASRU 2023: 1-8
- [c108] Ke Hu, Tara N. Sainath, Bo Li, Yu Zhang, Yong Cheng, Tao Wang, Yujing Zhang, Frederick Liu: Improving Multilingual and Code-Switching ASR Using Large Language Model Generated Text. ASRU 2023: 1-7
- [c107] Mingqiu Wang, Wei Han, Izhak Shafran, Zelin Wu, Chung-Cheng Chiu, Yuan Cao, Nanxin Chen, Yu Zhang, Hagen Soltau, Paul K. Rubenstein, Lukas Zilka, Dian Yu, Golan Pundak, Nikhil Siddhartha, Johan Schalkwyk, Yonghui Wu: SLM: Bridge the Thin Gap Between Speech and Text Foundation Models. ASRU 2023: 1-8
- [c106] Ke Hu, Tara N. Sainath, Bo Li, Nan Du, Yanping Huang, Andrew M. Dai, Yu Zhang, Rodrigo Cabrera, Zhifeng Chen, Trevor Strohman: Massively Multilingual Shallow Fusion with Large Language Models. ICASSP 2023: 1-5
- [c105] Dongseong Hwang, Khe Chai Sim, Yu Zhang, Trevor Strohman: Comparison of Soft and Hard Target RNN-T Distillation for Large-Scale ASR. ICASSP 2023: 1-5
- [c104] Bo Li, Dongseong Hwang, Zhouyuan Huo, Junwen Bai, Guru Prakash, Tara N. Sainath, Khe Chai Sim, Yu Zhang, Wei Han, Trevor Strohman, Françoise Beaufays: Efficient Domain Adaptation for Speech Foundation Models. ICASSP 2023: 1-5
- [c103] Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran: JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition. ICASSP 2023: 1-5
- [c102] Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran: Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-to-Speech. ICASSP 2023: 1-5
- [c101] Yongqiang Wang, Zhehuai Chen, Chengjian Zheng, Yu Zhang, Wei Han, Parisa Haghani: Accelerating RNN-T Training and Inference Using CTC Guidance. ICASSP 2023: 1-5
- [c100] Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang: Understanding Shared Speech-Text Representations. ICASSP 2023: 1-5
- [c99] Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman: From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition. ICASSP 2023: 1-5
- [c98] Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee: A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition. ICASSP 2023: 1-5
- [c97] Yong Cheng, Yu Zhang, Melvin Johnson, Wolfgang Macherey, Ankur Bapna: Mu2SLAM: Multitask, Multilingual Speech and Language Models. ICML 2023: 5504-5520
- [c96] Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-yi Lee, Tara N. Sainath: How to Estimate Model Transferability of Pre-Trained Speech Models? INTERSPEECH 2023: 456-460
- [c95] Ke Hu, Bo Li, Tara N. Sainath, Yu Zhang, Françoise Beaufays: Mixture-of-Expert Conformer for Streaming Multilingual ASR. INTERSPEECH 2023: 3327-3331
- [c94] Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna: LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus. INTERSPEECH 2023: 5496-5500
- [c93] Tao Lei, Junwen Bai, Siddhartha Brahma, Joshua Ainslie, Kenton Lee, Yanqi Zhou, Nan Du, Vincent Y. Zhao, Yuexin Wu, Bo Li, Yu Zhang, Ming-Wei Chang: Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference. NeurIPS 2023
- [c92] Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani: Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations. WASPAA 2023: 1-5
- [i96] Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman: From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition. CoRR abs/2301.07851 (2023)
- [i95] Bo Li, Dongseong Hwang, Zhouyuan Huo, Junwen Bai, Guru Prakash, Tara N. Sainath, Khe Chai Sim, Yu Zhang, Wei Han, Trevor Strohman, Françoise Beaufays: Efficient Domain Adaptation for Speech Foundation Models. CoRR abs/2302.01496 (2023)
- [i94] Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran: JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition. CoRR abs/2302.08583 (2023)
- [i93] Ke Hu, Tara N. Sainath, Bo Li, Nan Du, Yanping Huang, Andrew M. Dai, Yu Zhang, Rodrigo Cabrera, Zhifeng Chen, Trevor Strohman: Massively Multilingual Shallow Fusion with Large Language Models. CoRR abs/2302.08917 (2023)
- [i92] Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara N. Sainath, Pedro J. Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu: Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages. CoRR abs/2303.01037 (2023)
- [i91] Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani: Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations. CoRR abs/2303.01664 (2023)
- [i90] Tao Lei, Junwen Bai, Siddhartha Brahma, Joshua Ainslie, Kenton Lee, Yanqi Zhou, Nan Du, Vincent Y. Zhao, Yuexin Wu, Bo Li, Yu Zhang, Ming-Wei Chang: Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference. CoRR abs/2304.04947 (2023)
- [i89] Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang: Understanding Shared Speech-Text Representations. CoRR abs/2304.14514 (2023)
- [i88] Ke Hu, Bo Li, Tara N. Sainath, Yu Zhang, Françoise Beaufays: Mixture-of-Expert Conformer for Streaming Multilingual ASR. CoRR abs/2305.15663 (2023)
- [i87] Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna: LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus. CoRR abs/2305.18802 (2023)
- [i86] Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-yi Lee, Tara N. Sainath: How to Estimate Model Transferability of Pre-Trained Speech Models? CoRR abs/2306.01015 (2023)
- [i85] Nanxin Chen, Izhak Shafran, Yu Zhang, Chung-Cheng Chiu, Hagen Soltau, James Qin, Yonghui Wu: Efficient Adapters for Giant Speech Models. CoRR abs/2306.08131 (2023)
- [i84] Paul K. Rubenstein, Chulayuth Asawaroengchai, Duc Dung Nguyen, Ankur Bapna, Zalán Borsos, Félix de Chaumont Quitry, Peter Chen, Dalia El Badawy, Wei Han, Eugene Kharitonov, Hannah Muckenhirn, Dirk Padfield, James Qin, Danny Rozenberg, Tara N. Sainath, Johan Schalkwyk, Matthew Sharifi, Michelle Tadmor Ramanovich, Marco Tagliasacchi, Alexandru Tudor, Mihajlo Velimirovic, Damien Vincent, Jiahui Yu, Yongqiang Wang, Vicky Zayats, Neil Zeghidour, Yu Zhang, Zhishuai Zhang, Lukas Zilka, Christian Havnø Frank: AudioPaLM: A Large Language Model That Can Speak and Listen. CoRR abs/2306.12925 (2023)
- [i83] Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa: Multimodal Modeling For Spoken Language Identification. CoRR abs/2309.10567 (2023)
- [i82] Mingqiu Wang, Wei Han, Izhak Shafran, Zelin Wu, Chung-Cheng Chiu, Yuan Cao, Yongqiang Wang, Nanxin Chen, Yu Zhang, Hagen Soltau, Paul K. Rubenstein, Lukas Zilka, Dian Yu, Zhong Meng, Golan Pundak, Nikhil Siddhartha, Johan Schalkwyk, Yonghui Wu: SLM: Bridge the thin gap between speech and text foundation models. CoRR abs/2310.00230 (2023)
- [i81] Yuan Gao, Nobuyuki Morioka, Yu Zhang, Nanxin Chen: E3 TTS: Easy End-to-End Diffusion-based Text to Speech. CoRR abs/2311.00945 (2023)
- [i80] Chenglin Yang, Siyuan Qiao, Yuan Cao, Yu Zhang, Tao Zhu, Alan L. Yuille, Jiahui Yu: IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers. CoRR abs/2311.17072 (2023)
- 2022
- [j3]Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Pedro J. Moreno:
Ask2Mask: Guided Data Selection for Masked Speech Modeling. IEEE J. Sel. Top. Signal Process. 16(6): 1357-1366 (2022) - [j2]Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu:
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition. IEEE J. Sel. Top. Signal Process. 16(6): 1519-1532 (2022) - [c91]Joel Shor, Aren Jansen, Wei Han, Daniel S. Park, Yu Zhang:
Universal Paralinguistic Speech Representations Using self-Supervised Conformers. ICASSP 2022: 3169-3173 - [c90]Bo Li, Ruoming Pang, Yu Zhang, Tara N. Sainath, Trevor Strohman, Parisa Haghani, Yun Zhu, Brian Farris, Neeraj Gaur, Manasa Prasad:
Massively Multilingual ASR: A Lifelong Learning Solution. ICASSP 2022: 6397-6401 - [c89]Junwen Bai, Bo Li, Yu Zhang, Ankur Bapna, Nikhil Siddhartha, Khe Chai Sim, Tara N. Sainath:
Joint Unsupervised and Supervised Training for Multilingual ASR. ICASSP 2022: 6402-6406 - [c88]Qiujia Li, Yu Zhang, David Qiu, Yanzhang He, Liangliang Cao, Philip C. Woodland:
Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition. ICASSP 2022: 6537-6541 - [c87]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Gary Wang:
Tts4pretrain 2.0: Advancing the use of Text and Speech in ASR Pretraining with Consistency and Contrastive Losses. ICASSP 2022: 7677-7681 - [c86]Tara N. Sainath, Yanzhang He, Arun Narayanan, Rami Botros, Weiran Wang, David Qiu, Chung-Cheng Chiu, Rohit Prabhavalkar, Alexander Gruenstein, Anmol Gulati, Bo Li, David Rybach, Emmanuel Guzman, Ian McGraw, James Qin, Krzysztof Choromanski, Qiao Liang, Robert David, Ruoming Pang, Shuo-Yiin Chang, Trevor Strohman, W. Ronny Huang, Wei Han, Yonghui Wu, Yu Zhang:
Improving The Latency And Quality Of Cascaded Encoders. ICASSP 2022: 8112-8116 - [c85]Ye Jia, Yifan Ding, Ankur Bapna, Colin Cherry, Yu Zhang, Alexis Conneau, Nobu Morioka:
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation. INTERSPEECH 2022: 1721-1725 - [c84]Kuan-Po Huang, Yu-Kuan Fu, Yu Zhang, Hung-yi Lee:
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation. INTERSPEECH 2022: 2193-2197 - [c83]Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Nicolás Serrano:
Reducing Domain mismatch in Self-supervised speech pre-training. INTERSPEECH 2022: 3028-3032 - [c82]Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson:
XTREME-S: Evaluating Cross-lingual Speech Representations. INTERSPEECH 2022: 3248-3252 - [c81]Zhiyun Lu, Yongqiang Wang, Yu Zhang, Wei Han, Zhehuai Chen, Parisa Haghani:
Unsupervised Data Selection via Discrete Speech Representation for ASR. INTERSPEECH 2022: 3393-3397 - [c80]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Ankur Bapna, Heiga Zen:
MAESTRO: Matched Speech Text Representations through Modality Matching. INTERSPEECH 2022: 4093-4097 - [c79]Lev Finkelstein, Heiga Zen, Norman Casagrande, Chun-an Chan, Ye Jia, Tom Kenter, Alexey Petelin, Jonathan Shen, Vincent Wan, Yu Zhang, Yonghui Wu, Rob Clark:
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks. INTERSPEECH 2022: 4571-4575 - [c78]Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman:
JOIST: A Joint Speech and Text Streaming Model for ASR. SLT 2022: 52-59 - [c77]Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging Joint Speech-Text Representation Learning for Zero Supervised Speech ASR. SLT 2022: 68-75 - [c76]Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno:
Modular Hybrid Autoregressive Transducer. SLT 2022: 197-204 - [c75]Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, Siddharth Dalmia, Jason Riesa, Clara Rivera, Ankur Bapna:
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech. SLT 2022: 798-805 - [c74]Kuan-Po Huang, Yu-Kuan Fu, Tsu-Yuan Hsu, Fabian Ritter Gutierrez, Fan-Lin Wang, Liang-Hsuan Tseng, Yu Zhang, Hung-yi Lee:
Improving Generalizability of Distilled Self-Supervised Speech Processing Models Under Distorted Settings. SLT 2022: 1112-1119 - [i79]Ankur Bapna, Colin Cherry, Yu Zhang, Ye Jia, Melvin Johnson, Yong Cheng, Simran Khanuja, Jason Riesa, Alexis Conneau:
mSLAM: Massively multilingual joint pre-training for speech and text. CoRR abs/2202.01374 (2022) - [i78]Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Pedro J. Moreno:
Ask2Mask: Guided Data Selection for Masked Speech Modeling. CoRR abs/2202.12719 (2022) - [i77]Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson:
XTREME-S: Evaluating Cross-lingual Speech Representations. CoRR abs/2203.10752 (2022) - [i76]Ye Jia, Yifan Ding, Ankur Bapna, Colin Cherry, Yu Zhang, Alexis Conneau, Nobuyuki Morioka:
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation. CoRR abs/2203.13339 (2022) - [i75]Kuan-Po Huang, Yu-Kuan Fu, Yu Zhang, Hung-yi Lee:
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation. CoRR abs/2203.16104 (2022) - [i74]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Ankur Bapna, Heiga Zen:
MAESTRO: Matched Speech Text Representations through Modality Matching. CoRR abs/2204.03409 (2022) - [i73]Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, Siddharth Dalmia, Jason Riesa, Clara Rivera, Ankur Bapna:
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech. CoRR abs/2205.12446 (2022) - [i72]Lev Finkelstein, Heiga Zen, Norman Casagrande, Chun-an Chan, Ye Jia, Tom Kenter, Alexey Petelin, Jonathan Shen, Vincent Wan, Yu Zhang, Yonghui Wu, Rob Clark:
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks. CoRR abs/2208.13183 (2022) - [i71]Dongseong Hwang, Khe Chai Sim, Yu Zhang, Trevor Strohman:
Comparison of Soft and Hard Target RNN-T Distillation for Large-scale ASR. CoRR abs/2210.05793 (2022) - [i70]Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman:
JOIST: A Joint Speech and Text Streaming Model For ASR. CoRR abs/2210.07353 (2022) - [i69]Kuan-Po Huang, Yu-Kuan Fu, Tsu-Yuan Hsu, Fabian Ritter Gutierrez, Fan-Lin Wang, Liang-Hsuan Tseng, Yu Zhang, Hung-yi Lee:
Improving generalizability of distilled self-supervised speech processing models under distorted settings. CoRR abs/2210.07978 (2022) - [i68]Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR. CoRR abs/2210.10027 (2022) - [i67]Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran:
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech. CoRR abs/2210.15447 (2022) - [i66]Nobuyuki Morioka, Heiga Zen, Nanxin Chen, Yu Zhang, Yifan Ding:
Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation. CoRR abs/2210.15868 (2022) - [i65]Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno:
Modular Hybrid Autoregressive Transducer. CoRR abs/2210.17049 (2022) - [i64]Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee:
A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition. CoRR abs/2211.01263 (2022) - [i63]Yong Cheng, Yu Zhang, Melvin Johnson, Wolfgang Macherey, Ankur Bapna:
Mu2SLAM: Multitask, Multilingual Speech and Language Models. CoRR abs/2212.09553 (2022)
- 2021
- [c73]Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu:
w2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training. ASRU 2021: 244-250 - [c72]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro J. Moreno:
Injecting Text in Self-Supervised Speech Pretraining. ASRU 2021: 251-258 - [c71]Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma, Junwen Bai:
Scaling End-to-End Models for Large-Scale Multilingual ASR. ASRU 2021: 1011-1018 - [c70]Bo Li, Anmol Gulati, Jiahui Yu, Tara N. Sainath, Chung-Cheng Chiu, Arun Narayanan, Shuo-Yiin Chang, Ruoming Pang, Yanzhang He, James Qin, Wei Han, Qiao Liang, Yu Zhang, Trevor Strohman, Yonghui Wu:
A Better and Faster end-to-end Model for Streaming ASR. ICASSP 2021: 5634-5638 - [c69]Harsh Shrivastava, Ankush Garg, Yuan Cao, Yu Zhang, Tara N. Sainath:
Echo State Speech Recognition. ICASSP 2021: 5669-5673 - [c68]Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, Ron J. Weiss, Yonghui Wu:
Parallel Tacotron: Non-Autoregressive and Controllable TTS. ICASSP 2021: 5709-5713 - [c67]Qiujia Li, David Qiu, Yu Zhang, Bo Li, Yanzhang He, Philip C. Woodland, Liangliang Cao, Trevor Strohman:
Confidence Estimation for Attention-Based Sequence-to-Sequence Models for Speech Recognition. ICASSP 2021: 6388-6392 - [c66]David Qiu, Qiujia Li, Yanzhang He, Yu Zhang, Bo Li, Liangliang Cao, Rohit Prabhavalkar, Deepti Bhatia, Wei Li, Ke Hu, Tara N. Sainath, Ian McGraw:
Learning Word-Level Confidence for Subword End-To-End ASR. ICASSP 2021: 6393-6397 - [c65]Thibault Doutre, Wei Han, Min Ma, Zhiyun Lu, Chung-Cheng Chiu, Ruoming Pang, Arun Narayanan, Ananya Misra, Yu Zhang, Liangliang Cao:
Improving Streaming Automatic Speech Recognition with Non-Streaming Model Distillation on Unsupervised Data. ICASSP 2021: 6558-6562 - [c64]Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan:
WaveGrad: Estimating Gradients for Waveform Generation. ICLR 2021 - [c63]Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, R. J. Skerry-Ryan, Yonghui Wu:
Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling. Interspeech 2021: 141-145 - [c62]Ye Jia, Heiga Zen, Jonathan Shen, Yu Zhang, Yonghui Wu:
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS. Interspeech 2021: 151-155 - [c61]Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Heiga Zen, Mohammadreza Ghodsi, Yinghui Huang, Jesse Emond, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno:
Semi-Supervision in ASR: Sequential MixMatch and Factorized TTS-Based Augmentation. Interspeech 2021: 736-740 - [c60]Zhiyun Lu, Wei Han, Yu Zhang, Liangliang Cao:
Exploring Targeted Universal Adversarial Perturbations to End-to-End ASR Models. Interspeech 2021: 3460-3464 - [c59]Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan:
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis. Interspeech 2021: 3765-3769 - [c58]Qiujia Li, Yu Zhang, Bo Li, Liangliang Cao, Philip C. Woodland:
Residual Energy-Based Models for End-to-End Speech Recognition. Interspeech 2021: 4069-4073 - [c57]David Qiu, Yanzhang He, Qiujia Li, Yu Zhang, Liangliang Cao, Ian McGraw:
Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction. Interspeech 2021: 4074-4078 - [c56]Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita:
Unsupervised Learning of Disentangled Speech Content and Style Representation. Interspeech 2021: 4089-4093 - [c55]Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu:
RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions. SLT 2021: 873-880 - [i62]Harsh Shrivastava, Ankush Garg, Yuan Cao, Yu Zhang, Tara N. Sainath:
Echo State Speech Recognition. CoRR abs/2102.09114 (2021) - [i61]David Qiu, Qiujia Li, Yanzhang He, Yu Zhang, Bo Li, Liangliang Cao, Rohit Prabhavalkar, Deepti Bhatia, Wei Li, Ke Hu, Tara N. Sainath, Ian McGraw:
Learning Word-Level Confidence For Subword End-to-End ASR. CoRR abs/2103.06716 (2021) - [i60]Qiujia Li, Yu Zhang, Bo Li, Liangliang Cao, Philip C. Woodland:
Residual Energy-Based Models for End-to-End Speech Recognition. CoRR abs/2103.14152 (2021) - [i59]Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, R. J. Skerry-Ryan, Yonghui Wu:
Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling. CoRR abs/2103.14574 (2021) - [i58]Ye Jia, Heiga Zen, Jonathan Shen, Yu Zhang, Yonghui Wu:
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS. CoRR abs/2103.15060 (2021) - [i57]William Chan, Daniel S. Park, Chris A. Lee, Yu Zhang, Quoc V. Le, Mohammad Norouzi:
SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network. CoRR abs/2104.02133 (2021) - [i56]Zhiyun Lu, Wei Han, Yu Zhang, Liangliang Cao:
Exploring Targeted Universal Adversarial Perturbations to End-to-end ASR Models. CoRR abs/2104.02757 (2021) - [i55]David Qiu, Yanzhang He, Qiujia Li, Yu Zhang, Liangliang Cao, Ian McGraw:
Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction. CoRR abs/2104.12870 (2021) - [i54]Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma:
Scaling End-to-End Models for Large-Scale Multilingual ASR. CoRR abs/2104.14830 (2021) - [i53]Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan:
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis. CoRR abs/2106.09660 (2021) - [i52]Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu:
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training. CoRR abs/2108.06209 (2021) - [i51]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro J. Moreno:
Injecting Text in Self-Supervised Speech Pretraining. CoRR abs/2108.12226 (2021) - [i50]Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu:
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition. CoRR abs/2109.13226 (2021) - [i49]Qiujia Li, Yu Zhang, David Qiu, Yanzhang He, Liangliang Cao, Philip C. Woodland:
Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition. CoRR abs/2110.03327 (2021) - [i48]Joel Shor, Aren Jansen, Wei Han, Daniel S. Park, Yu Zhang:
Universal Paralinguistic Speech Representations Using Self-Supervised Conformers. CoRR abs/2110.04621 (2021) - [i47]Ankur Bapna, Yu-An Chung, Nan Wu, Anmol Gulati, Ye Jia, Jonathan H. Clark, Melvin Johnson, Jason Riesa, Alexis Conneau, Yu Zhang:
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training. CoRR abs/2110.10329 (2021) - [i46]Junwen Bai, Bo Li, Yu Zhang, Ankur Bapna, Nikhil Siddhartha, Khe Chai Sim, Tara N. Sainath:
Joint Unsupervised and Supervised Training for Multilingual ASR. CoRR abs/2111.08137 (2021)
- 2020
- [c54]Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, Vijay Vasudevan, Wei Han, Jiquan Ngiam, Hang Zhao, Aleksei Timofeev, Scott Ettinger, Maxim Krivokon, Amy Gao, Aditya Joshi, Yu Zhang, Jonathon Shlens, Zhifeng Chen, Dragomir Anguelov:
Scalability in Perception for Autonomous Driving: Waymo Open Dataset. CVPR 2020: 2443-2451 - [c53]Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alexander Gruenstein, Ke Hu, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirkó Visontai, Yonghui Wu, Yu Zhang, Ding Zhao:
A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency. ICASSP 2020: 6059-6063 - [c52]Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu:
Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis. ICASSP 2020: 6264-6268 - [c51]Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior. ICASSP 2020: 6699-6703 - [c50]Daniel S. Park, Yu Zhang, Chung-Cheng Chiu, Youzheng Chen, Bo Li, William Chan, Quoc V. Le, Yonghui Wu:
SpecAugment on Large Scale Datasets. ICASSP 2020: 6879-6883 - [c49]Gary Wang, Andrew Rosenberg, Zhehuai Chen, Yu Zhang, Bhuvana Ramabhadran, Yonghui Wu, Pedro J. Moreno:
Improving Speech Recognition Using Consistent Predictions on Synthesized Speech. ICASSP 2020: 7029-7033 - [c48]Zhiyun Lu, Liangliang Cao, Yu Zhang, Chung-Cheng Chiu, James Fan:
Speech Sentiment Analysis via Pre-Trained Features from End-to-End ASR Models. ICASSP 2020: 7149-7153 - [c47]Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan:
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit. ICASSP 2020: 7654-7658 - [c46]Zelin Wu, Bo Li, Yu Zhang, Petar S. Aleksic, Tara N. Sainath:
Multistate Encoding with End-To-End Speech RNN Transducer Network. ICASSP 2020: 7819-7823 - [c45]Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno:
Improving Speech Recognition Using GAN-Based Speech Synthesis and Contrastive Unspoken Text Selection. INTERSPEECH 2020: 556-560 - [c44]Daniel S. Park, Yu Zhang, Ye Jia, Wei Han, Chung-Cheng Chiu, Bo Li, Yonghui Wu, Quoc V. Le:
Improved Noisy Student Training for Automatic Speech Recognition. INTERSPEECH 2020: 2817-2821 - [c43]Gary Wang, Andrew Rosenberg, Zhehuai Chen, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno:
SCADA: Stochastic, Consistent and Adversarial Data Augmentation to Improve ASR. INTERSPEECH 2020: 2832-2836 - [c42]Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu:
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context. INTERSPEECH 2020: 3610-3614 - [c41]Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang:
Conformer: Convolution-augmented Transformer for Speech Recognition. INTERSPEECH 2020: 5036-5040 - [c40]Eric Chen, Zhiyun Lu, Hao Xu, Liangliang Cao, Yu Zhang, James Fan:
A Large Scale Speech Sentiment Corpus. LREC 2020: 6549-6555 - [i45]Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu:
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis. CoRR abs/2002.03785 (2020) - [i44]Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior. CoRR abs/2002.03788 (2020) - [i43]Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alexander Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirkó Visontai, Yonghui Wu, Yu Zhang, Ding Zhao:
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency. CoRR abs/2003.12710 (2020) - [i42]Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu:
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context. CoRR abs/2005.03191 (2020) - [i41]Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu:
RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions. CoRR abs/2005.03271 (2020) - [i40]Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang:
Conformer: Convolution-augmented Transformer for Speech Recognition. CoRR abs/2005.08100 (2020) - [i39]Daniel S. Park, Yu Zhang, Ye Jia, Wei Han, Chung-Cheng Chiu, Bo Li, Yonghui Wu, Quoc V. Le:
Improved Noisy Student Training for Automatic Speech Recognition. CoRR abs/2005.09629 (2020) - [i38]Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan:
WaveGrad: Estimating Gradients for Waveform Generation. CoRR abs/2009.00713 (2020) - [i37]Jonathan Shen, Ye Jia, Mike Chrzanowski, Yu Zhang, Isaac Elias, Heiga Zen, Yonghui Wu:
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling. CoRR abs/2010.04301 (2020) - [i36]Yu Zhang, James Qin, Daniel S. Park, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Quoc V. Le, Yonghui Wu:
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition. CoRR abs/2010.10504 (2020) - [i35]Qiujia Li, David Qiu, Yu Zhang, Bo Li, Yanzhang He, Philip C. Woodland, Liangliang Cao, Trevor Strohman:
Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition. CoRR abs/2010.11428 (2020) - [i34]Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, Ron J. Weiss, Yonghui Wu:
Parallel Tacotron: Non-Autoregressive and Controllable TTS. CoRR abs/2010.11439 (2020) - [i33]Thibault Doutre, Wei Han, Min Ma, Zhiyun Lu, Chung-Cheng Chiu, Ruoming Pang, Arun Narayanan, Ananya Misra, Yu Zhang, Liangliang Cao:
Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data. CoRR abs/2010.12096 (2020) - [i32]Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita:
Unsupervised Learning of Disentangled Speech Content and Style Representation. CoRR abs/2010.12973 (2020) - [i31]Bo Li, Anmol Gulati, Jiahui Yu, Tara N. Sainath, Chung-Cheng Chiu, Arun Narayanan, Shuo-Yiin Chang, Ruoming Pang, Yanzhang He, James Qin, Wei Han, Qiao Liang, Yu Zhang, Trevor Strohman, Yonghui Wu:
A Better and Faster End-to-End Model for Streaming ASR. CoRR abs/2011.10798 (2020)
2010 – 2019
- 2019
- [c39]Chung-Cheng Chiu, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara N. Sainath, Yonghui Wu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang:
A Comparison of End-to-End Models for Long-Form Speech Recognition. ASRU 2019: 889-896 - [c38]Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro J. Moreno, Yonghui Wu, Zelin Wu:
Speech Recognition with Augmented Synthesized Speech. ASRU 2019: 996-1002 - [c37]Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Tom Ouyang, James Guo, Jiquan Ngiam, Vijay Vasudevan:
End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds. CoRL 2019: 923-932 - [c36]Bo Li, Yu Zhang, Tara N. Sainath, Yonghui Wu, William Chan:
Bytes Are All You Need: End-to-end Multilingual Speech Recognition and Synthesis with Bytes. ICASSP 2019: 5621-5625 - [c35]Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Yu-An Chung, Yuxuan Wang, Yonghui Wu, James R. Glass:
Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization. ICASSP 2019: 5901-5905 - [c34]Takaaki Hori, Ramón Fernandez Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, Jonathan Le Roux:
Cycle-consistency Training for End-to-end Speech Recognition. ICASSP 2019: 6271-6275 - [c33]Yu-An Chung, Yuxuan Wang, Wei-Ning Hsu, Yu Zhang, R. J. Skerry-Ryan:
Semi-supervised Training for Improving Data Efficiency in End-to-end Speech Synthesis. ICASSP 2019: 6940-6944 - [c32]Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang:
Hierarchical Generative Modeling for Controllable Speech Synthesis. ICLR (Poster) 2019 - [c31]Heiga Zen, Viet Dang, Rob Clark, Yu Zhang, Ron J. Weiss, Ye Jia, Zhifeng Chen, Yonghui Wu:
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech. INTERSPEECH 2019: 1526-1530 - [c30]Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, R. J. Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran:
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning. INTERSPEECH 2019: 2080-2084 - [c29]Daniel S. Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D. Cubuk, Quoc V. Le:
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition. INTERSPEECH 2019: 2613-2617 - [i30]Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia Xu Chen, Ye Jia, Anjuli Kannan, Tara N. Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George F. Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel Bacchiani, Thomas B. Jablin, Robert Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon:
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling. CoRR abs/1902.08295 (2019) - [i29]Heiga Zen, Viet Dang, Rob Clark, Yu Zhang, Ron J. Weiss, Ye Jia, Zhifeng Chen, Yonghui Wu:
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech. CoRR abs/1904.02882 (2019) - [i28]Daniel S. Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D. Cubuk, Quoc V. Le:
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition. CoRR abs/1904.08779 (2019) - [i27]Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, R. J. Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran:
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning. CoRR abs/1907.04448 (2019) - [i26]Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro J. Moreno, Yonghui Wu, Zelin Wu:
Speech Recognition with Augmented Synthesized Speech. CoRR abs/1909.11699 (2019) - [i25]Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Tom Ouyang, James Guo, Jiquan Ngiam, Vijay Vasudevan:
End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds. CoRR abs/1910.06528 (2019) - [i24]Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan:
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit. CoRR abs/1910.10909 (2019) - [i23]Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Héctor Delgado, Andreas Nautsch, Nicholas W. D. Evans, Md. Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sébastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-François Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang, Zhen-Hua Ling:
The ASVspoof 2019 database. CoRR abs/1911.01601 (2019) - [i22]Chung-Cheng Chiu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara N. Sainath, Yonghui Wu:
A comparison of end-to-end models for long-form speech recognition. CoRR abs/1911.02242 (2019) - [i21]Zhiyun Lu, Liangliang Cao, Yu Zhang, Chung-Cheng Chiu, James Fan:
Speech Sentiment Analysis via Pre-trained Features from End-to-end ASR Models. CoRR abs/1911.09762 (2019) - [i20]Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, Vijay Vasudevan, Wei Han, Jiquan Ngiam, Hang Zhao, Aleksei Timofeev, Scott Ettinger, Maxim Krivokon, Amy Gao, Aditya Joshi, Yu Zhang, Jonathon Shlens, Zhifeng Chen, Dragomir Anguelov:
Scalability in Perception for Autonomous Driving: Waymo Open Dataset. CoRR abs/1912.04838 (2019) - [i19]Daniel S. Park, Yu Zhang, Chung-Cheng Chiu, Youzheng Chen, Bo Li, William Chan, Quoc V. Le, Yonghui Wu:
SpecAugment on Large Scale Datasets. CoRR abs/1912.05533 (2019)
- 2018
- [c28]Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, R. J. Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu:
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. ICASSP 2018: 4779-4783 - [c27]Yuxuan Wang, Daisy Stanton, Yu Zhang, R. J. Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Ye Jia, Fei Ren, Rif A. Saurous:
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ICML 2018: 5167-5176 - [c26]Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio López-Moreno, Yonghui Wu:
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis. NeurIPS 2018: 4485-4495 - [c25]Tomoki Hayashi, Shinji Watanabe, Yu Zhang, Tomoki Toda, Takaaki Hori, Ramón Fernandez Astudillo, Kazuya Takeda:
Back-Translation-Style Data Augmentation for end-to-end ASR. SLT 2018: 426-433 - [i18]Yuxuan Wang, Daisy Stanton, Yu Zhang, R. J. Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Ren, Ye Jia, Rif A. Saurous:
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. CoRR abs/1803.09017 (2018) - [i17]Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio López-Moreno, Yonghui Wu:
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis. CoRR abs/1806.04558 (2018) - [i16]Tomoki Hayashi, Shinji Watanabe, Yu Zhang, Tomoki Toda, Takaaki Hori, Ramón Fernandez Astudillo, Kazuya Takeda:
Back-Translation-Style Data Augmentation for End-to-End ASR. CoRR abs/1807.10893 (2018) - [i15]Yu-An Chung, Yuxuan Wang, Wei-Ning Hsu, Yu Zhang, R. J. Skerry-Ryan:
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis. CoRR abs/1808.10128 (2018) - [i14]Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang:
Hierarchical Generative Modeling for Controllable Speech Synthesis. CoRR abs/1810.07217 (2018) - [i13]Takaaki Hori, Ramón Fernandez Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, Jonathan Le Roux:
Cycle-consistency training for end-to-end speech recognition. CoRR abs/1811.01690 (2018) - [i12]Bo Li, Yu Zhang, Tara N. Sainath, Yonghui Wu, William Chan:
Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes. CoRR abs/1811.09021 (2018)
- 2017
- [b1]Yu Zhang:
Exploring neural network architectures for acoustic modeling. Massachusetts Institute of Technology, Cambridge, USA, 2017
- [c24]Wei-Ning Hsu, Yu Zhang, James R. Glass:
Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation. ASRU 2017: 16-23
- [c23]William Chan, Yu Zhang, Quoc V. Le, Navdeep Jaitly:
Latent Sequence Decompositions. ICLR (Poster) 2017
- [c22]Takaaki Hori, Shinji Watanabe, Yu Zhang, William Chan:
Advances in Joint CTC-Attention Based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM. INTERSPEECH 2017: 949-953
- [c21]Wei-Ning Hsu, Yu Zhang, James R. Glass:
Learning Latent Representations for Speech Generation and Transformation. INTERSPEECH 2017: 1273-1277
- [c20]Wei-Ning Hsu, Yu Zhang, James R. Glass:
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data. NIPS 2017: 1878-1889
- [p3]Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Michael I. Mandel, Liang Lu, John R. Hershey, Michael L. Seltzer, Guoguo Chen, Yu Zhang, Dong Yu:
Discriminative Beamforming with Phase-Aware Neural Networks for Speech Enhancement and Recognition. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 79-104
- [p2]Yu Zhang, Dong Yu, Guoguo Chen:
Advanced Recurrent Neural Networks for Automatic Speech Recognition. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 261-279
- [p1]Guoguo Chen, Yu Zhang, Dong Yu:
Sequence-Discriminative Training of Neural Networks. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 281-297
- [i11]Wei-Ning Hsu, Yu Zhang, James R. Glass:
Learning Latent Representations for Speech Generation and Transformation. CoRR abs/1704.04222 (2017)
- [i10]Takaaki Hori, Shinji Watanabe, Yu Zhang, William Chan:
Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM. CoRR abs/1706.02737 (2017)
- [i9]Wei-Ning Hsu, Yu Zhang, James R. Glass:
Unsupervised Domain Adaptation for Robust Speech Recognition via Variational Autoencoder-Based Data Augmentation. CoRR abs/1707.06265 (2017)
- [i8]Tao Lei, Yu Zhang, Yoav Artzi:
Training RNNs as Fast as CNNs. CoRR abs/1709.02755 (2017)
- [i7]Wei-Ning Hsu, Yu Zhang, James R. Glass:
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data. CoRR abs/1709.07902 (2017)
- [i6]Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, R. J. Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu:
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. CoRR abs/1712.05884 (2017)
- 2016
- [c19]Salvatore Romeo, Giovanni Da San Martino, Alberto Barrón-Cedeño, Alessandro Moschitti, Yonatan Belinkov, Wei-Ning Hsu, Yu Zhang, Mitra Mohtarami, James R. Glass:
Neural Attention for Learning to Rank Questions in Community Question Answering. COLING 2016: 1734-1745
- [c18]Tian Tan, Yanmin Qian, Dong Yu, Souvik Kundu, Liang Lu, Khe Chai Sim, Xiong Xiao, Yu Zhang:
Speaker-aware training of LSTM-RNNS for acoustic modelling. ICASSP 2016: 5280-5284
- [c17]Ekapol Chuangsuwanich, Yu Zhang, James R. Glass:
Multilingual data selection for training stacked bottleneck features. ICASSP 2016: 5410-5414
- [c16]Yu Zhang, Ekapol Chuangsuwanich, James R. Glass, Dong Yu:
Prediction-adaptation-correction recurrent neural networks for low-resource language speech recognition. ICASSP 2016: 5415-5419
- [c15]Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Liang Lu, John R. Hershey, Michael L. Seltzer, Guoguo Chen, Yu Zhang, Michael I. Mandel, Dong Yu:
Deep beamforming networks for multi-channel speech recognition. ICASSP 2016: 5745-5749
- [c14]Yu Zhang, Guoguo Chen, Dong Yu, Kaisheng Yao, Sanjeev Khudanpur, James R. Glass:
Highway long short-term memory RNNS for distant speech recognition. ICASSP 2016: 5755-5759
- [c13]Yanmin Qian, Tian Tan, Dong Yu, Yu Zhang:
Integrated adaptation with multi-factor joint-learning for far-field speech recognition. ICASSP 2016: 5770-5774
- [c12]Wei-Ning Hsu, Yu Zhang, Ann Lee, James R. Glass:
Exploiting Depth and Highway Connections in Convolutional Recurrent Deep Neural Networks for Speech Recognition. INTERSPEECH 2016: 395-399
- [c11]Tianxing He, Yu Zhang, Jasha Droppo, Kai Yu:
On training bi-directional neural network language model with noise contrastive estimation. ISCSLP 2016: 1-5
- [c10]Mitra Mohtarami, Yonatan Belinkov, Wei-Ning Hsu, Yu Zhang, Tao Lei, Kfir Bar, Scott Cyphers, James R. Glass:
SLS at SemEval-2016 Task 3: Neural-based Approaches for Ranking in Community Question Answering. SemEval@NAACL-HLT 2016: 828-835
- [c9]Wei-Ning Hsu, Yu Zhang, James R. Glass:
A prioritized grid long short-term memory RNN for speech recognition. SLT 2016: 467-473
- [i5]Tianxing He, Yu Zhang, Jasha Droppo, Kai Yu:
On Training Bi-directional Neural Network Language Model with Noise Contrastive Estimation. CoRR abs/1602.06064 (2016)
- [i4]Wei-Ning Hsu, Yu Zhang, James R. Glass:
Recurrent Neural Network Encoder with Attention for Community Question Answering. CoRR abs/1603.07044 (2016)
- [i3]William Chan, Yu Zhang, Quoc V. Le, Navdeep Jaitly:
Latent Sequence Decompositions. CoRR abs/1610.03035 (2016)
- 2015
- [j1]Dong Yu, Kaisheng Yao, Yu Zhang:
The Computational Network Toolkit [Best of the Web]. IEEE Signal Process. Mag. 32(6): 123-126 (2015)
- [c8]Yu Zhang, Dong Yu, Michael L. Seltzer, Jasha Droppo:
Speech recognition with prediction-adaptation-correction recurrent neural networks. ICASSP 2015: 5004-5008
- [c7]Patrick Cardinal, Najim Dehak, Yu Zhang, James R. Glass:
Speaker adaptation using the i-vector technique for bottleneck features. INTERSPEECH 2015: 2867-2871
- [i2]Yu Zhang, Guoguo Chen, Dong Yu, Kaisheng Yao, Sanjeev Khudanpur, James R. Glass:
Highway Long Short-Term Memory RNNs for Distant Speech Recognition. CoRR abs/1510.08983 (2015)
- [i1]Yu Zhang, Ekapol Chuangsuwanich, James R. Glass, Dong Yu:
Prediction-Adaptation-Correction Recurrent Neural Networks for Low-Resource Language Speech Recognition. CoRR abs/1510.08985 (2015)
- 2014
- [c6]Yu Zhang, Ekapol Chuangsuwanich, James R. Glass:
Extracting deep neural network bottleneck features using low-rank matrix factorization. ICASSP 2014: 185-189
- [c5]Anne Cutler, Yu Zhang, Ekapol Chuangsuwanich, James R. Glass:
Language ID-based training of multilingual stacked bottleneck features. INTERSPEECH 2014: 1-5
- [c4]Patrick Cardinal, Ahmed Ali, Najim Dehak, Yu Zhang, Tuka Al Hanai, Yifan Zhang, James R. Glass, Stephan Vogel:
Recent advances in ASR applied to an Arabic transcription system for Al-Jazeera. INTERSPEECH 2014: 2088-2092
- [c3]Hung-yi Lee, Yu Zhang, Ekapol Chuangsuwanich, James R. Glass:
Graph-based re-ranking using acoustic feature similarity between search results for spoken term detection on low-resource languages. INTERSPEECH 2014: 2479-2483
- [c2]Kaisheng Yao, Baolin Peng, Yu Zhang, Dong Yu, Geoffrey Zweig, Yangyang Shi:
Spoken language understanding using long short-term memory neural networks. SLT 2014: 189-194
- 2013
- [c1]Chia-ying Lee, Yu Zhang, James R. Glass:
Joint Learning of Phonetic Units and Word Pronunciations for ASR. EMNLP 2013: 182-192
last updated on 2025-02-02 22:31 CET by the dblp team
all metadata released as open data under CC0 1.0 license