Yu Zhang 0033

Person information

affiliation: Google
affiliation (PhD 2017): Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA

Other persons with the same name

Yu Zhang — disambiguation page
Yu Zhang 0001 — Loughborough University, Department of Aeronautical and Automotive Engineering, UK (and 2 more)
Yu Zhang 0002 — Pennsylvania State University, University Park, PA, USA (and 2 more)
Yu Zhang 0003 — Hainan Normal University, Haikou, China
Yu Zhang 0004 — Southeast University, School of Computer Science and Engineering, Nanjing, China (and 1 more)
Yu Zhang 0005 — University of California, Santa Cruz, CA, USA (and 2 more)
Yu Zhang 0006 — Southern University of Science and Technology, Shenzhen, China (and 3 more)
Yu Zhang 0007 — Microsoft Research Asia, China
Yu Zhang 0008 — Zhejiang University, College of Computer Science, Hangzhou, China
Yu Zhang 0009 — Lehigh University, Department of Bioengineering, Bethlehem, PA, USA (and 2 more)

Yu Zhang 0010 — Northwestern University, Department of Chemistry & Center for Bio-inspired Energy Science (CBES), Evanston, IL, USA
Yu Zhang 0011 — South China Normal University, Guangzhou, China
Yu Zhang 0012 — Southeast University, National Mobile Communications Research Laboratory, Nanjing, China
Yu Zhang 0013 — Pennsylvania State University, College of Information Sciences and Technology, State College, PA, USA
Yu Zhang 0014 — Xidian University, State Key Laboratory of Integrated Services Networks, Xi'an, China
Yu Zhang 0015 — Zhejiang University of Technology, College of Information Engineering, Hangzhou, China
Yu Zhang 0016 — Jilin University, College of Computer Science and Technology, Changchun, China
Yu Zhang 0017 — Sichuan University, West China Hospital, Department of Radiology, Chengdu, China
Yu Zhang 0018 — Zhejiang University, College of Control Science and Engineering, Hangzhou, China (and 3 more)
Yu Zhang 0019 — Wuhan University, Electronic and Information School, Signal Processing Lab, China
Yu Zhang 0020 — Tongji University, School of Mathematics, Shanghai, China
Yu Zhang 0021 — Guangdong Peizheng College, English Department, Guangzhou, China
Yu Zhang 0022 — University of California Santa Barbara, Department of Education, CA, USA
Yu Zhang 0023 — Hubei University for Nationalities, School of Information Engineering, Enshi, China
Yu Zhang 0024 — Northeastern University, Software College, Shenyang, China
Yu Zhang 0025 — Microsoft, Online Service Division, Sunnyvale, CA, USA (and 1 more)
Yu Zhang 0026 — Tsinghua University, Department of Electronic Engineering, Beijing, China
Yu Zhang 0027 — Huazhong University of Science and Technology, School of Computer Science and Technology, Service Computing Technology and System Lab, Cluster and Grid Computing Lab, Wuhan, China
Yu Zhang 0028 — Data Storage Institute, A-STAR, Singapore (and 1 more)
Yu Zhang 0029 — Chinese Academy of Sciences, Northwest Institute of Eco-Environment and Resources, Lanzhou, China
Yu Zhang 0030 — Harbin Institute of Technology, Research Center for Social Computing and Information Retrieval, China
Yu Zhang 0031 — Xinyang Normal University, Department of Computer Science and Technology, China
Yu Zhang 0032 — Shanghai Jiao Tong University, School of Electronic Information and Electrical Engineering, China
Yu Zhang 0034 — RMIT University, Melbourne, VIC, Australia (and 1 more)
Yu Zhang 0035 — SenseTime Group Limited, Beijing, China (and 1 more)
Yu Zhang 0036 — Harbin Institute of Technology, School of Computer Science and Technology, China
Yu Zhang 0037 — Beijing Institute of Technology, School of Mechanical Engineering, China
Yu Zhang 0038 — Harbin Engineering University, College of Information and Communication Engineering, China
Yu Zhang 0039 — Huazhong University of Science and Technology, School of Electronic Information and Communications, Wuhan, China
Yu Zhang 0040 — Shaanxi Normal University, MOE Key Laboratory of Modern Teaching Technology, Xi'an, China
Yu Zhang 0041 — University of North Carolina at Chapel Hill, NC, USA
Yu Zhang 0042 — State Grid Energy Research Institute Co., Ltd., Beijing, China (and 2 more)
Yu Zhang 0043 — University of Oxford, Department of Computer Science, UK (and 1 more)
Yu Zhang 0044 — Texas A&M University, Department of Computer Science & Engineering, TX, USA (and 1 more)
Yu Zhang 0045 — Harbin Institute of Technology, State Key Laboratory of Robotics and System, Harbin, China
Yu Zhang 0046 — Xiamen University, Key Laboratory of Underwater Acoustic Communication and Marine Information Technology, Xiamen, China
Yu Zhang 0047 — Tsinghua University, Department of Automation, Beijing, China
Yu Zhang 0048 — Northeast Electric Power University, College of Information Engineering, Jilin, China
Yu Zhang 0049 — Macquarie University, Australian School of Advanced Medicine, Sydney, Australia
Yu Zhang 0050 — Tsinghua University, Department of Electronic Engineering, Beijing, China (and 1 more)
Yu Zhang 0051 — Chemnitz University of Technology, Germany
Yu Zhang 0052 — École centrale de Lyon, France
Yu Zhang 0053 — Cardiff University, UK
Yu Zhang 0054 — Qualcomm Inc., Beijing, China (and 1 more)
Yu Zhang 0055 — Arizona State University, Computer Science and Engineering Department, Tempe, AZ, USA (and 1 more)
Yu Zhang 0056 — Anhui University, Information Materials and Intelligent Sensing Laboratory of Anhui Province, Hefei, China (and 1 more)
Yu Zhang 0057 — Air Force Engineering University, Aeronautics and Astronautics Engineering College, Xi'an, China
Yu Zhang 0058 — Chongqing University, College of Communication Engineering, China
Yu Zhang 0059 — University of California, Los Angeles, USA
Yu Zhang 0060 — National University of Defense Technology, Key Laboratory of Science and Technology on ATR, Changsha, China
Yu Zhang 0061 — University of Tokyo, Graduate School of Agricultural and Life Sciences, Japan
Yu Zhang 0062 — Xidian University, Video and Image Processing System Laboratory, China
Yu Zhang 0063 — Chinese University of Hong Kong, Electronic Engineering Department
Yu Zhang 0064 — Southern Medical University, School of Biomedical Engineering, Guangzhou, China
Yu Zhang 0065 — Yanshan University, Institute of Electrical Engineering, Qinhuangdao, China
Yu Zhang 0066 — China South Industries Group Corporation, Weapon Equipment Research Institute, Beijing, China
Yu Zhang 0067 — Liaoning Technical University, School of Science, Fuxin, China
Yu Zhang 0068 — Nanjing University of Aeronautics and Astronautics, Key Laboratory of Radar Imaging and Microwave Photonics, Nanjing, China
Yu Zhang 0069 — Chinese Academy of Sciences and Ministry of Water Resources, Institute of Soil and Water Conservation, Yangling, China
Yu Zhang 0070 — Shanghai Ocean University, College of Marine Sciences, China
Yu Zhang 0071 — Hainan University, State Key Laboratory of Marine Resource Utilization in South China Sea, Haikou, China
Yu Zhang 0072 — Fudan University, Shanghai Key Laboratory of Intelligent Information Processing, China
Yu Zhang 0073 — Northeastern University, Department of Systems Engineering, State Key Lab of Synthetic Automation of Process Industries, Shenyang, China
Yu Zhang 0074 — Pennsylvania State University, Department of Civil and Environmental Engineering, University Park, USA
Yu Zhang 0075 — Institute of High Performance Computing, Singapore (and 1 more)
Yu Zhang 0076 — Southwest Jiaotong University, School of Physical Science and Technology, Chengdu, China
Yu Zhang 0077 — Hangzhou Dianzi University, School of Electronics and Information, China (and 1 more)
Yu Zhang 0078 — Tianjin University, School of Precision Instrument and Opto-electronics Engineering, State Key Laboratory of Precision Measuring Technology and Instruments, China (and 1 more)
Yu Zhang 0079 — Beijing Institute of Technology, School of Information and Electronics, China (and 1 more)
Yu Zhang 0080 — Chinese Academy of Space Technology, Beijing Orient Institute of Measurement and Test, China (and 1 more)
Yu Zhang 0081 — University of Science and Technology Beijing, Donlinks School of Economics and Management, China (and 1 more)
Yu Zhang 0082 — Army Engineering University of PLA, College of Communication Engineering, Nanjing, China
Yu Zhang 0083 — Tsinghua University, Institute of Education, Beijing, China (and 1 more)
Yu Zhang 0084 — Nanyang Technological University, School of Computer Science and Engineering, Singapore
Yu Zhang 0085 — Southwest University, College of Computer and Information Science, Chongqing, China (and 1 more)
Yu Zhang 0086 — University of Science and Technology of China, Lab for Intelligent Networking and Knowledge Engineering, Hefei, China
Yu Zhang 0087 — University of South Florida, Department of Civil and Environmental Engineering, Tampa, FL, USA (and 1 more)
Yu Zhang 0088 — Delphi Automotive, Agoura Hills, CA, USA (and 1 more)
Yu Zhang 0089 — Jiangsu Normal University, School of Geography, Geomatics and Planning, Department of Land Resource Management, Xuzhou, China (and 1 more)
Yu Zhang 0090 — Beijing Normal University, College of Artificial Intelligence, China
Yu Zhang 0091 — Harbin Institute of Technology, School of Astronautics, National Key Laboratory of Tunable Laser Technology, China
Yu Zhang 0092 — Soochow University, China
Yu Zhang 0093 — Macquarie University, Sydney, Australia (and 1 more)
Yu Zhang 0094 — University of Kentucky, Department of Computer Science, Lexington, KY, USA
Yu Zhang 0095 — Nankai University, College of Computer Science, Tianjin Key Laboratory of Network and Data Security Technology, Tianjin, China (and 1 more)
Yu Zhang 0096 (aka: Fiona Zhang) — Hong Kong Polytechnic University, Department of Mechanical Engineering, Hong Kong (and 2 more)
Yu Zhang 0097 — City University of Hong Kong, Hong Kong (and 2 more)
Yu Zhang 0098 — Fujian Jiangxia University, Department of Electronic Information Science, Fuzhou, China
Yu Zhang 0099 — University of California Davis, Department of Electrical and Computer Engineering, CA, USA
Yu Zhang 0100 — Chongqing University of Posts and Telecommunications, School of Economics and Management, China
Yu Zhang 0101 — Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, China
Yu Zhang 0102 — Chinese Academy of Sciences, Institute of Microelectronics, Beijing, China (and 1 more)
Yu Zhang 0103 — Nanjing University of Aeronautics and Astronautics, Department of Mathematics, State Key Laboratory of Mechanics and Control of Mechanical Structures, China
Yu Zhang 0104 — Chinese Academy of Sciences, Key Laboratory of Ecosystem Network Observation and Modeling, Beijing, China (and 1 more)
Yu Zhang 0105 — Information Engineering University Zhengzhou, College of Cryptographic Engineering, China
Yu Zhang 0106 — Changchun Institute of Technology, College of Computer Science and Engineering, China
Yu Zhang 0107 — Hunan University, College of Computer Science and Electronic Engineering, Changsha, China
Yu Zhang 0108 — East China University of Science and Technology, MOE Key Laboratory of Advanced Control and Optimization for Chemical Process, Shanghai, China (and 1 more)
Yu Zhang 0109 — Tongji University, MOE Key Laboratory of Road and Traffic Engineering, Shanghai, China
Yu Zhang 0110 — Jilin University, College of Electronic Science and Engineering, State Key Laboratory of Integrated Optoelectronics, Changchun, China
Yu Zhang 0111 — Southern University of Science and Technology, Department of Computer Science and Engineering, Guangdong Key Laboratory of Brain-Inspired Intelligent Computation, Shenzhen, China
Yu Zhang 0112 — East China Normal University, Department of Computer Science and Technology, Shanghai, China
Yu Zhang 0113 — Dalian Medical University, Second Affiliated Hospital, China
Yu Zhang 0114 — Catalonia Institute for Energy Research (IREC), Spain
Yu Zhang 0115 — University of Montreal, Department of Psychology, QC, Canada
Yu Zhang 0116 — Shenyang University of Technology, Department of Mechanical Engineering, China
Yu Zhang 0117 — Beijing University of Posts and Telecommunications, State Key Laboratory of Networking and Switching Technology, China (and 1 more)
Yu Zhang 0118 — Liaoning Technical University, School of Software, Fuxin, China
Yu Zhang 0119 — Shenzhen University, College of Life Sciences and Oceanography, Guangdong Engineering Research Center for Marine Algal Biotechnology, China
Yu Zhang 0120 — Shandong Normal University, School of Information Science and Engineering, China
Yu Zhang 0121 — Zhejiang Normal University, Institute of Precision Machinery and Smart Structure, College of Engineering, Jinhua, China
Yu Zhang 0122 — China Academy of Information and Communications Technology, Beijing, China
Yu Zhang 0123 — Arizona State University, School of Electrical, Computer and Energy Engineering, Tempe, AZ, USA
Yu Zhang 0124 — Southeast University, School of Computer Science and Engineering, Nanjing, China (and 2 more)
Yu Zhang 0125 — University of Macau, Faculty of Sciences and Technology, China (and 2 more)
Yu Zhang 0126 — Zhejiang University, Hangzhou, China
Yu Zhang 0127 — University of Münster, Germany
Yu Zhang 0128 — University of Sheffield, Department of Computer Science, UK
Yu Zhang 0129 — Nanjing Agricultural University, National Engineering and Technology Center for Information Agriculture, China
Yu Zhang 0130 — Chinese Academy of Sciences, Computer Network Information Center, Beijing, China
Yu Zhang 0131 — China University of Mining and Technology, Internet of Things (Perception Mine) Research Center, Xuzhou, China
Yu Zhang 0132 — Ohio State University, School of Earth Sciences, Columbus, OH, USA
Yu Zhang 0133 — Tongji University, Shanghai, China (and 2 more)
Yu Zhang 0134 — Southeast University, Department of Construction and Real Estate, Nanjing, China
Yu Zhang 0135 — Southeast University, School of Instrumentation Science and Engineering, Nanjing, China (and 1 more)
Yu Zhang 0136 — Harbin Institute of Technology, School of Architecture, China
Yu Zhang 0137 — China University of Petroleum (East China), College of Computer Science and Technology, Qingdao, China
Yu Zhang 0138 — Zhengzhou Normal University, School of Information Science and Technology, China (and 1 more)
Yu Zhang 0139 — Guilin University of Technology, College of Mechanical and Control Engineering, China
Yu Zhang 0140 — Shanghai Jiao Tong University, Institute of Oceanography, China
Yu Zhang 0141 — Shanghai Maritime University, College of Transport and Communications, China
Yu Zhang 0142 — State Grid Liaoning Electric Power Co., Ltd, Information and Communication Branch, China
Yu Zhang 0143 — Beijing University of Technology, Faculty of Information Technology, China
Yu Zhang 0144 — South China University of Technology, Guangzhou, China
Yu Zhang 0145 — State Grid Jiangsu Electric Power Co. Ltd., Electric Power Research Institute, China
Yu Zhang 0146 — Zhejiang University of Technology, College of Information Engineering, Hangzhou, China
Yu Zhang 0147 — Xidian University, Academy of Advanced Interdisciplinary Research, Xi'an, China
Yu Zhang 0148 — Iowa State University, Department of Electrical and Computer Engineering, Ames, IA, USA
Yu Zhang 0149 — Guangdong Academy of Medical Sciences, Department of Orthopaedics, Guangzhou, China (and 1 more)
Yu Zhang 0150 — Trinity University, Department of Computer Science, San Antonio, TX, USA (and 1 more)
Yu Zhang 0151 — Southern University of Science and Technology, First Affiliated Hospital, Second Clinical Medicine College of Jinan University, Shenzhen, China (and 1 more)
Yu Zhang 0152 — Tsinghua University, School of Aerospace Engineering, Beijing, China
Yu Zhang 0153 — Chongqing Jiaotong University, School of Economics and Management, China
Yu Zhang 0154 — Tsinghua University, Department of Electronic Engineering, Beijing, China
Yu Zhang 0155 — Wuhan University, GNSS Research Center, China
Yu Zhang 0156 — China University of Geosciences, School of Geophysics and Information Technology, Beijing, China
Yu Zhang 0157 — Xi'an Physical Education University, China
Yu Zhang 0158 — Zhongnan University of Economics and Law, Institute of Operations Management and System Engineering, School of Business Administration, Wuhan, China
Yu Zhang 0159 — Southwestern University of Finance and Economics, Research Institute of Economics and Management, Chengdu, China (and 1 more)
Yu Zhang 0160 — Shanxi Normal University, School of Geographical Sciences, Taiyuan, China
Yu Zhang 0161 — ETH Zurich, Engineering Design and Computing Lab, Switzerland
Yu Zhang 0162 — Anhui University of Science and Technology, School of Computer Science and Engineering, Huainan, China
Yu Zhang 0163 — Guangxi University of Finance and Economics, School of Management Science and Engineering, Nanning, China
Yu Zhang 0164 — Guangxi Normal University, College of Electronic Engineering, Guilin, China
Yu Zhang 0165 — Meituan, Shanghai, China (and 1 more)
Yu Zhang 0166 — Prometheus Vision Technology, Zhuhai, China
Yu Zhang 0167 — State Grid Gansu Electric Power Company Integrated Service Center, China
Yu Zhang 0168 — Beijing Institute of Satellite Environmental Engineering, China
Yu Zhang 0169 — Minzu University of China, China
Yu Zhang 0170 — Hebei University, School of Cyber Security and Computer, Key Laboratory on High Trusted Information System in Hebei Province, China
Yu Zhang 0171 — German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany (and 1 more)
Yu Zhang 0172 — Siemens (China) Co. Ltd., China
Yu Zhang 0173 — Shanghai Polytechnic University, School of Computer and Electronic Information, China
Yu Zhang 0174 — Dongfeng Nissan Passenger Vehicle Company, China
Yu Zhang 0175 — Beijing Institute of Technology, China
Yu Zhang 0176 — Alibaba Group, Beijing, China
Yu Zhang 0177 — University of Sydney, Faculty of Engineering, NSW, Australia
Yu Zhang 0178 — Xi'an University of Posts and Telecommunications, School of Communications and Information Engineering / School of Artificial Intelligence, China
Yu Zhang 0179 — Kunming University of Science and Technology, China
Yu Zhang 0180 — National Institute of Metrology, China
Yu Zhang 0181 — Dalian Maritime University, College of Information Science and Technology, China
Yu Zhang 0182 — Technical University of Munich, Department of Computation, Information and Technology, Garching, Germany
Yu Zhang 0183 — National Satellite Meteorological Center of China, Beijing, China (and 1 more)
Yu Zhang 0184 — Chongqing University, College of Computer Science, China
Yu Zhang 0185 — Tencent HealthCare, Shenzhen, China (and 1 more)
Yu Zhang 0186 — Central Conservatory of Music, Department of Music AI and Information Technology, Beijing, China
Yu Zhang 0187 — Department of Criminology, California State University, Fresno, CA, USA
Yu Zhang 0188 — Purdue University, West Lafayette, IN, USA

2020 – today

2024
[c111]
  - https://dblp.org/rec/conf/eccv/YangQCZZYY24
Chenglin Yang, Siyuan Qiao, Yuan Cao, Yu Zhang, Tao Zhu, Alan L. Yuille, Jiahui Yu:
IG Captioner: Information Gain Captioners Are Strong Zero-Shot Classifiers. ECCV (64) 2024: 474-490
[c110]
  - https://dblp.org/rec/conf/icassp/HuangACGHQ0WCS24
W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath:
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study. ICASSP 2024: 13306-13310
[i97]
  - https://dblp.org/rec/journals/corr/abs-2401-12789
W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath:
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study. CoRR abs/2401.12789 (2024)
2023
[c109]
  - https://dblp.org/rec/conf/asru/GaoMZC23
Yuan Gao, Nobuyuki Morioka, Yu Zhang, Nanxin Chen:
E3 TTS: Easy End-to-End Diffusion-Based Text To Speech. ASRU 2023: 1-8
[c108]
  - https://dblp.org/rec/conf/asru/HuSLZCWZL23
Ke Hu, Tara N. Sainath, Bo Li, Yu Zhang, Yong Cheng, Tao Wang, Yujing Zhang, Frederick Liu:
Improving Multilingual and Code-Switching ASR Using Large Language Model Generated Text. ASRU 2023: 1-7
[c107]
  - https://dblp.org/rec/conf/asru/WangHSWCCCZSRZYPSSW23
Mingqiu Wang, Wei Han, Izhak Shafran, Zelin Wu, Chung-Cheng Chiu, Yuan Cao, Nanxin Chen, Yu Zhang, Hagen Soltau, Paul K. Rubenstein, Lukas Zilka, Dian Yu, Golan Pundak, Nikhil Siddhartha, Johan Schalkwyk, Yonghui Wu:
SLM: Bridge the Thin Gap Between Speech and Text Foundation Models. ASRU 2023: 1-8
[c106]
  - https://dblp.org/rec/conf/icassp/HuSLDHDZCCS23
Ke Hu, Tara N. Sainath, Bo Li, Nan Du, Yanping Huang, Andrew M. Dai, Yu Zhang, Rodrigo Cabrera, Zhifeng Chen, Trevor Strohman:
Massively Multilingual Shallow Fusion with Large Language Models. ICASSP 2023: 1-5
[c105]
  - https://dblp.org/rec/conf/icassp/HwangSZS23
Dongseong Hwang, Khe Chai Sim, Yu Zhang, Trevor Strohman:
Comparison of Soft and Hard Target RNN-T Distillation for Large-Scale ASR. ICASSP 2023: 1-5
[c104]
  - https://dblp.org/rec/conf/icassp/LiHHBPSSZHSB23
Bo Li, Dongseong Hwang, Zhouyuan Huo, Junwen Bai, Guru Prakash, Tara N. Sainath, Khe Chai Sim, Yu Zhang, Wei Han, Trevor Strohman, Françoise Beaufays:
Efficient Domain Adaptation for Speech Foundation Models. ICASSP 2023: 1-5
[c103]
  - https://dblp.org/rec/conf/icassp/MengWPSCVZLRR23
Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran:
JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition. ICASSP 2023: 1-5
[c102]
  - https://dblp.org/rec/conf/icassp/SaekiZCMWZBRR23
Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran:
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-to-Speech. ICASSP 2023: 1-5
[c101]
  - https://dblp.org/rec/conf/icassp/WangCZZHH23
Yongqiang Wang, Zhehuai Chen, Chengjian Zheng, Yu Zhang, Wei Han, Parisa Haghani:
Accelerating RNN-T Training and Inference Using CTC Guidance. ICASSP 2023: 1-5
[c100]
  - https://dblp.org/rec/conf/icassp/WangKBCRRZ23
Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang:
Understanding Shared Speech-Text Representations. ICASSP 2023: 1-5
[c99]
  - https://dblp.org/rec/conf/icassp/YangLZCPSS23
Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman:
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition. ICASSP 2023: 1-5
[c98]
  - https://dblp.org/rec/conf/icassp/YangLZCSSL23
Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee:
A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition. ICASSP 2023: 1-5
[c97]
  - https://dblp.org/rec/conf/icml/Cheng0JMB23
Yong Cheng, Yu Zhang, Melvin Johnson, Wolfgang Macherey, Ankur Bapna:
Mu²SLAM: Multitask, Multilingual Speech and Language Models. ICML 2023: 5504-5520
[c96]
  - https://dblp.org/rec/conf/interspeech/ChenY00CCPLS23
Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-yi Lee, Tara N. Sainath:
How to Estimate Model Transferability of Pre-Trained Speech Models? INTERSPEECH 2023: 456-460
[c95]
  - https://dblp.org/rec/conf/interspeech/Hu0S0B23
Ke Hu, Bo Li, Tara N. Sainath, Yu Zhang, Françoise Beaufays:
Mixture-of-Expert Conformer for Streaming Multilingual ASR. INTERSPEECH 2023: 3327-3331
[c94]
  - https://dblp.org/rec/conf/interspeech/KoizumiZKDYMB0H23
Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna:
LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus. INTERSPEECH 2023: 5496-5500
[c93]
  - https://dblp.org/rec/conf/nips/LeiBBALZ0ZWLZC23
Tao Lei, Junwen Bai, Siddhartha Brahma, Joshua Ainslie, Kenton Lee, Yanqi Zhou, Nan Du, Vincent Y. Zhao, Yuexin Wu, Bo Li, Yu Zhang, Ming-Wei Chang:
Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference. NeurIPS 2023
[c92]
  - https://dblp.org/rec/conf/waspaa/KoizumiZKDYMZHBB23
Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani:
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations. WASPAA 2023: 1-5
[i96]
  - https://dblp.org/rec/journals/corr/abs-2301-07851
Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman:
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition. CoRR abs/2301.07851 (2023)
[i95]
  - https://dblp.org/rec/journals/corr/abs-2302-01496
Bo Li, Dongseong Hwang, Zhouyuan Huo, Junwen Bai, Guru Prakash, Tara N. Sainath, Khe Chai Sim, Yu Zhang, Wei Han, Trevor Strohman, Françoise Beaufays:
Efficient Domain Adaptation for Speech Foundation Models. CoRR abs/2302.01496 (2023)
[i94]
  - https://dblp.org/rec/journals/corr/abs-2302-08583
Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran:
JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition. CoRR abs/2302.08583 (2023)
[i93]
  - https://dblp.org/rec/journals/corr/abs-2302-08917
Ke Hu, Tara N. Sainath, Bo Li, Nan Du, Yanping Huang, Andrew M. Dai, Yu Zhang, Rodrigo Cabrera, Zhifeng Chen, Trevor Strohman:
Massively Multilingual Shallow Fusion with Large Language Models. CoRR abs/2302.08917 (2023)
[i92]
  - https://dblp.org/rec/journals/corr/abs-2303-01037
Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara N. Sainath, Pedro J. Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu:
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages. CoRR abs/2303.01037 (2023)
[i91]
  - https://dblp.org/rec/journals/corr/abs-2303-01664
Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani:
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations. CoRR abs/2303.01664 (2023)
[i90]
  - https://dblp.org/rec/journals/corr/abs-2304-04947
Tao Lei, Junwen Bai, Siddhartha Brahma, Joshua Ainslie, Kenton Lee, Yanqi Zhou, Nan Du, Vincent Y. Zhao, Yuexin Wu, Bo Li, Yu Zhang, Ming-Wei Chang:
Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference. CoRR abs/2304.04947 (2023)
[i89]
  - https://dblp.org/rec/journals/corr/abs-2304-14514
Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang:
Understanding Shared Speech-Text Representations. CoRR abs/2304.14514 (2023)
[i88]
persistent URL: https://dblp.org/rec/journals/corr/abs-2305-15663
Ke Hu, Bo Li, Tara N. Sainath, Yu Zhang, Françoise Beaufays:
Mixture-of-Expert Conformer for Streaming Multilingual ASR. CoRR abs/2305.15663 (2023)
[i87]
persistent URL: https://dblp.org/rec/journals/corr/abs-2305-18802
Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna:
LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus. CoRR abs/2305.18802 (2023)
[i86]
persistent URL: https://dblp.org/rec/journals/corr/abs-2306-01015
Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-yi Lee, Tara N. Sainath:
How to Estimate Model Transferability of Pre-Trained Speech Models? CoRR abs/2306.01015 (2023)
[i85]
persistent URL: https://dblp.org/rec/journals/corr/abs-2306-08131
Nanxin Chen, Izhak Shafran, Yu Zhang, Chung-Cheng Chiu, Hagen Soltau, James Qin, Yonghui Wu:
Efficient Adapters for Giant Speech Models. CoRR abs/2306.08131 (2023)
[i84]
persistent URL: https://dblp.org/rec/journals/corr/abs-2306-12925
Paul K. Rubenstein, Chulayuth Asawaroengchai, Duc Dung Nguyen, Ankur Bapna, Zalán Borsos, Félix de Chaumont Quitry, Peter Chen, Dalia El Badawy, Wei Han, Eugene Kharitonov, Hannah Muckenhirn, Dirk Padfield, James Qin, Danny Rozenberg, Tara N. Sainath, Johan Schalkwyk, Matthew Sharifi, Michelle Tadmor Ramanovich, Marco Tagliasacchi, Alexandru Tudor, Mihajlo Velimirovic, Damien Vincent, Jiahui Yu, Yongqiang Wang, Vicky Zayats, Neil Zeghidour, Yu Zhang, Zhishuai Zhang, Lukas Zilka, Christian Havnø Frank:
AudioPaLM: A Large Language Model That Can Speak and Listen. CoRR abs/2306.12925 (2023)
[i83]
persistent URL: https://dblp.org/rec/journals/corr/abs-2309-10567
Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa:
Multimodal Modeling For Spoken Language Identification. CoRR abs/2309.10567 (2023)
[i82]
persistent URL: https://dblp.org/rec/journals/corr/abs-2310-00230
Mingqiu Wang, Wei Han, Izhak Shafran, Zelin Wu, Chung-Cheng Chiu, Yuan Cao, Yongqiang Wang, Nanxin Chen, Yu Zhang, Hagen Soltau, Paul K. Rubenstein, Lukas Zilka, Dian Yu, Zhong Meng, Golan Pundak, Nikhil Siddhartha, Johan Schalkwyk, Yonghui Wu:
SLM: Bridge the thin gap between speech and text foundation models. CoRR abs/2310.00230 (2023)
[i81]
persistent URL: https://dblp.org/rec/journals/corr/abs-2311-00945
Yuan Gao, Nobuyuki Morioka, Yu Zhang, Nanxin Chen:
E3 TTS: Easy End-to-End Diffusion-based Text to Speech. CoRR abs/2311.00945 (2023)
[i80]
persistent URL: https://dblp.org/rec/journals/corr/abs-2311-17072
Chenglin Yang, Siyuan Qiao, Yuan Cao, Yu Zhang, Tao Zhu, Alan L. Yuille, Jiahui Yu:
IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers. CoRR abs/2311.17072 (2023)
2022
[j3]
persistent URL: https://dblp.org/rec/journals/jstsp/BaskarRRZM22
Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Pedro J. Moreno:
Ask2Mask: Guided Data Selection for Masked Speech Modeling. IEEE J. Sel. Top. Signal Process. 16(6): 1357-1366 (2022)
[j2]
persistent URL: https://dblp.org/rec/journals/jstsp/ZhangPHQGSJXHWZ22
Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu:
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition. IEEE J. Sel. Top. Signal Process. 16(6): 1519-1532 (2022)
[c91]
persistent URL: https://dblp.org/rec/conf/icassp/ShorJHPZ22
Joel Shor, Aren Jansen, Wei Han, Daniel S. Park, Yu Zhang:
Universal Paralinguistic Speech Representations Using self-Supervised Conformers. ICASSP 2022: 3169-3173
[c90]
persistent URL: https://dblp.org/rec/conf/icassp/LiPZSSHZFGP22
Bo Li, Ruoming Pang, Yu Zhang, Tara N. Sainath, Trevor Strohman, Parisa Haghani, Yun Zhu, Brian Farris, Neeraj Gaur, Manasa Prasad:
Massively Multilingual ASR: A Lifelong Learning Solution. ICASSP 2022: 6397-6401
[c89]
persistent URL: https://dblp.org/rec/conf/icassp/BaiLZBSSS22
Junwen Bai, Bo Li, Yu Zhang, Ankur Bapna, Nikhil Siddhartha, Khe Chai Sim, Tara N. Sainath:
Joint Unsupervised and Supervised Training for Multilingual ASR. ICASSP 2022: 6402-6406
[c88]
persistent URL: https://dblp.org/rec/conf/icassp/LiZQHCW22
Qiujia Li, Yu Zhang, David Qiu, Yanzhang He, Liangliang Cao, Philip C. Woodland:
Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition. ICASSP 2022: 6537-6541
[c87]
persistent URL: https://dblp.org/rec/conf/icassp/ChenZRRMW22
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Gary Wang:
Tts4pretrain 2.0: Advancing the use of Text and Speech in ASR Pretraining with Consistency and Contrastive Losses. ICASSP 2022: 7677-7681
[c86]
persistent URL: https://dblp.org/rec/conf/icassp/SainathHNBWQCPG22
Tara N. Sainath, Yanzhang He, Arun Narayanan, Rami Botros, Weiran Wang, David Qiu, Chung-Cheng Chiu, Rohit Prabhavalkar, Alexander Gruenstein, Anmol Gulati, Bo Li, David Rybach, Emmanuel Guzman, Ian McGraw, James Qin, Krzysztof Choromanski, Qiao Liang, Robert David, Ruoming Pang, Shuo-Yiin Chang, Trevor Strohman, W. Ronny Huang, Wei Han, Yonghui Wu, Yu Zhang:
Improving The Latency And Quality Of Cascaded Encoders. ICASSP 2022: 8112-8116
[c85]
persistent URL: https://dblp.org/rec/conf/interspeech/JiaDBC0CM22
Ye Jia, Yifan Ding, Ankur Bapna, Colin Cherry, Yu Zhang, Alexis Conneau, Nobu Morioka:
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation. INTERSPEECH 2022: 1721-1725
[c84]
persistent URL: https://dblp.org/rec/conf/interspeech/HuangFZL22
Kuan-Po Huang, Yu-Kuan Fu, Yu Zhang, Hung-yi Lee:
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation. INTERSPEECH 2022: 2193-2197
[c83]
persistent URL: https://dblp.org/rec/conf/interspeech/BaskarRRZS22
Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Nicolás Serrano:
Reducing Domain mismatch in Self-supervised speech pre-training. INTERSPEECH 2022: 3028-3032
[c82]
persistent URL: https://dblp.org/rec/conf/interspeech/ConneauBZMPLCJR22
Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson:
XTREME-S: Evaluating Cross-lingual Speech Representations. INTERSPEECH 2022: 3248-3252
[c81]
persistent URL: https://dblp.org/rec/conf/interspeech/LuWZHCH22
Zhiyun Lu, Yongqiang Wang, Yu Zhang, Wei Han, Zhehuai Chen, Parisa Haghani:
Unsupervised Data Selection via Discrete Speech Representation for ASR. INTERSPEECH 2022: 3393-3397
[c80]
persistent URL: https://dblp.org/rec/conf/interspeech/ChenZRRMBZ22
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Ankur Bapna, Heiga Zen:
MAESTRO: Matched Speech Text Representations through Modality Matching. INTERSPEECH 2022: 4093-4097
[c79]
persistent URL: https://dblp.org/rec/conf/interspeech/FinkelsteinZCCJ22
Lev Finkelstein, Heiga Zen, Norman Casagrande, Chun-an Chan, Ye Jia, Tom Kenter, Alexey Petelin, Jonathan Shen, Vincent Wan, Yu Zhang, Yonghui Wu, Rob Clark:
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks. INTERSPEECH 2022: 4571-4575
[c78]
persistent URL: https://dblp.org/rec/conf/slt/SainathPBZHCLWS22
Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman:
JOIST: A Joint Speech and Text Streaming Model for ASR. SLT 2022: 52-59
[c77]
persistent URL: https://dblp.org/rec/conf/slt/ChenBRZRMC22
Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging Joint Speech-Text Representation Learning for Zero Supervised Speech ASR. SLT 2022: 68-75
[c76]
persistent URL: https://dblp.org/rec/conf/slt/MengCPZWAESRHVHM22
Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno:
Modular Hybrid Autoregressive Transducer. SLT 2022: 197-204
[c75]
persistent URL: https://dblp.org/rec/conf/slt/ConneauMKZADRRB22
Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, Siddharth Dalmia, Jason Riesa, Clara Rivera, Ankur Bapna:
FLEURS: FEW-Shot Learning Evaluation of Universal Representations of Speech. SLT 2022: 798-805
[c74]
persistent URL: https://dblp.org/rec/conf/slt/HuangFHGWTZL22
Kuan-Po Huang, Yu-Kuan Fu, Tsu-Yuan Hsu, Fabian Ritter Gutierrez, Fan-Lin Wang, Liang-Hsuan Tseng, Yu Zhang, Hung-yi Lee:
Improving Generalizability of Distilled Self-Supervised Speech Processing Models Under Distorted Settings. SLT 2022: 1112-1119
[i79]
persistent URL: https://dblp.org/rec/journals/corr/abs-2202-01374
Ankur Bapna, Colin Cherry, Yu Zhang, Ye Jia, Melvin Johnson, Yong Cheng, Simran Khanuja, Jason Riesa, Alexis Conneau:
mSLAM: Massively multilingual joint pre-training for speech and text. CoRR abs/2202.01374 (2022)
[i78]
persistent URL: https://dblp.org/rec/journals/corr/abs-2202-12719
Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Pedro J. Moreno:
Ask2Mask: Guided Data Selection for Masked Speech Modeling. CoRR abs/2202.12719 (2022)
[i77]
persistent URL: https://dblp.org/rec/journals/corr/abs-2203-10752
Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson:
XTREME-S: Evaluating Cross-lingual Speech Representations. CoRR abs/2203.10752 (2022)
[i76]
persistent URL: https://dblp.org/rec/journals/corr/abs-2203-13339
Ye Jia, Yifan Ding, Ankur Bapna, Colin Cherry, Yu Zhang, Alexis Conneau, Nobuyuki Morioka:
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation. CoRR abs/2203.13339 (2022)
[i75]
persistent URL: https://dblp.org/rec/journals/corr/abs-2203-16104
Kuan-Po Huang, Yu-Kuan Fu, Yu Zhang, Hung-yi Lee:
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation. CoRR abs/2203.16104 (2022)
[i74]
persistent URL: https://dblp.org/rec/journals/corr/abs-2204-03409
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Ankur Bapna, Heiga Zen:
MAESTRO: Matched Speech Text Representations through Modality Matching. CoRR abs/2204.03409 (2022)
[i73]
persistent URL: https://dblp.org/rec/journals/corr/abs-2205-12446
Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, Siddharth Dalmia, Jason Riesa, Clara Rivera, Ankur Bapna:
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech. CoRR abs/2205.12446 (2022)
[i72]
persistent URL: https://dblp.org/rec/journals/corr/abs-2208-13183
Lev Finkelstein, Heiga Zen, Norman Casagrande, Chun-an Chan, Ye Jia, Tom Kenter, Alexey Petelin, Jonathan Shen, Vincent Wan, Yu Zhang, Yonghui Wu, Rob Clark:
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks. CoRR abs/2208.13183 (2022)
[i71]
persistent URL: https://dblp.org/rec/journals/corr/abs-2210-05793
Dongseong Hwang, Khe Chai Sim, Yu Zhang, Trevor Strohman:
Comparison of Soft and Hard Target RNN-T Distillation for Large-scale ASR. CoRR abs/2210.05793 (2022)
[i70]
persistent URL: https://dblp.org/rec/journals/corr/abs-2210-07353
Tara N. Sainath, Rohit Prabhavalkar, Ankur Bapna, Yu Zhang, Zhouyuan Huo, Zhehuai Chen, Bo Li, Weiran Wang, Trevor Strohman:
JOIST: A Joint Speech and Text Streaming Model For ASR. CoRR abs/2210.07353 (2022)
[i69]
persistent URL: https://dblp.org/rec/journals/corr/abs-2210-07978
Kuan-Po Huang, Yu-Kuan Fu, Tsu-Yuan Hsu, Fabian Ritter Gutierrez, Fan-Lin Wang, Liang-Hsuan Tseng, Yu Zhang, Hung-yi Lee:
Improving generalizability of distilled self-supervised speech processing models under distorted settings. CoRR abs/2210.07978 (2022)
[i68]
persistent URL: https://dblp.org/rec/journals/corr/abs-2210-10027
Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR. CoRR abs/2210.10027 (2022)
[i67]
persistent URL: https://dblp.org/rec/journals/corr/abs-2210-15447
Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran:
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech. CoRR abs/2210.15447 (2022)
[i66]
persistent URL: https://dblp.org/rec/journals/corr/abs-2210-15868
Nobuyuki Morioka, Heiga Zen, Nanxin Chen, Yu Zhang, Yifan Ding:
Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation. CoRR abs/2210.15868 (2022)
[i65]
persistent URL: https://dblp.org/rec/journals/corr/abs-2210-17049
Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno:
Modular Hybrid Autoregressive Transducer. CoRR abs/2210.17049 (2022)
[i64]
persistent URL: https://dblp.org/rec/journals/corr/abs-2211-01263
Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee:
A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition. CoRR abs/2211.01263 (2022)
[i63]
persistent URL: https://dblp.org/rec/journals/corr/abs-2212-09553
Yong Cheng, Yu Zhang, Melvin Johnson, Wolfgang Macherey, Ankur Bapna:
Mu²SLAM: Multitask, Multilingual Speech and Language Models. CoRR abs/2212.09553 (2022)
2021
[c73]
persistent URL: https://dblp.org/rec/conf/asru/ChungZHCQPW21
Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu:
w2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training. ASRU 2021: 244-250
[c72]
persistent URL: https://dblp.org/rec/conf/asru/ChenZRRWM21
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro J. Moreno:
Injecting Text in Self-Supervised Speech Pretraining. ASRU 2021: 251-258
[c71]
persistent URL: https://dblp.org/rec/conf/asru/LiPSGZQHHMB21
Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma, Junwen Bai:
Scaling End-to-End Models for Large-Scale Multilingual ASR. ASRU 2021: 1011-1018
[c70]
persistent URL: https://dblp.org/rec/conf/icassp/LiGYSCNCPHQ0LZS21
Bo Li, Anmol Gulati, Jiahui Yu, Tara N. Sainath, Chung-Cheng Chiu, Arun Narayanan, Shuo-Yiin Chang, Ruoming Pang, Yanzhang He, James Qin, Wei Han, Qiao Liang, Yu Zhang, Trevor Strohman, Yonghui Wu:
A Better and Faster end-to-end Model for Streaming ASR. ICASSP 2021: 5634-5638
[c69]
persistent URL: https://dblp.org/rec/conf/icassp/ShrivastavaGCZS21
Harsh Shrivastava, Ankush Garg, Yuan Cao, Yu Zhang, Tara N. Sainath:
Echo State Speech Recognition. ICASSP 2021: 5669-5673
[c68]
persistent URL: https://dblp.org/rec/conf/icassp/EliasZSZJWW21
Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, Ron J. Weiss, Yonghui Wu:
Parallel Tacotron: Non-Autoregressive and Controllable TTS. ICASSP 2021: 5709-5713
[c67]
persistent URL: https://dblp.org/rec/conf/icassp/LiQZLHWCS21
Qiujia Li, David Qiu, Yu Zhang, Bo Li, Yanzhang He, Philip C. Woodland, Liangliang Cao, Trevor Strohman:
Confidence Estimation for Attention-Based Sequence-to-Sequence Models for Speech Recognition. ICASSP 2021: 6388-6392
[c66]
persistent URL: https://dblp.org/rec/conf/icassp/QiuLHZLCPBLHSM21
David Qiu, Qiujia Li, Yanzhang He, Yu Zhang, Bo Li, Liangliang Cao, Rohit Prabhavalkar, Deepti Bhatia, Wei Li, Ke Hu, Tara N. Sainath, Ian McGraw:
Learning Word-Level Confidence for Subword End-To-End ASR. ICASSP 2021: 6393-6397
[c65]
persistent URL: https://dblp.org/rec/conf/icassp/DoutreHMLCPNMZC21
Thibault Doutre, Wei Han, Min Ma, Zhiyun Lu, Chung-Cheng Chiu, Ruoming Pang, Arun Narayanan, Ananya Misra, Yu Zhang, Liangliang Cao:
Improving Streaming Automatic Speech Recognition with Non-Streaming Model Distillation on Unsupervised Data. ICASSP 2021: 6558-6562
[c64]
persistent URL: https://dblp.org/rec/conf/iclr/ChenZZWNC21
Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan:
WaveGrad: Estimating Gradients for Waveform Generation. ICLR 2021
[c63]
persistent URL: https://dblp.org/rec/conf/interspeech/EliasZS0JSW21
Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, R. J. Skerry-Ryan, Yonghui Wu:
Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling. Interspeech 2021: 141-145
[c62]
persistent URL: https://dblp.org/rec/conf/interspeech/JiaZSZW21
Ye Jia, Heiga Zen, Jonathan Shen, Yu Zhang, Yonghui Wu:
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS. Interspeech 2021: 151-155
[c61]
persistent URL: https://dblp.org/rec/conf/interspeech/ChenRZZGHEWRM21
Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Heiga Zen, Mohammadreza Ghodsi, Yinghui Huang, Jesse Emond, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno:
Semi-Supervision in ASR: Sequential MixMatch and Factorized TTS-Based Augmentation. Interspeech 2021: 736-740
[c60]
persistent URL: https://dblp.org/rec/conf/interspeech/LuHZC21
Zhiyun Lu, Wei Han, Yu Zhang, Liangliang Cao:
Exploring Targeted Universal Adversarial Perturbations to End-to-End ASR Models. Interspeech 2021: 3460-3464
[c59]
persistent URL: https://dblp.org/rec/conf/interspeech/ChenZZW0DC21
Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan:
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis. Interspeech 2021: 3765-3769
[c58]
persistent URL: https://dblp.org/rec/conf/interspeech/LiZLCW21
Qiujia Li, Yu Zhang, Bo Li, Liangliang Cao, Philip C. Woodland:
Residual Energy-Based Models for End-to-End Speech Recognition. Interspeech 2021: 4069-4073
[c57]
persistent URL: https://dblp.org/rec/conf/interspeech/QiuHLZCM21
David Qiu, Yanzhang He, Qiujia Li, Yu Zhang, Liangliang Cao, Ian McGraw:
Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction. Interspeech 2021: 4074-4078
[c56]
persistent URL: https://dblp.org/rec/conf/interspeech/TjandraPZK21
Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita:
Unsupervised Learning of Disentangled Speech Content and Style Representation. Interspeech 2021: 4089-4093
[c55]
persistent URL: https://dblp.org/rec/conf/slt/ChiuNHPZJPSNCW21
Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu:
RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions. SLT 2021: 873-880
[i62]
persistent URL: https://dblp.org/rec/journals/corr/abs-2102-09114
Harsh Shrivastava, Ankush Garg, Yuan Cao, Yu Zhang, Tara N. Sainath:
Echo State Speech Recognition. CoRR abs/2102.09114 (2021)
[i61]
persistent URL: https://dblp.org/rec/journals/corr/abs-2103-06716
David Qiu, Qiujia Li, Yanzhang He, Yu Zhang, Bo Li, Liangliang Cao, Rohit Prabhavalkar, Deepti Bhatia, Wei Li, Ke Hu, Tara N. Sainath, Ian McGraw:
Learning Word-Level Confidence For Subword End-to-End ASR. CoRR abs/2103.06716 (2021)
[i60]
persistent URL: https://dblp.org/rec/journals/corr/abs-2103-14152
Qiujia Li, Yu Zhang, Bo Li, Liangliang Cao, Philip C. Woodland:
Residual Energy-Based Models for End-to-End Speech Recognition. CoRR abs/2103.14152 (2021)
[i59]
persistent URL: https://dblp.org/rec/journals/corr/abs-2103-14574
Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, R. J. Skerry-Ryan, Yonghui Wu:
Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling. CoRR abs/2103.14574 (2021)
[i58]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2103-15060
Ye Jia, Heiga Zen, Jonathan Shen, Yu Zhang, Yonghui Wu:
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS. CoRR abs/2103.15060 (2021)
[i57]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2104-02133
William Chan, Daniel S. Park, Chris A. Lee, Yu Zhang, Quoc V. Le, Mohammad Norouzi:
SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network. CoRR abs/2104.02133 (2021)
[i56]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2104-02757
Zhiyun Lu, Wei Han, Yu Zhang, Liangliang Cao:
Exploring Targeted Universal Adversarial Perturbations to End-to-end ASR Models. CoRR abs/2104.02757 (2021)
[i55]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2104-12870
David Qiu, Yanzhang He, Qiujia Li, Yu Zhang, Liangliang Cao, Ian McGraw:
Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction. CoRR abs/2104.12870 (2021)
[i54]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2104-14830
Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma:
Scaling End-to-End Models for Large-Scale Multilingual ASR. CoRR abs/2104.14830 (2021)
[i53]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2106-09660
Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan:
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis. CoRR abs/2106.09660 (2021)
[i52]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2108-06209
Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu:
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training. CoRR abs/2108.06209 (2021)
[i51]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2108-12226
Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro J. Moreno:
Injecting Text in Self-Supervised Speech Pretraining. CoRR abs/2108.12226 (2021)
[i50]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2109-13226
Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu:
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition. CoRR abs/2109.13226 (2021)
[i49]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2110-03327
Qiujia Li, Yu Zhang, David Qiu, Yanzhang He, Liangliang Cao, Philip C. Woodland:
Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition. CoRR abs/2110.03327 (2021)
[i48]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2110-04621
Joel Shor, Aren Jansen, Wei Han, Daniel S. Park, Yu Zhang:
Universal Paralinguistic Speech Representations Using Self-Supervised Conformers. CoRR abs/2110.04621 (2021)
[i47]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2110-10329
Ankur Bapna, Yu-An Chung, Nan Wu, Anmol Gulati, Ye Jia, Jonathan H. Clark, Melvin Johnson, Jason Riesa, Alexis Conneau, Yu Zhang:
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training. CoRR abs/2110.10329 (2021)
[i46]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2111-08137
Junwen Bai, Bo Li, Yu Zhang, Ankur Bapna, Nikhil Siddhartha, Khe Chai Sim, Tara N. Sainath:
Joint Unsupervised and Supervised Training for Multilingual ASR. CoRR abs/2111.08137 (2021)
2020
[c54]
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/SunKDCPTGZCCVHN20
Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, Vijay Vasudevan, Wei Han, Jiquan Ngiam, Hang Zhao, Aleksei Timofeev, Scott Ettinger, Maxim Krivokon, Amy Gao, Aditya Joshi, Yu Zhang, Jonathon Shlens, Zhifeng Chen, Dragomir Anguelov:
Scalability in Perception for Autonomous Driving: Waymo Open Dataset. CVPR 2020: 2443-2451
[c53]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/SainathHLNPBCLA20
Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alexander Gruenstein, Ke Hu, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirkó Visontai, Yonghui Wu, Yu Zhang, Ding Zhao:
A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency. ICASSP 2020: 6059-6063
[c52]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/SunZWCZW20
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu:
Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis. ICASSP 2020: 6264-6268
[c51]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/SunZWCZRRW20
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior. ICASSP 2020: 6699-6703
[c50]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/ParkZCCLCLW20
Daniel S. Park, Yu Zhang, Chung-Cheng Chiu, Youzheng Chen, Bo Li, William Chan, Quoc V. Le, Yonghui Wu:
SpecAugment on Large Scale Datasets. ICASSP 2020: 6879-6883
[c49]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/WangRCZRWM20
Gary Wang, Andrew Rosenberg, Zhehuai Chen, Yu Zhang, Bhuvana Ramabhadran, Yonghui Wu, Pedro J. Moreno:
Improving Speech Recognition Using Consistent Predictions on Synthesized Speech. ICASSP 2020: 7029-7033
[c48]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/LuCZCF20
Zhiyun Lu, Liangliang Cao, Yu Zhang, Chung-Cheng Chiu, James Fan:
Speech Sentiment Analysis via Pre-Trained Features from End-to-End ASR Models. ICASSP 2020: 7149-7153
[c47]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/HayashiYIY0TTZT20
Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan:
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit. ICASSP 2020: 7654-7658
[c46]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/WuLZAS20
Zelin Wu, Bo Li, Yu Zhang, Petar S. Aleksic, Tara N. Sainath:
Multistate Encoding with End-To-End Speech RNN Transducer Network. ICASSP 2020: 7819-7823
[c45]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/ChenR0WRM20
Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno:
Improving Speech Recognition Using GAN-Based Speech Synthesis and Contrastive Unspoken Text Selection. INTERSPEECH 2020: 556-560
[c44]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/ParkZJHCLWL20
Daniel S. Park, Yu Zhang, Ye Jia, Wei Han, Chung-Cheng Chiu, Bo Li, Yonghui Wu, Quoc V. Le:
Improved Noisy Student Training for Automatic Speech Recognition. INTERSPEECH 2020: 2817-2821
[c43]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/WangRCZRM20
Gary Wang, Andrew Rosenberg, Zhehuai Chen, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno:
SCADA: Stochastic, Consistent and Adversarial Data Augmentation to Improve ASR. INTERSPEECH 2020: 2832-2836
[c42]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/HanZZYCQGPW20
Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu:
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context. INTERSPEECH 2020: 3610-3614
[c41]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/GulatiQCPZYHWZW20
Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang:
Conformer: Convolution-augmented Transformer for Speech Recognition. INTERSPEECH 2020: 5036-5040
[c40]
  persistent URL:
  - https://dblp.org/rec/conf/lrec/ChenLXCZF20
Eric Chen, Zhiyun Lu, Hao Xu, Liangliang Cao, Yu Zhang, James Fan:
A Large Scale Speech Sentiment Corpus. LREC 2020: 6549-6555
[i45]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2002-03785
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu:
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis. CoRR abs/2002.03785 (2020)
[i44]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2002-03788
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior. CoRR abs/2002.03788 (2020)
[i43]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2003-12710
Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alexander Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirkó Visontai, Yonghui Wu, Yu Zhang, Ding Zhao:
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency. CoRR abs/2003.12710 (2020)
[i42]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2005-03191
Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu:
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context. CoRR abs/2005.03191 (2020)
[i41]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2005-03271
Chung-Cheng Chiu, Arun Narayanan, Wei Han, Rohit Prabhavalkar, Yu Zhang, Navdeep Jaitly, Ruoming Pang, Tara N. Sainath, Patrick Nguyen, Liangliang Cao, Yonghui Wu:
RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions. CoRR abs/2005.03271 (2020)
[i40]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2005-08100
Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang:
Conformer: Convolution-augmented Transformer for Speech Recognition. CoRR abs/2005.08100 (2020)
[i39]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2005-09629
Daniel S. Park, Yu Zhang, Ye Jia, Wei Han, Chung-Cheng Chiu, Bo Li, Yonghui Wu, Quoc V. Le:
Improved Noisy Student Training for Automatic Speech Recognition. CoRR abs/2005.09629 (2020)
[i38]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2009-00713
Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan:
WaveGrad: Estimating Gradients for Waveform Generation. CoRR abs/2009.00713 (2020)
[i37]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2010-04301
Jonathan Shen, Ye Jia, Mike Chrzanowski, Yu Zhang, Isaac Elias, Heiga Zen, Yonghui Wu:
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling. CoRR abs/2010.04301 (2020)
[i36]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2010-10504
Yu Zhang, James Qin, Daniel S. Park, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Quoc V. Le, Yonghui Wu:
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition. CoRR abs/2010.10504 (2020)
[i35]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2010-11428
Qiujia Li, David Qiu, Yu Zhang, Bo Li, Yanzhang He, Philip C. Woodland, Liangliang Cao, Trevor Strohman:
Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition. CoRR abs/2010.11428 (2020)
[i34]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2010-11439
Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, Ron J. Weiss, Yonghui Wu:
Parallel Tacotron: Non-Autoregressive and Controllable TTS. CoRR abs/2010.11439 (2020)
[i33]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2010-12096
Thibault Doutre, Wei Han, Min Ma, Zhiyun Lu, Chung-Cheng Chiu, Ruoming Pang, Arun Narayanan, Ananya Misra, Yu Zhang, Liangliang Cao:
Improving Streaming Automatic Speech Recognition With Non-Streaming Model Distillation On Unsupervised Data. CoRR abs/2010.12096 (2020)
[i32]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2010-12973
Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita:
Unsupervised Learning of Disentangled Speech Content and Style Representation. CoRR abs/2010.12973 (2020)
[i31]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2011-10798
Bo Li, Anmol Gulati, Jiahui Yu, Tara N. Sainath, Chung-Cheng Chiu, Arun Narayanan, Shuo-Yiin Chang, Ruoming Pang, Yanzhang He, James Qin, Wei Han, Qiao Liang, Yu Zhang, Trevor Strohman, Yonghui Wu:
A Better and Faster End-to-End Model for Streaming ASR. CoRR abs/2011.10798 (2020)

2010 – 2019

2019
[c39]
  persistent URL:
  - https://dblp.org/rec/conf/asru/ChiuKPCSWHZPKNN19
Chung-Cheng Chiu, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara N. Sainath, Yonghui Wu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang:
A Comparison of End-to-End Models for Long-Form Speech Recognition. ASRU 2019: 889-896
[c38]
  persistent URL:
  - https://dblp.org/rec/conf/asru/RosenbergZRJMWW19
Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro J. Moreno, Yonghui Wu, Zelin Wu:
Speech Recognition with Augmented Synthesized Speech. ASRU 2019: 996-1002
[c37]
  persistent URL:
  - https://dblp.org/rec/conf/corl/ZhouSZAGOGNV19
Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Tom Ouyang, James Guo, Jiquan Ngiam, Vijay Vasudevan:
End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds. CoRL 2019: 923-932
[c36]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/LiZSWC19
Bo Li, Yu Zhang, Tara N. Sainath, Yonghui Wu, William Chan:
Bytes Are All You Need: End-to-end Multilingual Speech Recognition and Synthesis with Bytes. ICASSP 2019: 5621-5625
[c35]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/HsuZWCWWG19
Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Yu-An Chung, Yuxuan Wang, Yonghui Wu, James R. Glass:
Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization. ICASSP 2019: 5901-5905
[c34]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/HoriAHZWR19
Takaaki Hori, Ramón Fernandez Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, Jonathan Le Roux:
Cycle-consistency Training for End-to-end Speech Recognition. ICASSP 2019: 6271-6275
[c33]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/ChungWHZS19
Yu-An Chung, Yuxuan Wang, Wei-Ning Hsu, Yu Zhang, R. J. Skerry-Ryan:
Semi-supervised Training for Improving Data Efficiency in End-to-end Speech Synthesis. ICASSP 2019: 6940-6944
[c32]
  persistent URL:
  - https://dblp.org/rec/conf/iclr/HsuZWZWWCJCSNP19
Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang:
Hierarchical Generative Modeling for Controllable Speech Synthesis. ICLR (Poster) 2019
[c31]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/ZenDCZWJCW19
Heiga Zen, Viet Dang, Rob Clark, Yu Zhang, Ron J. Weiss, Ye Jia, Zhifeng Chen, Yonghui Wu:
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech. INTERSPEECH 2019: 1526-1530
[c30]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/ZhangWZWCSJRR19
Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, R. J. Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran:
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning. INTERSPEECH 2019: 2080-2084
[c29]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/ParkCZCZCL19
Daniel S. Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D. Cubuk, Quoc V. Le:
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition. INTERSPEECH 2019: 2613-2617
[i30]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1902-08295
Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia Xu Chen, Ye Jia, Anjuli Kannan, Tara N. Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George F. Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel Bacchiani, Thomas B. Jablin, Robert Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon:
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling. CoRR abs/1902.08295 (2019)
[i29]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1904-02882
Heiga Zen, Viet Dang, Rob Clark, Yu Zhang, Ron J. Weiss, Ye Jia, Zhifeng Chen, Yonghui Wu:
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech. CoRR abs/1904.02882 (2019)
[i28]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1904-08779
Daniel S. Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D. Cubuk, Quoc V. Le:
SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition. CoRR abs/1904.08779 (2019)
[i27]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1907-04448
Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, R. J. Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran:
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning. CoRR abs/1907.04448 (2019)
[i26]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1909-11699
Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro J. Moreno, Yonghui Wu, Zelin Wu:
Speech Recognition with Augmented Synthesized Speech. CoRR abs/1909.11699 (2019)
[i25]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1910-06528
Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Tom Ouyang, James Guo, Jiquan Ngiam, Vijay Vasudevan:
End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds. CoRR abs/1910.06528 (2019)
[i24]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1910-10909
Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan:
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit. CoRR abs/1910.10909 (2019)
[i23]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1911-01601
Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Héctor Delgado, Andreas Nautsch, Nicholas W. D. Evans, Md. Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sébastien Le Maguer, Markus Becker, Fergus Henderson, Rob Clark, Yu Zhang, Quan Wang, Ye Jia, Kai Onuma, Koji Mushika, Takashi Kaneda, Yuan Jiang, Li-Juan Liu, Yi-Chiao Wu, Wen-Chin Huang, Tomoki Toda, Kou Tanaka, Hirokazu Kameoka, Ingmar Steiner, Driss Matrouf, Jean-François Bonastre, Avashna Govender, Srikanth Ronanki, Jing-Xuan Zhang, Zhen-Hua Ling:
The ASVspoof 2019 database. CoRR abs/1911.01601 (2019)
[i22]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1911-02242
Chung-Cheng Chiu, Wei Han, Yu Zhang, Ruoming Pang, Sergey Kishchenko, Patrick Nguyen, Arun Narayanan, Hank Liao, Shuyuan Zhang, Anjuli Kannan, Rohit Prabhavalkar, Zhifeng Chen, Tara N. Sainath, Yonghui Wu:
A comparison of end-to-end models for long-form speech recognition. CoRR abs/1911.02242 (2019)
[i21]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1911-09762
Zhiyun Lu, Liangliang Cao, Yu Zhang, Chung-Cheng Chiu, James Fan:
Speech Sentiment Analysis via Pre-trained Features from End-to-end ASR Models. CoRR abs/1911.09762 (2019)
[i20]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1912-04838
Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, Vijay Vasudevan, Wei Han, Jiquan Ngiam, Hang Zhao, Aleksei Timofeev, Scott Ettinger, Maxim Krivokon, Amy Gao, Aditya Joshi, Yu Zhang, Jonathon Shlens, Zhifeng Chen, Dragomir Anguelov:
Scalability in Perception for Autonomous Driving: Waymo Open Dataset. CoRR abs/1912.04838 (2019)
[i19]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1912-05533
Daniel S. Park, Yu Zhang, Chung-Cheng Chiu, Youzheng Chen, Bo Li, William Chan, Quoc V. Le, Yonghui Wu:
SpecAugment on Large Scale Datasets. CoRR abs/1912.05533 (2019)
2018
[c28]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/ShenPWSJYCZWRSA18
Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, R. J. Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu:
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. ICASSP 2018: 4779-4783
[c27]
  persistent URL:
  - https://dblp.org/rec/conf/icml/WangSZRBSXJRS18
Yuxuan Wang, Daisy Stanton, Yu Zhang, R. J. Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Ye Jia, Fei Ren, Rif A. Saurous:
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. ICML 2018: 5167-5176
[c26]
  persistent URL:
  - https://dblp.org/rec/conf/nips/JiaZWWSRCNPLW18
Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio López-Moreno, Yonghui Wu:
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis. NeurIPS 2018: 4485-4495
[c25]
  persistent URL:
  - https://dblp.org/rec/conf/slt/HayashiWZTHAT18
Tomoki Hayashi, Shinji Watanabe, Yu Zhang, Tomoki Toda, Takaaki Hori, Ramón Fernandez Astudillo, Kazuya Takeda:
Back-Translation-Style Data Augmentation for end-to-end ASR. SLT 2018: 426-433
[i18]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1803-09017
Yuxuan Wang, Daisy Stanton, Yu Zhang, R. J. Skerry-Ryan, Eric Battenberg, Joel Shor, Ying Xiao, Fei Ren, Ye Jia, Rif A. Saurous:
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis. CoRR abs/1803.09017 (2018)
[i17]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1806-04558
Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio López-Moreno, Yonghui Wu:
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis. CoRR abs/1806.04558 (2018)
[i16]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1807-10893
Tomoki Hayashi, Shinji Watanabe, Yu Zhang, Tomoki Toda, Takaaki Hori, Ramón Fernandez Astudillo, Kazuya Takeda:
Back-Translation-Style Data Augmentation for End-to-End ASR. CoRR abs/1807.10893 (2018)
[i15]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1808-10128
Yu-An Chung, Yuxuan Wang, Wei-Ning Hsu, Yu Zhang, R. J. Skerry-Ryan:
Semi-Supervised Training for Improving Data Efficiency in End-to-End Speech Synthesis. CoRR abs/1808.10128 (2018)
[i14]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1810-07217
Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang:
Hierarchical Generative Modeling for Controllable Speech Synthesis. CoRR abs/1810.07217 (2018)
[i13]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1811-01690
Takaaki Hori, Ramón Fernandez Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, Jonathan Le Roux:
Cycle-consistency training for end-to-end speech recognition. CoRR abs/1811.01690 (2018)
[i12]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1811-09021
Bo Li, Yu Zhang, Tara N. Sainath, Yonghui Wu, William Chan:
Bytes are All You Need: End-to-End Multilingual Speech Recognition and Synthesis with Bytes. CoRR abs/1811.09021 (2018)
2017
[b1]
  persistent URL:
  - https://dblp.org/rec/phd/ndltd/Zhang17b
Yu Zhang:
Exploring neural network architectures for acoustic modeling. Massachusetts Institute of Technology, Cambridge, USA, 2017
[c24]
  persistent URL:
  - https://dblp.org/rec/conf/asru/HsuZG17
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation. ASRU 2017: 16-23
[c23]
  persistent URL:
  - https://dblp.org/rec/conf/iclr/ChanZLJ17
William Chan, Yu Zhang, Quoc V. Le, Navdeep Jaitly:
Latent Sequence Decompositions. ICLR (Poster) 2017
[c22]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/HoriWZC17
Takaaki Hori, Shinji Watanabe, Yu Zhang, William Chan:
Advances in Joint CTC-Attention Based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM. INTERSPEECH 2017: 949-953
[c21]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/HsuZG17
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Learning Latent Representations for Speech Generation and Transformation. INTERSPEECH 2017: 1273-1277
[c20]
  persistent URL:
  - https://dblp.org/rec/conf/nips/HsuZG17
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data. NIPS 2017: 1878-1889
[p3]
  persistent URL:
  - https://dblp.org/rec/books/sp/17/XiaoWEMLHSCZY17
Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Michael I. Mandel, Liang Lu, John R. Hershey, Michael L. Seltzer, Guoguo Chen, Yu Zhang, Dong Yu:
Discriminative Beamforming with Phase-Aware Neural Networks for Speech Enhancement and Recognition. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 79-104
[p2]
  persistent URL:
  - https://dblp.org/rec/books/sp/17/ZhangYC17
Yu Zhang, Dong Yu, Guoguo Chen:
Advanced Recurrent Neural Networks for Automatic Speech Recognition. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 261-279
[p1]
  persistent URL:
  - https://dblp.org/rec/books/sp/17/ChenZY17
Guoguo Chen, Yu Zhang, Dong Yu:
Sequence-Discriminative Training of Neural Networks. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 281-297
[i11]
  persistent URL:
  - https://dblp.org/rec/journals/corr/HsuZG17
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Learning Latent Representations for Speech Generation and Transformation. CoRR abs/1704.04222 (2017)
[i10]
  persistent URL:
  - https://dblp.org/rec/journals/corr/HoriWZC17
Takaaki Hori, Shinji Watanabe, Yu Zhang, William Chan:
Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM. CoRR abs/1706.02737 (2017)
[i9]
  persistent URL:
  - https://dblp.org/rec/journals/corr/HsuZG17aa
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Unsupervised Domain Adaptation for Robust Speech Recognition via Variational Autoencoder-Based Data Augmentation. CoRR abs/1707.06265 (2017)
[i8]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1709-02755
Tao Lei, Yu Zhang, Yoav Artzi:
Training RNNs as Fast as CNNs. CoRR abs/1709.02755 (2017)
[i7]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1709-07902
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data. CoRR abs/1709.07902 (2017)
[i6]
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1712-05884
Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, R. J. Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu:
Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. CoRR abs/1712.05884 (2017)
2016
[c19]
  persistent URL:
  - https://dblp.org/rec/conf/coling/RomeoMBMBHZMG16
Salvatore Romeo, Giovanni Da San Martino, Alberto Barrón-Cedeño, Alessandro Moschitti, Yonatan Belinkov, Wei-Ning Hsu, Yu Zhang, Mitra Mohtarami, James R. Glass:
Neural Attention for Learning to Rank Questions in Community Question Answering. COLING 2016: 1734-1745
[c18]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/TanQYKLSXZ16
Tian Tan, Yanmin Qian, Dong Yu, Souvik Kundu, Liang Lu, Khe Chai Sim, Xiong Xiao, Yu Zhang:
Speaker-aware training of LSTM-RNNs for acoustic modelling. ICASSP 2016: 5280-5284
[c17]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/ChuangsuwanichZ16
Ekapol Chuangsuwanich, Yu Zhang, James R. Glass:
Multilingual data selection for training stacked bottleneck features. ICASSP 2016: 5410-5414
[c16]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/ZhangCGY16
Yu Zhang, Ekapol Chuangsuwanich, James R. Glass, Dong Yu:
Prediction-adaptation-correction recurrent neural networks for low-resource language speech recognition. ICASSP 2016: 5415-5419
[c15]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/XiaoWELHSCZMY16
Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Liang Lu, John R. Hershey, Michael L. Seltzer, Guoguo Chen, Yu Zhang, Michael I. Mandel, Dong Yu:
Deep beamforming networks for multi-channel speech recognition. ICASSP 2016: 5745-5749
[c14]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/ZhangCYYKG16
Yu Zhang, Guoguo Chen, Dong Yu, Kaisheng Yao, Sanjeev Khudanpur, James R. Glass:
Highway long short-term memory RNNs for distant speech recognition. ICASSP 2016: 5755-5759
[c13]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/QianTYZ16
Yanmin Qian, Tian Tan, Dong Yu, Yu Zhang:
Integrated adaptation with multi-factor joint-learning for far-field speech recognition. ICASSP 2016: 5770-5774
[c12]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/HsuZLG16
Wei-Ning Hsu, Yu Zhang, Ann Lee, James R. Glass:
Exploiting Depth and Highway Connections in Convolutional Recurrent Deep Neural Networks for Speech Recognition. INTERSPEECH 2016: 395-399
[c11]
  persistent URL:
  - https://dblp.org/rec/conf/iscslp/HeZDY16
Tianxing He, Yu Zhang, Jasha Droppo, Kai Yu:
On training bi-directional neural network language model with noise contrastive estimation. ISCSLP 2016: 1-5
[c10]
  persistent URL:
  - https://dblp.org/rec/conf/semeval/MohtaramiBHZLBC16
Mitra Mohtarami, Yonatan Belinkov, Wei-Ning Hsu, Yu Zhang, Tao Lei, Kfir Bar, Scott Cyphers, James R. Glass:
SLS at SemEval-2016 Task 3: Neural-based Approaches for Ranking in Community Question Answering. SemEval@NAACL-HLT 2016: 828-835
[c9]
  persistent URL:
  - https://dblp.org/rec/conf/slt/HsuZG16
Wei-Ning Hsu, Yu Zhang, James R. Glass:
A prioritized grid long short-term memory RNN for speech recognition. SLT 2016: 467-473
[i5]
  persistent URL:
  - https://dblp.org/rec/journals/corr/HeZDY16
Tianxing He, Yu Zhang, Jasha Droppo, Kai Yu:
On Training Bi-directional Neural Network Language Model with Noise Contrastive Estimation. CoRR abs/1602.06064 (2016)
[i4]
  persistent URL:
  - https://dblp.org/rec/journals/corr/HsuZG16
Wei-Ning Hsu, Yu Zhang, James R. Glass:
Recurrent Neural Network Encoder with Attention for Community Question Answering. CoRR abs/1603.07044 (2016)
[i3]
  persistent URL:
  - https://dblp.org/rec/journals/corr/ChanZLJ16
William Chan, Yu Zhang, Quoc V. Le, Navdeep Jaitly:
Latent Sequence Decompositions. CoRR abs/1610.03035 (2016)
2015
[j1]
  persistent URL:
  - https://dblp.org/rec/journals/spm/YuYZ15
Dong Yu, Kaisheng Yao, Yu Zhang:
The Computational Network Toolkit [Best of the Web]. IEEE Signal Process. Mag. 32(6): 123-126 (2015)
[c8]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/ZhangYSD15
Yu Zhang, Dong Yu, Michael L. Seltzer, Jasha Droppo:
Speech recognition with prediction-adaptation-correction recurrent neural networks. ICASSP 2015: 5004-5008
[c7]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/CardinalDZG15
Patrick Cardinal, Najim Dehak, Yu Zhang, James R. Glass:
Speaker adaptation using the i-vector technique for bottleneck features. INTERSPEECH 2015: 2867-2871
[i2]
  persistent URL:
  - https://dblp.org/rec/journals/corr/ZhangCYYKG15
Yu Zhang, Guoguo Chen, Dong Yu, Kaisheng Yao, Sanjeev Khudanpur, James R. Glass:
Highway Long Short-Term Memory RNNs for Distant Speech Recognition. CoRR abs/1510.08983 (2015)
[i1]
  persistent URL:
  - https://dblp.org/rec/journals/corr/ZhangCGY15
Yu Zhang, Ekapol Chuangsuwanich, James R. Glass, Dong Yu:
Prediction-Adaptation-Correction Recurrent Neural Networks for Low-Resource Language Speech Recognition. CoRR abs/1510.08985 (2015)
2014
[c6]
  persistent URL:
  - https://dblp.org/rec/conf/icassp/ZhangCG14
Yu Zhang, Ekapol Chuangsuwanich, James R. Glass:
Extracting deep neural network bottleneck features using low-rank matrix factorization. ICASSP 2014: 185-189
[c5]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/CutlerZCG14
Anne Cutler, Yu Zhang, Ekapol Chuangsuwanich, James R. Glass:
Language ID-based training of multilingual stacked bottleneck features. INTERSPEECH 2014: 1-5
[c4]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/CardinalADZHZGV14
Patrick Cardinal, Ahmed Ali, Najim Dehak, Yu Zhang, Tuka Al Hanai, Yifan Zhang, James R. Glass, Stephan Vogel:
Recent advances in ASR applied to an Arabic transcription system for Al-Jazeera. INTERSPEECH 2014: 2088-2092
[c3]
  persistent URL:
  - https://dblp.org/rec/conf/interspeech/LeeZCG14
Hung-yi Lee, Yu Zhang, Ekapol Chuangsuwanich, James R. Glass:
Graph-based re-ranking using acoustic feature similarity between search results for spoken term detection on low-resource languages. INTERSPEECH 2014: 2479-2483
[c2]
  persistent URL:
  - https://dblp.org/rec/conf/slt/YaoPZYZS14
Kaisheng Yao, Baolin Peng, Yu Zhang, Dong Yu, Geoffrey Zweig, Yangyang Shi:
Spoken language understanding using long short-term memory neural networks. SLT 2014: 189-194
2013
[c1]
  persistent URL:
  - https://dblp.org/rec/conf/emnlp/LeeZG13
Chia-ying Lee, Yu Zhang, James R. Glass:
Joint Learning of Phonetic Units and Word Pronunciations for ASR. EMNLP 2013: 182-192
