Yao Qian
This is just a disambiguation page, and is not intended to be the bibliography of an actual person. Any publication listed on this page has not been assigned to an actual author yet. If you know the true author of one of the publications listed below, you are welcome to contact us.
2020 – today
- 2024
- [j21]Jiacheng Wu, Gang Liu, Xiao Wang, Haojie Tang, Yao Qian:
GAN-GA: infrared and visible image fusion generative adversarial network based on global awareness. Appl. Intell. 54(13-14): 7296-7316 (2024) - [j20]Kaixin Li, Gang Liu, Xinjie Gu, Haojie Tang, Jinxin Xiong, Yao Qian:
DANT-GAN: A dual attention-based of nested training network for infrared and visible image fusion. Digit. Signal Process. 145: 104316 (2024) - [j19]Rui Chang, Gang Liu, Haojie Tang, Yao Qian, Jianchao Tang:
RDGMEF: a multi-exposure image fusion framework based on Retinex decompostion and guided filter. Neural Comput. Appl. 36(20): 12083-12102 (2024) - [j18]Mengliang Xing, Gang Liu, Haojie Tang, Yao Qian, Jun Zhang:
CFNet: An infrared and visible image compression fusion network. Pattern Recognit. 156: 110774 (2024) - [j17]Haojie Tang, Gang Liu, Yao Qian, Jiebang Wang, Jinxin Xiong:
EgeFusion: Towards Edge Gradient Enhancement in Infrared and Visible Image Fusion With Multi-Scale Transform. IEEE Trans. Computational Imaging 10: 385-398 (2024) - [c107]Shaoshi Ling, Yuxuan Hu, Shuangbei Qian, Guoli Ye, Yao Qian, Yifan Gong, Ed Lin, Michael Zeng:
Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition. ICASSP 2024: 11046-11050 - [c106]Ziyi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Xuemei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang:
i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data. NAACL-HLT (Findings) 2024: 1615-1627 - [i30]Leying Zhang, Yao Qian, Long Zhou, Shujie Liu, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, Sheng Zhao, Michael Zeng:
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations. CoRR abs/2404.06690 (2024) - [i29]Chenyang Le, Yao Qian, Dongmei Wang, Long Zhou, Shujie Liu, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Sheng Zhao, Michael Zeng:
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation. CoRR abs/2405.17809 (2024) - [i28]Sanyuan Chen, Shujie Liu, Long Zhou, Yanqing Liu, Xu Tan, Jinyu Li, Sheng Zhao, Yao Qian, Furu Wei:
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers. CoRR abs/2406.05370 (2024) - [i27]Jiaqi Li, Dongmei Wang, Xiaofei Wang, Yao Qian, Long Zhou, Shujie Liu, Midia Yousefi, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Yanqing Liu, Junkun Chen, Sheng Zhao, Jinyu Li, Zhizheng Wu, Michael Zeng:
Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation. CoRR abs/2409.04016 (2024) - 2023
- [c105]Ziyi Yang, Yuwei Fang, Chenguang Zhu, Reid Pryzant, Dongdong Chen, Yu Shi, Yichong Xu, Yao Qian, Mei Gao, Yi-Ling Chen, Liyang Lu, Yujia Xie, Robert Gmyr, Noel Codella, Naoyuki Kanda, Bin Xiao, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang:
i-Code: An Integrative and Composable Multimodal Learning Framework. AAAI 2023: 10880-10890 - [c104]Chenda Li, Yao Qian, Zhuo Chen, Dongmei Wang, Takuya Yoshioka, Shujie Liu, Yanmin Qian, Michael Zeng:
Target Sound Extraction with Variable Cross-Modality Clues. ICASSP 2023: 1-5 - [c103]Heming Wang, Yao Qian, Hemin Yang, Nauyuki Kanda, Peidong Wang, Takuya Yoshioka, Xiaofei Wang, Yiming Wang, Shujie Liu, Zhuo Chen, DeLiang Wang, Michael Zeng:
DATA2VEC-SG: Improving Self-Supervised Learning Representations for Speech Generation Tasks. ICASSP 2023: 1-5 - [c102]Haibin Yu, Yuxuan Hu, Yao Qian, Ma Jin, Linquan Liu, Shujie Liu, Yu Shi, Yanmin Qian, Edward Lin, Michael Zeng:
Code-Switching Text Generation and Injection in Mandarin-English ASR. ICASSP 2023: 1-5 - [c101]Chenda Li, Yao Qian, Zhuo Chen, Naoyuki Kanda, Dongmei Wang, Takuya Yoshioka, Yanmin Qian, Michael Zeng:
Adapting Multi-Lingual ASR Models for Handling Multiple Talkers. INTERSPEECH 2023: 1314-1318 - [c100]Chenyang Le, Yao Qian, Long Zhou, Shujie Liu, Yanmin Qian, Michael Zeng, Xuedong Huang:
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation. NeurIPS 2023 - [i26]Chenda Li, Yao Qian, Zhuo Chen, Dongmei Wang, Takuya Yoshioka, Shujie Liu, Yanmin Qian, Michael Zeng:
Target Sound Extraction with Variable Cross-modality Clues. CoRR abs/2303.08372 (2023) - [i25]Haibin Yu, Yuxuan Hu, Yao Qian, Ma Jin, Linquan Liu, Shujie Liu, Yu Shi, Yanmin Qian, Edward Lin, Michael Zeng:
Code-Switching Text Generation and Injection in Mandarin-English ASR. CoRR abs/2303.10949 (2023) - [i24]Ziyi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang:
i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data. CoRR abs/2305.12311 (2023) - [i23]Yuwei Fang, Mahmoud Khademi, Chenguang Zhu, Ziyi Yang, Reid Pryzant, Yichong Xu, Yao Qian, Takuya Yoshioka, Lu Yuan, Michael Zeng, Xuedong Huang:
i-Code Studio: A Configurable and Composable Framework for Integrative AI. CoRR abs/2305.13738 (2023) - [i22]Chenyang Le, Yao Qian, Long Zhou, Shujie Liu, Michael Zeng, Xuedong Huang:
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation. CoRR abs/2305.14838 (2023) - [i21]Chenda Li, Yao Qian, Zhuo Chen, Naoyuki Kanda, Dongmei Wang, Takuya Yoshioka, Yanmin Qian, Michael Zeng:
Adapting Multi-Lingual ASR Models for Handling Multiple Talkers. CoRR abs/2305.18747 (2023) - [i20]Leying Zhang, Yao Qian, Linfeng Yu, Heming Wang, Xinkai Wang, Hemin Yang, Long Zhou, Shujie Liu, Yanmin Qian, Michael Zeng:
Diffusion Conditional Expectation Model for Efficient and Robust Target Speech Extraction. CoRR abs/2309.13874 (2023) - 2022
- [j16]Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Jian Wu, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Jian Wu, Michael Zeng, Xiangzhan Yu, Furu Wei:
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing. IEEE J. Sel. Top. Signal Process. 16(6): 1505-1518 (2022) - [c99]Junyi Ao, Rui Wang, Long Zhou, Chengyi Wang, Shuo Ren, Yu Wu, Shujie Liu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei:
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing. ACL (1) 2022: 5723-5738 - [c98]Heming Wang, Yao Qian, Xiaofei Wang, Yiming Wang, Chengyi Wang, Shujie Liu, Takuya Yoshioka, Jinyu Li, DeLiang Wang:
Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction. ICASSP 2022: 6062-6066 - [c97]Zhengyang Chen, Sanyuan Chen, Yu Wu, Yao Qian, Chengyi Wang, Shujie Liu, Yanmin Qian, Michael Zeng:
Large-Scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification. ICASSP 2022: 6147-6151 - [c96]Sanyuan Chen, Yu Wu, Chengyi Wang, Zhengyang Chen, Zhuo Chen, Shujie Liu, Jian Wu, Yao Qian, Furu Wei, Jinyu Li, Xiangzhan Yu:
Unispeech-Sat: Universal Speech Representation Learning With Speaker Aware Pre-Training. ICASSP 2022: 6152-6156 - [c95]Chengyi Wang, Yu Wu, Sanyuan Chen, Shujie Liu, Jinyu Li, Yao Qian, Zhenglu Yang:
Improving Self-Supervised Learning for Speech Recognition with Intermediate Layer Supervision. ICASSP 2022: 7092-7096 - [c94]Yiming Wang, Jinyu Li, Heming Wang, Yao Qian, Chengyi Wang, Yu Wu:
Wav2vec-Switch: Contrastive Learning from Original-Noisy Speech Pairs for Robust Speech Recognition. ICASSP 2022: 7097-7101 - [c93]Wei Wang, Shuo Ren, Yao Qian, Shujie Liu, Yu Shi, Yanmin Qian, Michael Zeng:
Optimizing Alignment of Speech and Language Latent Spaces for End-To-End Speech Recognition and Understanding. ICASSP 2022: 7802-7806 - [c92]Junyi Ao, Ziqiang Zhang, Long Zhou, Shujie Liu, Haizhou Li, Tom Ko, Lirong Dai, Jinyu Li, Yao Qian, Furu Wei:
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data. INTERSPEECH 2022: 2658-2662 - [c91]Zhengyang Chen, Yao Qian, Bing Han, Yanmin Qian, Michael Zeng:
A Comprehensive Study on Self-Supervised Distillation for Speaker Representation Learning. SLT 2022: 599-604 - [i19]Junyi Ao, Ziqiang Zhang, Long Zhou, Shujie Liu, Haizhou Li, Tom Ko, Lirong Dai, Jinyu Li, Yao Qian, Furu Wei:
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data. CoRR abs/2203.17113 (2022) - [i18]Ziyi Yang, Yuwei Fang, Chenguang Zhu, Reid Pryzant, Dongdong Chen, Yu Shi, Yichong Xu, Yao Qian, Mei Gao, Yi-Ling Chen, Liyang Lu, Yujia Xie, Robert Gmyr, Noel Codella, Naoyuki Kanda, Bin Xiao, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang:
i-Code: An Integrative and Composable Multimodal Learning Framework. CoRR abs/2205.01818 (2022) - [i17]Mostafa Karimi, Changliang Liu, Ken'ichi Kumatani, Yao Qian, Tianyu Wu, Jian Wu:
Deploying self-supervised learning in the wild for hybrid automatic speech recognition. CoRR abs/2205.08598 (2022) - [i16]Gang Liu, Tianyan Zhou, Yong Zhao, Yu Wu, Zhuo Chen, Yao Qian, Jian Wu:
The Microsoft System for VoxCeleb Speaker Recognition Challenge 2022. CoRR abs/2209.11266 (2022) - [i15]Zhengyang Chen, Yao Qian, Bing Han, Yanmin Qian, Michael Zeng:
A comprehensive study on self-supervised distillation for speaker representation learning. CoRR abs/2210.15936 (2022) - 2021
- [c90]Yao Qian, Ximo Bian, Yu Shi, Naoyuki Kanda, Leo Shen, Zhen Xiao, Michael Zeng:
Speech-Language Pre-Training for End-to-End Spoken Language Understanding. ICASSP 2021: 7458-7462 - [c89]Chengyi Wang, Yu Wu, Yao Qian, Ken'ichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang:
UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data. ICML 2021: 10937-10947 - [c88]Ying Qin, Yao Qian, Anastassia Loukina, Patrick L. Lange, Abhinav Misra, Keelan Evanini, Tan Lee:
Automatic Detection of Word-Level Reading Errors in Non-native English Speech Based on ASR Output. ISCSLP 2021: 1-5 - [c87]Xinhao Wang, Keelan Evanini, Yao Qian, Matthew Mulholland:
Automated Scoring of Spontaneous Speech from Young Learners of English Using Transformers. SLT 2021: 705-712 - [i14]Chengyi Wang, Yu Wu, Yao Qian, Ken'ichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang:
UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data. CoRR abs/2101.07597 (2021) - [i13]Yao Qian, Ximo Bian, Yu Shi, Naoyuki Kanda, Leo Shen, Zhen Xiao, Michael Zeng:
Speech-language Pre-training for End-to-end Spoken Language Understanding. CoRR abs/2102.06283 (2021) - [i12]Yiming Wang, Jinyu Li, Heming Wang, Yao Qian, Chengyi Wang, Yu Wu:
Wav2vec-Switch: Contrastive Learning from Original-noisy Speech Pairs for Robust Speech Recognition. CoRR abs/2110.04934 (2021) - [i11]Sanyuan Chen, Yu Wu, Chengyi Wang, Zhengyang Chen, Zhuo Chen, Shujie Liu, Jian Wu, Yao Qian, Furu Wei, Jinyu Li, Xiangzhan Yu:
UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training. CoRR abs/2110.05752 (2021) - [i10]Zhengyang Chen, Sanyuan Chen, Yu Wu, Yao Qian, Chengyi Wang, Shujie Liu, Yanmin Qian, Michael Zeng:
Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification. CoRR abs/2110.05777 (2021) - [i9]Junyi Ao, Rui Wang, Long Zhou, Shujie Liu, Shuo Ren, Yu Wu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei:
SpeechT5: Unified-Modal Encoder-Decoder Pre-training for Spoken Language Processing. CoRR abs/2110.07205 (2021) - [i8]Rimita Lahiri, Ken'ichi Kumatani, Eric Sun, Yao Qian:
Multilingual Speech Recognition using Knowledge Transfer across Learning Processes. CoRR abs/2110.07909 (2021) - [i7]Wei Wang, Shuo Ren, Yao Qian, Shujie Liu, Yu Shi, Yanmin Qian, Michael Zeng:
Optimizing Alignment of Speech and Language Latent Spaces for End-to-End Speech Recognition and Understanding. CoRR abs/2110.12138 (2021) - [i6]Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Jian Wu, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Jian Wu, Michael Zeng, Furu Wei:
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing. CoRR abs/2110.13900 (2021) - [i5]Heming Wang, Yao Qian, Xiaofei Wang, Yiming Wang, Chengyi Wang, Shujie Liu, Takuya Yoshioka, Jinyu Li, DeLiang Wang:
Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction. CoRR abs/2110.15430 (2021) - [i4]Chengyi Wang, Yu Wu, Sanyuan Chen, Shujie Liu, Jinyu Li, Yao Qian, Zhenglu Yang:
Self-Supervised Learning for speech recognition with Intermediate layer supervision. CoRR abs/2112.08778 (2021) - 2020
- [j15]Yao Qian, Rutuja Ubale, Patrick L. Lange, Keelan Evanini, Vikram Ramanarayanan, Frank K. Soong:
Spoken Language Understanding of Human-Machine Conversations for Language Learning Applications. J. Signal Process. Syst. 92(8): 805-817 (2020) - [c86]Yao Qian, Yu Shi, Michael Zeng:
Discriminative Transfer Learning for Optimizing ASR and Semantic Labeling in Task-Oriented Spoken Dialog. INTERSPEECH 2020: 3915-3919
2010 – 2019
- 2019
- [j14]Peng Cao, Yao Qian, Pan Xue, Danzhu Lu, Jie He, Zhiliang Hong:
A Bipolar-Input Thermoelectric Energy-Harvesting Interface With Boost/Flyback Hybrid Converter and On-Chip Cold Starter. IEEE J. Solid State Circuits 54(12): 3362-3374 (2019) - [c85]Rutuja Ubale, Vikram Ramanarayanan, Yao Qian, Keelan Evanini, Chee Wee Leong, Chong Min Lee:
Native Language Identification from Raw Waveforms Using Deep Convolutional Neural Networks with Attentive Pooling. ASRU 2019: 403-410 - [c84]Xinhao Wang, Keelan Evanini, Yao Qian, Klaus Zechner:
Using Very Deep Convolutional Neural Networks to Automatically Detect Plagiarized Spoken Responses. ASRU 2019: 764-771 - [c83]Xinhao Wang, Keelan Evanini, Matthew Mulholland, Yao Qian, James V. Bruno:
Application of an Automatic Plagiarism Detection System in a Large-scale Assessment of English Speaking Proficiency. BEA@ACL 2019: 435-443 - [c82]Yao Qian, Patrick L. Lange, Keelan Evanini, Robert A. Pugh, Rutuja Ubale, Matthew Mulholland, Xinhao Wang:
Neural Approaches to Automated Speech Scoring of Monologue and Dialogue Responses. ICASSP 2019: 8112-8116 - [c81]Chee Wee Leong, Katrina Roohr, Vikram Ramanarayanan, Michelle P. Martin-Raugh, Harrison Kell, Rutuja Ubale, Yao Qian, Zydrune Mladineo, Laura McCulla:
Are Humans Biased in Assessment of Video Interviews? ICMI (Adjunct) 2019: 9:1-9:5 - [c80]Anastassia Loukina, Beata Beigman Klebanov, Patrick L. Lange, Yao Qian, Binod Gyawali, Nitin Madnani, Abhinav Misra, Klaus Zechner, Zuowei Wang, John Sabatini:
Automated Estimation of Oral Reading Fluency During Summer Camp e-Book Reading with MyTurnToRead. INTERSPEECH 2019: 21-25 - [c79]Xinhao Wang, Su-Youn Yoon, Keelan Evanini, Klaus Zechner, Yao Qian:
Automatic Detection of Off-Topic Spoken Responses Using Very Deep Convolutional Neural Networks. INTERSPEECH 2019: 4200-4204 - [c78]Peng Cao, Yao Qian, Pan Xue, Danzhu Lu, Jie He, Zhiliang Hong:
An 84% Peak Efficiency Bipolar-Input Boost/Flyback Hybrid Converter With MPPT and on-Chip Cold Starter for Thermoelectric Energy Harvesting. ISSCC 2019: 420-422 - [c77]Vikram Ramanarayanan, Matthew Mulholland, Yao Qian:
Scoring Interactional Aspects of Human-Machine Dialog for Language Learning and Assessment using Text Features. SIGdial 2019: 103-109 - [i3]Chee Wee Leong, Katrina Roohr, Vikram Ramanarayanan, Michelle P. Martin-Raugh, Harrison Kell, Rutuja Ubale, Yao Qian, Zydrune Mladineo, Laura McCulla:
To Trust, or Not to Trust? A Study of Human Bias in Automated Video Interview Assessments. CoRR abs/1911.13248 (2019) - 2018
- [j13]Yao Qian, Danzhu Lu, Jie He, Zhiliang Hong:
An On-Chip Transformer-Based Self-Startup Hybrid SIDITO Converter for Thermoelectric Energy Harvesting. IEEE Trans. Circuits Syst. II Express Briefs 65-II(11): 1673-1677 (2018) - [c76]Lei Chen, Jidong Tao, Shabnam Ghaffarzadegan, Yao Qian:
End-to-End Neural Network Based Automated Speech Scoring. ICASSP 2018: 6234-6238 - [c75]Keelan Evanini, Matthew Mulholland, Rutuja Ubale, Yao Qian, Robert A. Pugh, Vikram Ramanarayanan, Aoife Cahill:
Improvements to an Automated Content Scoring System for Spoken CALL Responses: the ETS Submission to the Second Spoken CALL Shared Task. INTERSPEECH 2018: 2379-2383 - [c74]Zhaoheng Ni, Rutuja Ubale, Yao Qian, Michael I. Mandel, Su-Youn Yoon, Abhinav Misra, David Suendermann-Oeft:
Unusable Spoken Response Detection with BLSTM Neural Networks. ISCSLP 2018: 255-259 - [c73]Yao Qian, Rutuja Ubale, Patrick L. Lange, Keelan Evanini, Frank K. Soong:
From Speech Signals to Semantics - Tagging Performance at Acoustic, Phonetic and Word Levels. ISCSLP 2018: 280-284 - [c72]Vikram Ramanarayanan, Robert Pugh, Yao Qian, David Suendermann-Oeft:
Automatic Turn-Level Language Identification for Code-Switched Spanish-English Dialog. IWSDS 2018: 51-61 - [c71]Rutuja Ubale, Yao Qian, Keelan Evanini:
Exploring End-To-End Attention-Based Neural Networks For Native Language Identification. SLT 2018: 84-91 - [c70]Yao Qian, Rutuja Ubale, Matthew Mulholland, Keelan Evanini, Xinhao Wang:
A Prompt-Aware Neural Network Approach to Content-Based Scoring of Non-Native Spontaneous Speech. SLT 2018: 979-986 - 2017
- [j12]Yao Qian, Hongguang Zhang, Yanqin Chen, Yajie Qin, Danzhu Lu, Zhiliang Hong:
A SIDIDO DC-DC Converter With Dual-Mode and Programmable-Capacitor-Array MPPT Control for Thermoelectric Energy Harvesting. IEEE Trans. Circuits Syst. II Express Briefs 64-II(8): 952-956 (2017) - [c69]Yao Qian, Rutuja Ubale, Vikram Ramanarayanan, Patrick L. Lange, David Suendermann-Oeft, Keelan Evanini, Eugene Tsuprun:
Exploring ASR-free end-to-end modeling to improve spoken language understanding in a cloud-based dialog system. ASRU 2017: 569-576 - [c68]Yao Qian, Keelan Evanini, Patrick L. Lange, Robert A. Pugh, Rutuja Ubale, Frank K. Soong:
Improving native language (L1) identifation with better VAD and TDNN trained separately on native and non-native English corpora. ASRU 2017: 606-613 - [c67]Shervin Malmasi, Keelan Evanini, Aoife Cahill, Joel R. Tetreault, Robert A. Pugh, Christopher Hamill, Diane Napolitano, Yao Qian:
A Report on the 2017 Native Language Identification Shared Task. BEA@EMNLP 2017: 62-75 - [c66]Yao Qian, Keelan Evanini, Xinhao Wang, Chong Min Lee, Matthew Mulholland:
Bidirectional LSTM-RNN for Improving Automated Assessment of Non-Native Children's Speech. INTERSPEECH 2017: 1417-1421 - [c65]Yao Qian, Keelan Evanini, Xinhao Wang, David Suendermann-Oeft, Robert A. Pugh, Patrick L. Lange, Hillary R. Molloy, Frank K. Soong:
Improving Sub-Phone Modeling for Better Native Language Identification with Non-Native English Speech. INTERSPEECH 2017: 2586-2590 - [c64]Keelan Evanini, Matthew Mulholland, Eugene Tsuprun, Yao Qian:
Using an Automated Content Scoring Engine for Spoken CALL Responses: The ETS submission for the Spoken CALL Challenge. SLaTE 2017: 97-102 - [c63]Anastassia Loukina, Beata Beigman Klebanov, Patrick L. Lange, Binod Gyawali, Yao Qian:
Developing speech processing technologies for shared book reading with a computer. WOCCI 2017: 46-51 - 2016
- [j11]Xiang Yin, Ming Lei, Yao Qian, Frank K. Soong, Lei He, Zhen-Hua Ling, Li-Rong Dai:
Modeling F0 trajectories in hierarchically structured deep neural networks. Speech Commun. 76: 82-92 (2016) - [c62]Yuchen Fan, Yao Qian, Frank K. Soong, Lei He:
Unsupervised speaker adaptation for DNN-based TTS synthesis. ICASSP 2016: 5135-5139 - [c61]Yuchen Fan, Yao Qian, Frank K. Soong, Lei He:
Speaker and language factorization in DNN-based TTS synthesis. ICASSP 2016: 5540-5544 - [c60]Matthew Mulholland, Melissa Lopez, Keelan Evanini, Anastassia Loukina, Yao Qian:
A comparison of ASR and human errors for transcription of non-native spontaneous speech. ICASSP 2016: 5855-5859 - [c59]Yao Qian, Xinhao Wang, Keelan Evanini, David Suendermann-Oeft:
Self-Adaptive DNN for Improving Spoken Language Proficiency Assessment. INTERSPEECH 2016: 3122-3126 - [c58]Yao Qian, Jidong Tao, David Suendermann-Oeft, Keelan Evanini, Alexei V. Ivanov, Vikram Ramanarayanan:
Noise and Metadata Sensitive Bottleneck Features for Improving Speaker Recognition with Non-Native Speech Input. INTERSPEECH 2016: 3648-3652 - [c57]Peilu Wang, Yao Qian, Frank K. Soong, Lei He, Hai Zhao:
Learning Distributed Word Representations For Bidirectional LSTM Recurrent Neural Network. HLT-NAACL 2016: 527-533 - [c56]Yao Qian, Xinhao Wang, Keelan Evanini, David Suendermann-Oeft:
Improving DNN-Based Automatic Recognition of Non-native Children Speech with Adult Speech. WOCCI 2016: 40-44 - 2015
- [j10]Wenping Hu, Yao Qian, Frank K. Soong, Yong Wang:
Improved mispronunciation detection with deep neural network trained acoustic models and transfer learning based logistic regression classifiers. Speech Commun. 67: 154-166 (2015) - [c55]Zhou Yu, Vikram Ramanarayanan, David Suendermann-Oeft, Xinhao Wang, Klaus Zechner, Lei Chen, Jidong Tao, Aliaksei Ivanou, Yao Qian:
Using bidirectional lstm recurrent neural networks to learn high-level abstractions of sequential features for automated scoring of non-native spontaneous speech. ASRU 2015: 338-345 - [c54]Yuchen Fan, Yao Qian, Frank K. Soong, Lei He:
Multi-speaker modeling and speaker adaptation for DNN-based TTS synthesis. ICASSP 2015: 4475-4479 - [c53]Peilu Wang, Yao Qian, Frank K. Soong, Lei He, Hai Zhao:
Word embedding for recurrent neural network based TTS synthesis. ICASSP 2015: 4879-4883 - [c52]Yuchen Fan, Yao Qian, Frank K. Soong, Lei He:
Sequence generation error (SGE) minimization based deep neural networks training for text-to-speech synthesis. INTERSPEECH 2015: 864-868 - [c51]Wenping Hu, Yao Qian, Frank K. Soong:
An improved DNN-based approach to mispronunciation detection and diagnosis of L2 learners' speech. SLaTE 2015: 71-76 - [i2]Peilu Wang, Yao Qian, Frank K. Soong, Lei He, Hai Zhao:
Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Recurrent Neural Network. CoRR abs/1510.06168 (2015) - [i1]Peilu Wang, Yao Qian, Frank K. Soong, Lei He, Hai Zhao:
A Unified Tagging Solution: Bidirectional LSTM Recurrent Neural Network with Word Embedding. CoRR abs/1511.00215 (2015) - 2014
- [j9]Weixun Gao, Qiying Cao, Yao Qian:
Cross-Dialectal Voice Conversion with Neural Networks. IEICE Trans. Inf. Syst. 97-D(11): 2872-2880 (2014) - [c50]Wenping Hu, Yao Qian, Frank K. Soong:
A DNN-based acoustic modeling of tonal language and its application to Mandarin pronunciation training. ICASSP 2014: 3206-3210 - [c49]Yao Qian, Yuchen Fan, Wenping Hu, Frank K. Soong:
On the training aspects of Deep Neural Network (DNN) for parametric TTS synthesis. ICASSP 2014: 3829-3833 - [c48]Yuchen Fan, Yao Qian, Feng-Long Xie, Frank K. Soong:
TTS synthesis with bidirectional LSTM based recurrent neural networks. INTERSPEECH 2014: 1964-1968 - [c47]Xiang Yin, Ming Lei, Yao Qian, Frank K. Soong, Lei He, Zhen-Hua Ling, Li-Rong Dai:
Modeling DCT parameterized F0 trajectory at intonation phrase level with DNN or decision tree. INTERSPEECH 2014: 2273-2277 - [c46]Feng-Long Xie, Yao Qian, Yuchen Fan, Frank K. Soong, Haifeng Li:
Sequence error (SE) minimization training of neural network for voice conversion. INTERSPEECH 2014: 2283-2287 - [c45]Feng-Long Xie, Yao Qian, Frank K. Soong, Haifeng Li:
Pitch transformation in neural network based voice conversion. ISCSLP 2014: 197-200 - [c44]Wenping Hu, Yao Qian, Frank K. Soong:
A new Neural Network based logistic regression classifier for improving mispronunciation detection of L2 language learners. ISCSLP 2014: 245-249 - [c43]Danzhu Lu, Yao Qian, Zhiliang Hong:
4.3 An 87%-peak-efficiency DVS-capable single-inductor 4-output DC-DC buck converter with ripple-based adaptive off-time control. ISSCC 2014: 82-83 - [c42]Changqin Quan, Yao Qian, Fuji Ren:
Dynamic facial expression recognition based on K-order emotional intensity model. ROBIO 2014: 1164-1168 - 2013
- [j8]Yao Qian, Frank K. Soong, Zhi-Jie Yan:
A Unified Trajectory Tiling Approach to High Quality Speech Rendering. IEEE Trans. Speech Audio Process. 21(2): 280-290 (2013) - [c41]Qiuli Li, Yao Qian, Danzhu Lu, Zhiliang Hong:
VCCS controlled LDO with small on-chip capacitor. ASICON 2013: 1-4 - [c40]Yao Qian, Frank K. Soong, Xiaobo Zhou, Yundi Qian, Xiaotian Zhang:
A fast table lookup based, statistical model driven non-uniform unit selection TTS. ICASSP 2013: 7957-7961 - [c39]Wenping Hu, Yao Qian, Frank K. Soong:
A new DNN-based high quality pronunciation evaluation for computer-aided language learning (CALL). INTERSPEECH 2013: 1886-1890 - [c38]Yao Qian, Fuji Ren, Changqin Quan:
A new preprocessing algorithm and local binary pattern based facial expression recognition. SII 2013: 239-244 - 2012
- [j7]Lijuan Wang, Yao Qian, Matthew R. Scott, Gang Chen, Frank K. Soong:
Computer-Assisted Audiovisual Language Learning. Computer 45(6): 38-47 (2012) - [c37]Ji He, Yao Qian, Frank K. Soong, Sheng Zhao:
Turning a Monolingual Speaker into Multilingual for a Mixed-language TTS. INTERSPEECH 2012: 963-966 - [c36]Yao Qian, Frank K. Soong:
A unified trajectory tiling approach to high quality TTS and cross-lingual voice transformation. ISCSLP 2012: 165-169 - [c35]Xiaotian Zhang, Yao Qian, Hai Zhao, Frank K. Soong:
Break index labeling of mandarin text via syntactic-to-prosodic tree mapping. ISCSLP 2012: 256-260 - [c34]Wenping Hu, Yao Qian, Frank K. Soong:
Pitch accent detection and prediction with DCT features and CRF model. ISCSLP 2012: 266-270 - [c33]Darren Edge, Kai-Yin Cheng, Michael Whitney, Yao Qian, Zhijie Yan, Frank K. Soong:
Tip tap tones: mobile microtraining of mandarin sounds. Mobile HCI (Companion) 2012: 215-216 - [c32]Darren Edge, Kai-Yin Cheng, Michael Whitney, Yao Qian, Zhijie Yan, Frank K. Soong:
Tip tap tones: mobile microtraining of mandarin sounds. Mobile HCI 2012: 427-430 - 2011
- [j6]Yao Qian, Zhizheng Wu, Boyang Gao, Frank K. Soong:
Improved Prosody Generation by Maximizing Joint Probability of State and Longer Units. IEEE Trans. Speech Audio Process. 19(6): 1702-1710 (2011) - [c31]Aki Kunikoshi, Yao Qian, Frank K. Soong, Nobuaki Minematsu:
Improved F0 modeling and generation in voice conversion. ICASSP 2011: 4568-4571 - [c30]Yao Qian, Ji Xu, Frank K. Soong:
A frame mapping based HMM approach to cross-lingual voice transformation. ICASSP 2011: 5120-5123 - [c29]Bo Peng, Yao Qian, Frank K. Soong, Bo Zhang:
A New Phonetic Candidate Generator for Improving Search Query Efficiency. INTERSPEECH 2011: 1117-1120 - 2010
- [c28]Yao Qian, Zhi-Jie Yan, Yi-Jian Wu, Frank K. Soong, Guoliang Zhang, Lijuan Wang:
An HMM Trajectory Tiling (HTT) Approach to High Quality TTS - Microsoft Entry to Blizzard Challenge 2010. Blizzard Challenge 2010 - [c27]Qingqing Zhang, Frank K. Soong, Yao Qian, Zhijie Yan, Jielin Pan, Yonghong Yan:
Improved modeling for F0 generation and V/U decision in HMM-based TTS. ICASSP 2010: 4606-4609 - [c26]Zhi-Jie Yan, Yao Qian, Frank K. Soong:
RIch-context Unit Selection (RUS) approach to high quality TTS. ICASSP 2010: 4798-4801 - [c25]Yao Qian, Zhi-Jie Yan, Yi-Jian Wu, Frank K. Soong, Xin Zhuang, Shengyi Kong:
An HMM trajectory tiling (HTT) approach to high quality TTS. INTERSPEECH 2010: 422-425 - [c24]Xin Zhuang, Yao Qian, Frank K. Soong, Yi-Jian Wu, Bo Zhang:
Formant-based frequency warping for improving speaker adaptation in HMM TTS. INTERSPEECH 2010: 817-820 - [c23]Yao Qian, Zhizheng Wu, Xuezhe Ma, Frank K. Soong:
Automatic prosody prediction and detection with Conditional Random Field (CRF) models. ISCSLP 2010: 135-138
2000 – 2009
- 2009
- [j5]Yao Qian, Frank K. Soong:
A Multi-Space Distribution (MSD) and two-stream tone modeling approach to Mandarin speech recognition. Speech Commun. 51(12): 1169-1179 (2009) - [j4]Yao Qian, Hui Liang, Frank K. Soong:
A Cross-Language State Sharing and Mapping Approach to Bilingual (Mandarin-English) TTS. IEEE Trans. Speech Audio Process. 17(6): 1231-1239 (2009) - [c22]Yao Qian, Zhizheng Wu, Frank K. Soong:
Improved prosody generation by maximizing joint likelihood of state and longer units. ICASSP 2009: 3781-3784 - [c21]Yining Chen, Yang Jiao, Yao Qian, Frank K. Soong:
State mapping for cross-language speaker adaptation in TTS. ICASSP 2009: 4273-4276 - [c20]Yao Qian, Frank K. Soong, Miaomiao Wang, Zhizheng Wu:
A minimum v/u error approach to F0 generation in HMM-based TTS. INTERSPEECH 2009: 408-411 - [c19]Zhi-Jie Yan, Yao Qian, Frank K. Soong:
Rich context modeling for high quality HMM-based TTS. INTERSPEECH 2009: 1755-1758 - 2008
- [j3]Yao Qian, Frank K. Soong, Tan Lee:
Tone-enhanced generalized character posterior probability (GCPP) for Cantonese LVCSR. Comput. Speech Lang. 22(4): 360-373 (2008) - [c18]Hui Liang, Yao Qian, Frank K. Soong, Gongshen Liu:
A cross-language state mapping approach to bilingual (Mandarin-English) TTS. ICASSP 2008: 4641-4644 - [c17]Yu Ting Yeung, Yao Qian, Tan Lee, Frank K. Soong:
Prosody for Mandarin speech recognition: a comparative study of read and spontaneous speech. INTERSPEECH 2008: 1133-1136 - [c16]Yao Qian, Hui Liang, Frank K. Soong:
Generating natural F0 trajectory with additive trees. INTERSPEECH 2008: 2126-2129 - [c15]Boyang Gao, Yao Qian, Zhizheng Wu, Frank K. Soong:
Duration refinement by jointly optimizing state and longer unit likelihood. INTERSPEECH 2008: 2266-2269 - [c14]Lijuan Wang, Xiaojun Qian, Lei Ma, Yao Qian, Yining Chen, Frank K. Soong:
A real-time text to audio-visual speech synthesis system. INTERSPEECH 2008: 2338-2341 - [c13]Yao Qian, Houwei Cao, Frank K. Soong:
HMM-Based Mixed-Language (Mandarin-English) Speech Synthesis. ISCSLP 2008: 13-16 - [c12]Zhizheng Wu, Yao Qian, Frank K. Soong, Bo Zhang:
Modeling and Generating Tone Contour with Phrase Intonation for Mandarin Chinese Speech. ISCSLP 2008: 121-124 - 2007
- [c11]Sheng Qiang, Yao Qian, Frank K. Soong, Congfu Xu:
Robust F0 modeling for Mandarin speech recognition in noise. INTERSPEECH 2007: 1801-1804 - [c10]Hui Liang, Yao Qian, Frank K. Soong:
An HMM-based bilingual (Mandarin-English) TTS. SSW 2007: 137-142 - 2006
- [c9]Yao Qian, Frank K. Soong, Tan Lee:
Tone-Enhanced Generalized Character Posterior Probability (GCPP) for Cantonese LVCSR. ICASSP (1) 2006: 133-136 - [c8]Huanliang Wang, Yao Qian, Frank K. Soong, Jian-Lai Zhou, Jiqing Han:
A multi-space distribution (MSD) approach to speech recognition of tonal languages. INTERSPEECH 2006 - [c7]Yao Qian, Frank K. Soong, Yining Chen, Min Chu:
An HMM-Based Mandarin Chinese Text-To-Speech System. ISCSLP (Selected Papers) 2006: 223-232 - [c6]Huanliang Wang, Yao Qian, Frank K. Soong, Jian-Lai Zhou, Jiqing Han:
Improved Mandarin Speech Recognition by Lattice Rescoring with Enhanced Tone Models. ISCSLP (Selected Papers) 2006: 445-453 - 2004
- [j2]Yujia Li, Tan Lee, Yao Qian:
Analysis and modeling of F0 contours for cantonese text-to-speech. ACM Trans. Asian Lang. Inf. Process. 3(3): 169-180 (2004) - [c5]Yao Qian, Tan Lee, Frank K. Soong:
Tone information as a confidence measure for improving Cantonese LVCSR. INTERSPEECH 2004: 1965-1968 - 2003
- [c4]Yao Qian, Tan Lee, Yujia Li:
Overlapped di-tone modeling for tone recognition in continuous Cantonese speech. INTERSPEECH 2003: 1845-1848 - 2002
- [c3]Yao Qian, Fang Chen:
Assigning phrase accent to Chinese Text-to-Speech system. ICASSP 2002: 485-488 - [c2]Yujia Li, Tan Lee, Yao Qian:
Acoustical F0 analysis of continuous cantonese speech. ISCSLP 2002 - 2001
- [j1]Min Chu, Yao Qian:
Locating Boundaries for Prosodic Constituents in Unrestricted Mandarin Texts. Int. J. Comput. Linguistics Chin. Lang. Process. 6(1) (2001) - [c1]Yao Qian, Min Chu, Hu Peng:
Segmenting unrestricted Chinese text into prosodic words instead of lexical words. ICASSP 2001: 825-828
last updated on 2024-10-10 21:18 CEST by the dblp team
all metadata released as open data under CC0 1.0 license