Dong Yu 0001
Person information
- unicode name: 俞栋
- affiliation: Tencent AI Lab, China
- affiliation (1998 - 2017): Microsoft Research, Redmond, WA, USA
- affiliation (PhD): University of Idaho, Moscow, ID, USA
Other persons with the same name
- Dong Yu — disambiguation page
- Dong Yu 0002 — Xi'an Jiaotong University, Institution of Advanced Manufacturing and Technology, China
- Dong Yu 0003 — Beijing Language and Culture University, Beijing, China
- Dong Yu 0004 — University of Chinese Academy of Sciences, Beijing, China (and 2 more)
2020 – today
- 2024
- [j69]Hao Zhang, Yixuan Zhang, Meng Yu, Dong Yu:
Enhanced Acoustic Howling Suppression via Hybrid Kalman Filter and Deep Learning Models. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2828-2840 (2024)
- [c319]Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Dong Yu, Fei Liu:
SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs. ACL (1) 2024: 267-278
- [c318]Yongxin Zhu, Dan Su, Liqiang He, Linli Xu, Dong Yu:
Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer. ACL (1) 2024: 1764-1775
- [c317]Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Yong Dai, Hongming Zhang, Zhenzhong Lan, Dong Yu:
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models. ACL (1) 2024: 6864-6890
- [c316]Ante Wang, Linfeng Song, Baolin Peng, Lifeng Jin, Ye Tian, Haitao Mi, Jinsong Su, Dong Yu:
Improving LLM Generations via Fine-Grained Self-Endorsement. ACL (Findings) 2024: 8424-8436
- [c315]Xinran Zhao, Hongming Zhang, Xiaoman Pan, Wenlin Yao, Dong Yu, Tongshuang Wu, Jianshu Chen:
Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models. ACL (Findings) 2024: 8702-8718
- [c314]Rongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Jinchuan Tian, Zhenhui Ye, Luping Liu, Zehan Wang, Ziyue Jiang, Xuankai Chang, Jiatong Shi, Chao Weng, Zhou Zhao, Dong Yu:
Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners. ACL (1) 2024: 10929-10942
- [c313]Yinya Huang, Ruixin Hong, Hongming Zhang, Wei Shao, Zhicheng Yang, Dong Yu, Changshui Zhang, Xiaodan Liang, Linqi Song:
CLOMO: Counterfactual Logical Modification with Large Language Models. ACL (1) 2024: 11012-11034
- [c312]Duzhen Zhang, Yahan Yu, Jiahua Dong, Chenxing Li, Dan Su, Chenhui Chu, Dong Yu:
MM-LLMs: Recent Advances in MultiModal Large Language Models. ACL (Findings) 2024: 12401-12430
- [c311]Yiwei Qin, Kaiqiang Song, Yebowen Hu, Wenlin Yao, Sangwoo Cho, Xiaoyang Wang, Xuansheng Wu, Fei Liu, Pengfei Liu, Dong Yu:
InFoBench: Evaluating Instruction Following Ability in Large Language Models. ACL (Findings) 2024: 13025-13048
- [c310]Xiangci Li, Linfeng Song, Lifeng Jin, Haitao Mi, Jessica Ouyang, Dong Yu:
A Knowledge Plug-and-Play Test Bed for Open-domain Dialogue Generation. LREC/COLING 2024: 666-676
- [c309]Zhenwen Liang, Dian Yu, Xiaoman Pan, Wenlin Yao, Qingkai Zeng, Xiangliang Zhang, Dong Yu:
MinT: Boosting Generalization in Mathematical Reasoning via Multi-view Fine-tuning. LREC/COLING 2024: 11307-11318
- [c308]Mian Zhang, Lifeng Jin, Linfeng Song, Haitao Mi, Dong Yu:
Inconsistent dialogue responses and how to recover from them. EACL (Findings) 2024: 220-230
- [c307]Haoyu Wang, Hongming Zhang, Kaiqiang Song, Dong Yu, Dan Roth:
Event Semantic Classification in Context. EACL (Findings) 2024: 1395-1407
- [c306]Ruixin Hong, Hongming Zhang, Xiaoman Pan, Dong Yu, Changshui Zhang:
Abstraction-of-Thought Makes Language Models Better Reasoners. EMNLP (Findings) 2024: 1993-2027
- [c305]Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Wenlin Yao, Hassan Foroosh, Dong Yu, Fei Liu:
When Reasoning Meets Information Aggregation: A Case Study with Sports Narratives. EMNLP 2024: 4293-4308
- [c304]Jiaao Chen, Xiaoman Pan, Dian Yu, Kaiqiang Song, Xiaoyang Wang, Dong Yu, Jianshu Chen:
Skills-in-Context: Unlocking Compositionality in Large Language Models. EMNLP (Findings) 2024: 13838-13890
- [c303]Wenhao Yu, Hongming Zhang, Xiaoman Pan, Peixin Cao, Kaixin Ma, Jian Li, Hongwei Wang, Dong Yu:
Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models. EMNLP 2024: 14672-14685
- [c302]Zhihan Zhang, Tao Ge, Zhenwen Liang, Wenhao Yu, Dian Yu, Mengzhao Jia, Dong Yu, Meng Jiang:
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning. EMNLP 2024: 14720-14738
- [c301]Tong Chen, Hongwei Wang, Sihao Chen, Wenhao Yu, Kaixin Ma, Xinran Zhao, Hongming Zhang, Dong Yu:
Dense X Retrieval: What Retrieval Granularity Should We Use? EMNLP 2024: 15159-15177
- [c300]Zhongweiyang Xu, Yong Xu, Vinay Kothapally, Heming Wang, Muqiao Yang, Dong Yu:
SPATIALCODEC: Neural Spatial Speech Coding. ICASSP 2024: 1131-1135
- [c299]Muqiao Yang, Chunlei Zhang, Yong Xu, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu:
uSee: Unified Speech Enhancement And Editing with Conditional Diffusion Models. ICASSP 2024: 7125-7129
- [c298]Zili Huang, Yiwen Shao, Shi-Xiong Zhang, Dong Yu:
UniX-Encoder: A Universal X-Channel Speech Encoder for AD-HOC Microphone Array Speech Processing. ICASSP 2024: 11991-11995
- [c297]Lingfeng Shen, Sihao Chen, Linfeng Song, Lifeng Jin, Baolin Peng, Haitao Mi, Daniel Khashabi, Dong Yu:
The Trickle-down Impact of Reward Inconsistency on RLHF. ICLR 2024
- [c296]Rui Yang, Xiaoman Pan, Feng Luo, Shuang Qiu, Han Zhong, Dong Yu, Jianshu Chen:
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment. ICML 2024
- [c295]Manjie Xu, Chenxing Li, Duzhen Zhang, Dan Su, Wei Liang, Dong Yu:
Prompt-guided Precise Audio Editing with Diffusion Models. ICML 2024
- [c294]Ruixin Hong, Hongming Zhang, Xinyu Pang, Dong Yu, Changshui Zhang:
A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning. NAACL-HLT 2024: 900-925
- [c293]Fuxiao Liu, Xiaoyang Wang, Wenlin Yao, Jianshu Chen, Kaiqiang Song, Sangwoo Cho, Yaser Yacoob, Dong Yu:
MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning. NAACL-HLT 2024: 1287-1310
- [c292]Sihao Chen, Hongming Zhang, Tong Chen, Ben Zhou, Wenhao Yu, Dian Yu, Baolin Peng, Hongwei Wang, Dan Roth, Dong Yu:
Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations. NAACL-HLT 2024: 1596-1609
- [c291]Xuansheng Wu, Wenlin Yao, Jianshu Chen, Xiaoman Pan, Xiaoyang Wang, Ninghao Liu, Dong Yu:
From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning. NAACL-HLT 2024: 2341-2369
- [c290]Yuanyuan Lei, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Ruihong Huang, Dong Yu:
Polarity Calibration for Opinion Summarization. NAACL-HLT 2024: 5211-5224
- [i236]Yiwei Qin, Kaiqiang Song, Yebowen Hu, Wenlin Yao, Sangwoo Cho, Xiaoyang Wang, Xuansheng Wu, Fei Liu, Pengfei Liu, Dong Yu:
InFoBench: Evaluating Instruction Following Ability in Large Language Models. CoRR abs/2401.03601 (2024)
- [i235]Mian Zhang, Lifeng Jin, Linfeng Song, Haitao Mi, Dong Yu:
Inconsistent dialogue responses and how to recover from them. CoRR abs/2401.10353 (2024)
- [i234]Duzhen Zhang, Yahan Yu, Chenxing Li, Jiahua Dong, Dan Su, Chenhui Chu, Dong Yu:
MM-LLMs: Recent Advances in MultiModal Large Language Models. CoRR abs/2401.13601 (2024)
- [i233]Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Yong Dai, Hongming Zhang, Zhenzhong Lan, Dong Yu:
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models. CoRR abs/2401.13919 (2024)
- [i232]Sangwoo Cho, Kaiqiang Song, Chao Zhao, Xiaoyang Wang, Dong Yu:
SPECTRUM: Speaker-Enhanced Pre-Training for Long Dialogue Summarization. CoRR abs/2401.17597 (2024)
- [i231]Rui Yang, Xiaoman Pan, Feng Luo, Shuang Qiu, Han Zhong, Dong Yu, Jianshu Chen:
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment. CoRR abs/2402.10207 (2024)
- [i230]Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Dong Yu, Fei Liu:
SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs. CoRR abs/2402.10979 (2024)
- [i229]Ante Wang, Linfeng Song, Baolin Peng, Ye Tian, Lifeng Jin, Haitao Mi, Jinsong Su, Dong Yu:
Fine-Grained Self-Endorsement Improves Factuality and Reasoning. CoRR abs/2402.15631 (2024)
- [i228]Xinran Zhao, Hongming Zhang, Xiaoman Pan, Wenlin Yao, Dong Yu, Tongshuang Wu, Jianshu Chen:
Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models. CoRR abs/2402.17124 (2024)
- [i227]Lifeng Jin, Baolin Peng, Linfeng Song, Haitao Mi, Ye Tian, Dong Yu:
Collaborative decoding of critical tokens for boosting factuality of large language models. CoRR abs/2402.17982 (2024)
- [i226]Xiangci Li, Linfeng Song, Lifeng Jin, Haitao Mi, Jessica Ouyang, Dong Yu:
A Knowledge Plug-and-Play Test Bed for Open-domain Dialogue Generation. CoRR abs/2403.03496 (2024)
- [i225]Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Dong Yu, Fei Liu:
Can Large Language Models do Analytical Reasoning? CoRR abs/2403.04031 (2024)
- [i224]Ante Wang, Linfeng Song, Ye Tian, Baolin Peng, Lifeng Jin, Haitao Mi, Jinsong Su, Dong Yu:
Self-Consistency Boosts Calibration for Math Reasoning. CoRR abs/2403.09849 (2024)
- [i223]Ben Zhou, Hongming Zhang, Sihao Chen, Dian Yu, Hongwei Wang, Baolin Peng, Dan Roth, Dong Yu:
Conceptual and Unbiased Reasoning in Language Models. CoRR abs/2404.00205 (2024)
- [i222]Yuanyuan Lei, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Ruihong Huang, Dong Yu:
Polarity Calibration for Opinion Summarization. CoRR abs/2404.01706 (2024)
- [i221]Souvik Das, Lifeng Jin, Linfeng Song, Haitao Mi, Baolin Peng, Dong Yu:
Entropy Guided Extrapolative Decoding to Improve Factuality in Large Language Models. CoRR abs/2404.09338 (2024)
- [i220]Ye Tian, Baolin Peng, Linfeng Song, Lifeng Jin, Dian Yu, Haitao Mi, Dong Yu:
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing. CoRR abs/2404.12253 (2024)
- [i219]Zhenwen Liang, Dian Yu, Wenhao Yu, Wenlin Yao, Zhihan Zhang, Xiangliang Zhang, Dong Yu:
MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions. CoRR abs/2405.19444 (2024)
- [i218]Yongxin Zhu, Dan Su, Liqiang He, Linli Xu, Dong Yu:
Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer. CoRR abs/2406.00976 (2024)
- [i217]Manjie Xu, Chenxing Li, Duzhen Zhang, Dan Su, Wei Liang, Dong Yu:
Prompt-guided Precise Audio Editing with Diffusion Models. CoRR abs/2406.04350 (2024)
- [i216]Zhihan Zhang, Zhenwen Liang, Wenhao Yu, Dian Yu, Mengzhao Jia, Dong Yu, Meng Jiang:
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning. CoRR abs/2406.12050 (2024)
- [i215]Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Wenlin Yao, Hassan Foroosh, Dong Yu, Fei Liu:
When Reasoning Meets Information Aggregation: A Case Study with Sports Narratives. CoRR abs/2406.12084 (2024)
- [i214]Ruixin Hong, Hongming Zhang, Xiaoman Pan, Dong Yu, Changshui Zhang:
Abstraction-of-Thought Makes Language Models Better Reasoners. CoRR abs/2406.12442 (2024)
- [i213]Xin Chan, Xiaoyang Wang, Dian Yu, Haitao Mi, Dong Yu:
Scaling Synthetic Data Creation with 1,000,000,000 Personas. CoRR abs/2406.20094 (2024)
- [i212]Ante Wang, Linfeng Song, Ye Tian, Baolin Peng, Dian Yu, Haitao Mi, Jinsong Su, Dong Yu:
LiteSearch: Efficacious Tree Search for LLM. CoRR abs/2407.00320 (2024)
- [i211]Yuheng Zhang, Dian Yu, Baolin Peng, Linfeng Song, Ye Tian, Mingyue Huo, Nan Jiang, Haitao Mi, Dong Yu:
Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning. CoRR abs/2407.00617 (2024)
- [i210]Manjie Xu, Chenxing Li, Yong Ren, Rilin Chen, Yu Gu, Wei Liang, Dong Yu:
Video-to-Audio Generation with Hidden Alignment. CoRR abs/2407.07464 (2024)
- [i209]Anni Zou, Wenhao Yu, Hongming Zhang, Kaixin Ma, Deng Cai, Zhuosheng Zhang, Hai Zhao, Dong Yu:
DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems. CoRR abs/2407.10701 (2024)
- [i208]Dian Yu, Baolin Peng, Ye Tian, Linfeng Song, Haitao Mi, Dong Yu:
SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models. CoRR abs/2408.15565 (2024)
- [i207]Mohan Shi, Zengrui Jin, Yaoxun Xu, Yong Xu, Shi-Xiong Zhang, Kun Wei, Yiwen Shao, Chunlei Zhang, Dong Yu:
Advancing Multi-talker ASR Performance with Large Language Models. CoRR abs/2408.17431 (2024)
- [i206]Yaoxun Xu, Shi-Xiong Zhang, Jianwei Yu, Zhiyong Wu, Dong Yu:
Comparing Discrete and Continuous Space LLMs for Speech Recognition. CoRR abs/2409.00800 (2024)
- [i205]Helin Wang, Meng Yu, Jiarui Hai, Chen Chen, Yuchen Hu, Rilin Chen, Najim Dehak, Dong Yu:
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis. CoRR abs/2409.07556 (2024)
- [i204]Liqiang Jing, Zhehui Huang, Xiaoyang Wang, Wenlin Yao, Wenhao Yu, Kaixin Ma, Hongming Zhang, Xinya Du, Dong Yu:
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts? CoRR abs/2409.07703 (2024)
- [i203]Yong Ren, Chenxing Li, Manjie Xu, Wei Liang, Yu Gu, Rilin Chen, Dong Yu:
STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment. CoRR abs/2409.08601 (2024)
- [i202]Manjie Xu, Chenxing Li, Xinyi Tu, Yong Ren, Ruibo Fu, Wei Liang, Dong Yu:
Towards Diverse and Efficient Audio Captioning via Diffusion Models. CoRR abs/2409.09401 (2024)
- [i201]Hongming Zhang, Xiaoman Pan, Hongwei Wang, Kaixin Ma, Wenhao Yu, Dong Yu:
Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots. CoRR abs/2409.10277 (2024)
- [i200]Jiarui Hai, Yong Xu, Hao Zhang, Chenxing Li, Helin Wang, Mounya Elhilali, Dong Yu:
EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer. CoRR abs/2409.10819 (2024)
- [i199]Jinchuan Tian, Chunlei Zhang, Jiatong Shi, Hao Zhang, Jianwei Yu, Shinji Watanabe, Dong Yu:
Preference Alignment Improves Language Model-Based TTS. CoRR abs/2409.12403 (2024)
- [i198]Yuchen Hu, Yu Gu, Chenxing Li, Rilin Chen, Dong Yu:
Video-to-Audio Generation with Fine-grained Temporal Semantics. CoRR abs/2409.14709 (2024)
- [i197]Wenlin Yao, Haitao Mi, Dong Yu:
HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows. CoRR abs/2409.17433 (2024)
- [i196]Mengzhao Jia, Wenhao Yu, Kaixin Ma, Tianqing Fang, Zhihan Zhang, Siru Ouyang, Hongming Zhang, Meng Jiang, Dong Yu:
Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks. CoRR abs/2410.01744 (2024)
- [i195]Yebowen Hu, Xiaoyang Wang, Wenlin Yao, Yiming Lu, Daoan Zhang, Hassan Foroosh, Dong Yu, Fei Liu:
DeFine: Enhancing LLM Decision-Making with Factor Profiles and Analogical Reasoning. CoRR abs/2410.01772 (2024)
- [i194]Zhaowei Wang, Hongming Zhang, Tianqing Fang, Ye Tian, Yue Yang, Kaixin Ma, Xiaoman Pan, Yangqiu Song, Dong Yu:
DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects. CoRR abs/2410.02730 (2024)
- [i193]Murong Yue, Wenlin Yao, Haitao Mi, Dian Yu, Ziyu Yao, Dong Yu:
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search. CoRR abs/2410.03864 (2024)
- [i192]Daoan Zhang, Guangchen Lan, Dong-Jun Han, Wenlin Yao, Xiaoman Pan, Hongming Zhang, Mingxiao Li, Pengcheng Chen, Dong Yu, Christopher Brinton, Jiebo Luo:
SePPO: Semi-Policy Preference Optimization for Diffusion Alignment. CoRR abs/2410.05255 (2024)
- [i191]Zilin Xiao, Hongming Zhang, Tao Ge, Siru Ouyang, Vicente Ordonez, Dong Yu:
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding. CoRR abs/2410.05589 (2024)
- [i190]Xiyao Wang, Linfeng Song, Ye Tian, Dian Yu, Baolin Peng, Haitao Mi, Furong Huang, Dong Yu:
Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning. CoRR abs/2410.06508 (2024)
- [i189]Chenxing Li, Manjie Xu, Dong Yu:
SRC-gAudio: Sampling-Rate-Controlled Audio Generation. CoRR abs/2410.06544 (2024)
- [i188]Shwai He, Tao Ge, Guoheng Sun, Bowei Tian, Xiaoyang Wang, Ang Li, Dong Yu:
Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers. CoRR abs/2410.13184 (2024)
- [i187]Siru Ouyang, Wenhao Yu, Kaixin Ma, Zilin Xiao, Zhihan Zhang, Mengzhao Jia, Jiawei Han, Hongming Zhang, Dong Yu:
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph. CoRR abs/2410.14684 (2024)
- [i186]Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Hongming Zhang, Tianqing Fang, Zhenzhong Lan, Dong Yu:
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization. CoRR abs/2410.19609 (2024)
- 2023
- [j68]Ante Wang, Linfeng Song, Qi Liu, Haitao Mi, Longyue Wang, Zhaopeng Tu, Jinsong Su, Dong Yu:
Search-engine-augmented dialogue response generation with cheaply supervised query production. Artif. Intell. 319: 103874 (2023)
- [j67]Katerina Zmolíková, Marc Delcroix, Tsubasa Ochiai, Keisuke Kinoshita, Jan Cernocký, Dong Yu:
Neural Target Speech Extraction: An overview. IEEE Signal Process. Mag. 40(3): 8-29 (2023)
- [j66]Dong Yu, Yifan Gong, Michael A. Picheny, Bhuvana Ramabhadran, Dilek Hakkani-Tür, Rohit Prasad, Heiga Zen, Jan Skoglund, Jan Honza Cernocký, Lukás Burget, Abdelrahman Mohamed:
Twenty-Five Years of Evolution in Speech and Language Processing. IEEE Signal Process. Mag. 40(5): 27-39 (2023)
- [j65]Linfeng Song, Ante Wang, Xiaoman Pan, Hongming Zhang, Dian Yu, Lifeng Jin, Haitao Mi, Jinsong Su, Yue Zhang, Dong Yu:
OpenFact: Factuality Enhanced Open Knowledge Extraction. Trans. Assoc. Comput. Linguistics 11: 686-702 (2023)
- [j64]Wenyue Hua, Lifeng Jin, Linfeng Song, Haitao Mi, Yongfeng Zhang, Dong Yu:
Discover, Explain, Improve: An Automatic Slice Detection Benchmark for Natural Language Processing. Trans. Assoc. Comput. Linguistics 11: 1537-1552 (2023)
- [j63]Jinchuan Tian, Jianwei Yu, Chao Weng, Yuexian Zou, Dong Yu:
Integrating Lattice-Free MMI Into End-to-End Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 31: 25-38 (2023)
- [j62]Rongzhi Gu, Shi-Xiong Zhang, Yuexian Zou, Dong Yu:
Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation. IEEE ACM Trans. Audio Speech Lang. Process. 31: 849-862 (2023)
- [j61]Dongchao Yang, Jianwei Yu, Helin Wang, Wen Wang, Chao Weng, Yuexian Zou, Dong Yu:
Diffsound: Discrete Diffusion Model for Text-to-Sound Generation. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1720-1733 (2023)
- [j60]Jiachen Lian, Chunlei Zhang, Gopala Krishna Anumanchipalli, Dong Yu:
Unsupervised TTS Acoustic Modeling for TTS With Conditional Disentangled Sequential VAE. IEEE ACM Trans. Audio Speech Lang. Process. 31: 2548-2557 (2023)
- [j59]Ante Wang, Linfeng Song, Lifeng Jin, Junfeng Yao, Haitao Mi, Chen Lin, Jinsong Su, Dong Yu:
D$^{2}$PSG: Multi-Party Dialogue Discourse Parsing as Sequence Generation. IEEE ACM Trans. Audio Speech Lang. Process. 31: 4004-4013 (2023)
- [c289]Mian Zhang, Lifeng Jin, Linfeng Song, Haitao Mi, Wenliang Chen, Dong Yu:
SafeConv: Explaining and Correcting Conversational Unsafe Behavior. ACL (1) 2023: 22-35
- [c288]Hongwei Wang, Dong Yu:
Going Beyond Sentence Embeddings: A Token-Level Matching Algorithm for Calculating Semantic Textual Similarity. ACL (2) 2023: 563-570
- [c287]Pengshan Cai, Kaiqiang Song, Sangwoo Cho, Hongwei Wang, Xiaoyang Wang, Hong Yu, Fei Liu, Dong Yu:
Generating User-Engaging News Headlines. ACL (1) 2023: 3265-3280
- [c286]Ruixin Hong, Hongming Zhang, Hong Zhao, Dong Yu, Changshui Zhang:
Faithful Question Answering with Monte-Carlo Planning. ACL (1) 2023: 3944-3965
- [c285]Zhenhailong Wang, Xiaoman Pan, Dian Yu, Dong Yu, Jianshu Chen, Heng Ji:
Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks. ACL (Findings) 2023: 3978-4004
- [c284]Xianjun Yang, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Xiaoman Pan, Linda R. Petzold, Dong Yu:
OASum: Large-Scale Open Domain Aspect-based Summarization. ACL (Findings) 2023: 4381-4401
- [c283]Rongjie Huang, Chunlei Zhang, Yi Ren, Zhou Zhao, Dong Yu:
Prosody-TTS: Improving Prosody with Masked Autoencoder and Conditional Diffusion Model For Expressive Text-to-Speech. ACL (Findings) 2023: 8018-8034
- [c282]Sai Ashish Somayajula, Lifeng Jin, Linfeng Song, Haitao Mi, Dong Yu:
Bi-level Finetuning with Task-dependent Similarity Structure for Low-resource Training. ACL (Findings) 2023: 8569-8588
- [c281]Meng Yu, Yong Xu, Chunlei Zhang, Shi-Xiong Zhang, Dong Yu:
Neuralecho: Hybrid of Full-Band and Sub-Band Recurrent Neural Network For Acoustic Echo Cancellation and Speech Enhancement. ASRU 2023: 1-8
- [c280]Yixuan Zhang, Meng Yu, Hao Zhang, Dong Yu, DeLiang Wang:
Neuralkalman: A Learnable Kalman Filter for Acoustic Echo Cancellation. ASRU 2023: 1-7
- [c279]Mian Zhang, Lifeng Jin, Linfeng Song, Haitao Mi, Xiabing Zhou, Dong Yu:
Friend-training: Learning from Models of Different but Related Tasks. EACL 2023: 232-247
- [c278]Wenlin Yao, Lifeng Jin, Hongming Zhang, Xiaoman Pan, Kaiqiang Song, Dian Yu, Dong Yu, Jianshu Chen:
How do Words Contribute to Sentence Semantics? Revisiting Sentence Embeddings with a Perturbation Method. EACL 2023: 2993-3002
- [c277]Hongwei Wang, Hongming Zhang, Dong Yu:
On the Dimensionality of Sentence Embeddings. EMNLP (Findings) 2023: 10344-10354
- [c276]James Y. Huang, Wenlin Yao, Kaiqiang Song, Hongming Zhang, Muhao Chen, Dong Yu:
Bridging Continuous and Discrete Spaces: Interpretable Sentence Representation Learning via Compositional Operations. EMNLP 2023: 14584-14595
- [c275]Keming Lu, Xiaoman Pan, Kaiqiang Song, Hongming Zhang, Dong Yu, Jianshu Chen:
PIVOINE: Instruction Tuning for Open-world Entity Profiling. EMNLP (Findings) 2023: 15108-15127
- [c274]Dian Yu, Xiaoyang Wang, Wanshun Chen, Nan Du, Longyue Wang, Haitao Mi, Dong Yu:
More Than Spoken Words: Nonverbal Message Extraction and Generation. EMNLP 2023: 16396-16413
- [c273]Lixin Cao, Jun Wang, Ben Yang, Dan Su, Dong Yu:
Trinet: Stabilizing Self-Supervised Learning From Complete or Slow Collapse. ICASSP 2023: 1-5
- [c272]Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu:
Deep Neural Mel-Subband Beamformer for in-Car Speech Separation. ICASSP 2023: 1-5
- [c271]Xiaoman Pan, Wenlin Yao, Hongming Zhang, Dian Yu, Dong Yu, Jianshu Chen:
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models. ICLR 2023
- [c270]Jinchuan Tian, Brian Yan, Jianwei Yu, Chao Weng, Dong Yu, Shinji Watanabe:
Bayes Risk CTC: Controllable CTC Alignment in Sequence-to-Sequence Tasks. ICLR 2023
- [c269]Haopeng Zhang, Sangwoo Cho, Kaiqiang Song, Xiaoyang Wang, Hongwei Wang, Jiawei Zhang, Dong Yu:
Unsupervised Multi-document Summarization with Holistic Inference. IJCNLP (Findings) 2023: 123-133
- [c268]Wei Xiao, Wenzhe Liu, Meng Wang, Shan Yang, Yupeng Shi, Yuyong Kang, Dan Su, Shidong Shang, Dong Yu:
Multi-mode Neural Speech Coding Based on Deep Generative Networks. INTERSPEECH 2023: 819-823
- [c267]Hao Zhang, Meng Yu, Yuzhong Wu, Tao Yu, Dong Yu:
Hybrid AHS: A Hybrid of Kalman Filter and Deep Learning for Acoustic Howling Suppression. INTERSPEECH 2023: 834-838
- [c266]Jiaxu Zhu, Weinan Tong, Yaoxun Xu, Changhe Song, Zhiyong Wu, Zhao You, Dan Su, Dong Yu, Helen Meng:
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation. INTERSPEECH 2023: 1334-1338
- [c265]Yuping Yuan, Zhao You, Shulin Feng, Dan Su, Yanchun Liang, Xiaohu Shi, Dong Yu:
Compressed MoE ASR Model Based on Knowledge Distillation and Quantization. INTERSPEECH 2023: 3337-3341
- [c264]Jinchuan Tian, Jianwei Yu, Hangting Chen, Brian Yan, Chao Weng, Dong Yu, Shinji Watanabe:
Bayes Risk Transducer: Transducer with Controllable Alignment Prediction. INTERSPEECH 2023: 4968-4972
- [c263]Yong Xu, Vinay Kothapally, Meng Yu, Shixiong Zhang, Dong Yu:
Zoneformer: On-device Neural Beamformer For In-car Multi-zone Speech Separation, Enhancement and Echo Cancellation. INTERSPEECH 2023: 5117-5121
- [c262]Xinran Zhao, Hongming Zhang, Xiaoman Pan, Wenlin Yao, Dong Yu, Jianshu Chen:
Thrust: Adaptively Propels Large Language Models with External Knowledge. NeurIPS 2023
- [i185]Lixin Cao, Jun Wang, Ben Yang, Dan Su, Dong Yu:
TriNet: stabilizing self-supervised learning from complete or slow collapse. CoRR abs/2301.00656 (2023)
- [i184]Yixuan Zhang, Meng Yu, Hao Zhang, Dong Yu, DeLiang Wang:
KalmanNet: A Learnable Kalman Filter for Acoustic Echo Cancellation. CoRR abs/2301.12363 (2023)
- [i183]Katerina Zmolíková, Marc Delcroix, Tsubasa Ochiai, Keisuke Kinoshita, Jan Cernocký, Dong Yu:
Neural Target Speech Extraction: An Overview. CoRR abs/2301.13341 (2023)
- [i182]Dongchao Yang, Songxiang Liu, Rongjie Huang, Guangzhi Lei, Chao Weng, Helen Meng, Dong Yu:
InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt. CoRR abs/2301.13662 (2023)
- [i181]Mian Zhang, Lifeng Jin, Linfeng Song, Haitao Mi, Xiabing Zhou, Dong Yu:
Friend-training: Learning from Models of Different but Related Tasks. CoRR abs/2301.13683 (2023)
- [i180]Ante Wang, Linfeng Song, Qi Liu, Haitao Mi, Longyue Wang, Zhaopeng Tu, Jinsong Su, Dong Yu:
Search-Engine-augmented Dialogue Response Generation with Cheaply Supervised Query Production. CoRR abs/2302.09300 (2023)
- [i179]Rongzhi Gu, Shi-Xiong Zhang, Dong Yu:
3D Neural Beamforming for Multi-channel Speech Separation Against Location Uncertainty. CoRR abs/2302.13462 (2023)
- [i178]Ruixin Hong, Hongming Zhang, Hong Zhao, Dong Yu, Changshui Zhang:
Faithful Question Answering with Monte-Carlo Planning. CoRR abs/2305.02556 (2023)
- [i177]Siyi Liu, Hongming Zhang, Hongwei Wang, Kaiqiang Song, Dan Roth, Dong Yu:
Open-Domain Event Graph Induction for Mitigating Framing Bias. CoRR abs/2305.12835 (2023)
- [i176]James Y. Huang, Wenlin Yao, Kaiqiang Song, Hongming Zhang, Muhao Chen, Dong Yu:
Bridging Continuous and Discrete Spaces: Interpretable Sentence Representation Learning via Compositional Operations. CoRR abs/2305.14599 (2023)
- [i175]Keming Lu, Xiaoman Pan, Kaiqiang Song, Hongming Zhang, Dong Yu, Jianshu Chen:
PIVOINE: Instruction Tuning for Open-world Information Extraction. CoRR abs/2305.14898 (2023)
- [i174]Rongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Luping Liu, Zhenhui Ye, Ziyue Jiang, Chao Weng, Zhou Zhao, Dong Yu:
Make-A-Voice: Unified Voice Synthesis With Discrete Representation. CoRR abs/2305.19269 (2023)
- [i173]Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu:
A Stitch in Time Saves Nine: Detecting and Mitigating Hallucinations of LLMs by Validating Low-Confidence Generation. CoRR abs/2307.03987 (2023)
- [i172]Zhenwen Liang, Dian Yu, Xiaoman Pan, Wenlin Yao, Qingkai Zeng, Xiangliang Zhang, Dong Yu:
MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning. CoRR abs/2307.07951 (2023)
- [i171]Xinran Zhao, Hongming Zhang, Xiaoman Pan, Wenlin Yao, Dong Yu, Jianshu Chen:
Thrust: Adaptively Propels Large Language Models with External Knowledge. CoRR abs/2307.10442 (2023)
- [i170]Jiaao Chen, Xiaoman Pan, Dian Yu, Kaiqiang Song, Xiaoyang Wang, Dong Yu, Jianshu Chen:
Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models. CoRR abs/2308.00304 (2023)
- [i169]Jinchuan Tian, Jianwei Yu, Hangting Chen, Brian Yan, Chao Weng, Dong Yu, Shinji Watanabe:
Bayes Risk Transducer: Transducer with Controllable Alignment Prediction. CoRR abs/2308.10107 (2023)
- [i168]Jiaxu Zhu, Weinan Tong, Yaoxun Xu, Changhe Song, Zhiyong Wu, Zhao You, Dan Su, Dong Yu, Helen M. Meng:
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation. CoRR abs/2309.02459 (2023)
- [i167]Haopeng Zhang, Sangwoo Cho, Kaiqiang Song, Xiaoyang Wang, Hongwei Wang, Jiawei Zhang, Dong Yu:
Unsupervised Multi-document Summarization with Holistic Inference. CoRR abs/2309.04087 (2023)
- [i166]Anton Ratnarajah, Shi-Xiong Zhang, Dong Yu:
M3-AUDIODEC: Multi-channel multi-speaker multi-spatial audio codec. CoRR abs/2309.07416 (2023)
- [i165]Zhongweiyang Xu, Yong Xu, Vinay Kothapally, Heming Wang, Muqiao Yang, Dong Yu:
SpatialCodec: Neural Spatial Speech Coding. CoRR abs/2309.07432 (2023)
- [i164]Kaixin Ma, Hongming Zhang, Hongwei Wang, Xiaoman Pan, Dong Yu:
LASER: LLM Agent with State-Space Exploration for Web Navigation. CoRR abs/2309.08172 (2023)
- [i163]Heming Wang, Meng Yu, Hao Zhang, Chunlei Zhang, Zhongweiyang Xu, Muqiao Yang, Yixuan Zhang, Dong Yu:
Unifying Robustness and Fidelity: A Comprehensive Study of Pretrained Generative Methods for Speech Enhancement in Adverse Conditions. CoRR abs/2309.09028 (2023)
- [i162]Baolin Peng, Linfeng Song, Ye Tian, Lifeng Jin, Haitao Mi, Dong Yu:
Stabilizing RLHF through Advantage Model and Selective Rehearsal. CoRR abs/2309.10202 (2023)
- [i161]Lingfeng Shen, Sihao Chen, Linfeng Song, Lifeng Jin, Baolin Peng, Haitao Mi, Daniel Khashabi, Dong Yu:
The Trickle-down Impact of Reward (In-)consistency on RLHF. CoRR abs/2309.16155 (2023)
- [i160]Xuansheng Wu, Wenlin Yao, Jianshu Chen, Xiaoman Pan, Xiaoyang Wang, Ninghao Liu, Dong Yu:
From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning. CoRR abs/2310.00492 (2023)
- [i159]Muqiao Yang, Chunlei Zhang, Yong Xu, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu:
uSee: Unified Speech Enhancement and Editing with Conditional Diffusion Models. CoRR abs/2310.00900 (2023)
- [i158]Wenzhe Liu, Wei Xiao, Meng Wang, Shan Yang, Yupeng Shi, Yuyong Kang, Dan Su, Shidong Shang, Dong Yu:
A High Fidelity and Low Complexity Neural Audio Coding. CoRR abs/2310.10992 (2023)
- [i157]Hongwei Wang, Hongming Zhang, Dong Yu:
On the Dimensionality of Sentence Embeddings. CoRR abs/2310.15285 (2023)
- [i156]Yiwen Shao, Shi-Xiong Zhang, Dong Yu:
RIR-SF: Room Impulse Response Based Spatial Feature for Multi-channel Multi-talker ASR. CoRR abs/2311.00146 (2023)
- [i155]Sihao Chen, Hongming Zhang, Tong Chen, Ben Zhou, Wenhao Yu, Dian Yu, Baolin Peng, Hongwei Wang, Dan Roth, Dong Yu:
Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations. CoRR abs/2311.04335 (2023)
- [i154]Shuyi Xie, Wenlin Yao, Yong Dai, Shaobo Wang, Donlin Zhou, Lifeng Jin, Xinhua Feng, Pengzhi Wei, Yujie Lin, Zhichao Hu, Dong Yu, Zhengyou Zhang, Jing Nie, Yuhong Liu:
TencentLLMEval: A Hierarchical Evaluation of Real-World Capabilities for Human-Aligned LLMs. CoRR abs/2311.05374 (2023)
- [i153]Ruixin Hong, Hongming Zhang, Xinyu Pang, Dong Yu, Changshui Zhang:
A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning. CoRR abs/2311.07954 (2023)
- [i152]Wenhao Yu, Hongming Zhang, Xiaoman Pan, Kaixin Ma, Hongwei Wang, Dong Yu:
Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models. CoRR abs/2311.09210 (2023)
- [i151]Fuxiao Liu, Xiaoyang Wang, Wenlin Yao, Jianshu Chen, Kaiqiang Song, Sangwoo Cho, Yaser Yacoob, Dong Yu:
MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning. CoRR abs/2311.10774 (2023)
- [i150]Yinya Huang, Ruixin Hong, Hongming Zhang, Wei Shao, Zhicheng Yang, Dong Yu, Changshui Zhang, Xiaodan Liang, Linqi Song:
CLOMO: Counterfactual Logical Modification with Large Language Models. CoRR abs/2311.17438 (2023)
- [i149]Tong Chen, Hongwei Wang, Sihao Chen, Wenhao Yu, Kaixin Ma, Xinran Zhao, Hongming Zhang, Dong Yu:
Dense X Retrieval: What Retrieval Granularity Should We Use? CoRR abs/2312.06648 (2023)
- [i148]Kaiqiang Song, Xiaoyang Wang, Sangwoo Cho, Xiaoman Pan, Dong Yu:
Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention. CoRR abs/2312.08618 (2023)
- 2022
- [j58]Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu:
An investigation of neural uncertainty estimation for target speaker extraction equipped RNN transducer. Comput. Speech Lang. 73: 101327 (2022)
- [j57]Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu:
Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition. Comput. Speech Lang. 75: 101360 (2022)
- [j56]Chunlei Zhang, Dong Yu:
C3-DINO: Joint Contrastive and Non-Contrastive Self-Supervised Learning for Speaker Verification. IEEE J. Sel. Top. Signal Process. 16(6): 1273-1283 (2022)
- [j55]Jinchuan Tian, Jianwei Yu, Chao Weng, Yuexian Zou, Dong Yu:
Improving Mandarin End-to-End Speech Recognition With Word N-Gram Language Model. IEEE Signal Process. Lett. 29: 812-816 (2022)
- [j54]Linchao Bao, Xiangkai Lin, Yajing Chen, Haoxian Zhang, Sheng Wang, Xuefei Zhe, Di Kang, Haozhi Huang, Xinwei Jiang, Jue Wang, Dong Yu, Zhengyou Zhang:
High-Fidelity 3D Digital Human Head Creation from RGB-D Selfies. ACM Trans. Graph. 41(1): 3:1-3:21 (2022)
- [c261]Lisa Jin, Linfeng Song, Lifeng Jin, Dong Yu, Daniel Gildea:
Hierarchical Context Tagging for Utterance Rewriting. AAAI 2022: 10849-10857
- [c260]Chao Zhao, Wenlin Yao, Dian Yu, Kaiqiang Song, Dong Yu, Jianshu Chen:
Learning-by-Narrating: Narrative Pre-Training for Zero-Shot Dialogue Comprehension. ACL (2) 2022: 212-218
- [c259]Xiang Yue, Xiaoman Pan, Wenlin Yao, Dian Yu, Dong Yu, Jianshu Chen:
C-MORE: Pretraining to Answer Open-Domain Questions by Consulting Millions of References. ACL (2) 2022: 371-377
- [c258]Irene Li, Linfeng Song, Kun Xu, Dong Yu:
Variational Graph Autoencoding as Cheap Supervision for AMR Coreference Resolution. ACL (1) 2022: 2790-2800
- [c257]Kaiqiang Song, Chen Li, Xiaoyang Wang, Dong Yu, Fei Liu:
Towards Abstractive Grounded Summarization of Podcast Transcripts. ACL (1) 2022: 4407-4418
- [c256]Kai Sun, Dian Yu, Jianshu Chen, Dong Yu, Claire Cardie:
Improving Machine Reading Comprehension with Contextualized Commonsense Knowledge. ACL (1) 2022: 8736-8747
- [c255]Sangwoo Cho, Kaiqiang Song, Xiaoyang Wang, Fei Liu, Dong Yu:
Toward Unifying Text Segmentation and Long Document Summarization. EMNLP 2022: 106-118
- [c254]Songyang Zhang, Linfeng Song, Lifeng Jin, Haitao Mi, Kun Xu, Dong Yu, Jiebo Luo:
Learning a Grammar Inducer from Massive Uncurated Instructional Videos. EMNLP 2022: 233-247
- [c253]Yue Yang, Wenlin Yao, Hongming Zhang, Xiaoyang Wang, Dong Yu, Jianshu Chen:
Z-LaVI: Zero-Shot Language Solver Fueled by Visual Imagination. EMNLP 2022: 1186-1203 - [c252]Yinya Huang, Hongming Zhang, Ruixin Hong, Xiaodan Liang, Changshui Zhang, Dong Yu:
MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure. EMNLP 2022: 4698-4724 - [c251]Peng Shi, Linfeng Song, Lifeng Jin, Haitao Mi, He Bai, Jimmy Lin, Dong Yu:
Cross-lingual Text-to-SQL Semantic Parsing with Representation Mixup. EMNLP (Findings) 2022: 5296-5306 - [c250]Fei Wang, Kaiqiang Song, Hongming Zhang, Lifeng Jin, Sangwoo Cho, Wenlin Yao, Xiaoyang Wang, Muhao Chen, Dong Yu:
Salience Allocation as Guidance for Abstractive Summarization. EMNLP 2022: 6094-6106 - [c249]Hongming Zhang, Wenlin Yao, Dong Yu:
Efficient Zero-shot Event Extraction with Context-Definition Alignment. EMNLP (Findings) 2022: 7169-7179 - [c248]Jianqiao Zhao, Yanyang Li, Wanyu Du, Yangfeng Ji, Dong Yu, Michael R. Lyu, Liwei Wang:
FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows. EMNLP 2022: 10469-10483 - [c247]Anton Ratnarajah, Shi-Xiong Zhang, Meng Yu, Zhenyu Tang, Dinesh Manocha, Dong Yu:
FAST-RIR: Fast Neural Diffuse Room Impulse Response Generator. ICASSP 2022: 571-575 - [c246]Yiwen Shao, Shi-Xiong Zhang, Dong Yu:
Multi-Channel Multi-Speaker ASR Using 3D Spatial Feature. ICASSP 2022: 6067-6071 - [c245]Songxiang Liu, Shan Yang, Dan Su, Dong Yu:
Referee: Towards Reference-Free Cross-Speaker Style Transfer with Low-Quality Data for Expressive Speech Synthesis. ICASSP 2022: 6307-6311 - [c244]Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu:
Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization. ICASSP 2022: 6412-6416 - [c243]Jiachen Lian, Chunlei Zhang, Dong Yu:
Robust Disentangled Variational Speech Representation Learning for Zero-Shot Voice Conversion. ICASSP 2022: 6572-6576 - [c242]Zhao You, Shulin Feng, Dan Su, Dong Yu:
SpeechMoE2: Mixture-of-Experts Model with Improved Routing. ICASSP 2022: 7217-7221 - [c241]Disong Wang, Shan Yang, Dan Su, Xunying Liu, Dong Yu, Helen Meng:
VCVTS: Multi-Speaker Video-to-Speech Synthesis Via Cross-Modal Knowledge Transfer from Voice Conversion. ICASSP 2022: 7252-7256 - [c240]Dongpeng Ma, Yiwen Wang, Liqiang He, Mingjie Jin, Dan Su, Dong Yu:
DP-DWA: Dual-Path Dynamic Weight Attention Network With Streaming DFSMN-SAN For Automatic Speech Recognition. ICASSP 2022: 7692-7696 - [c239]Jinchuan Tian, Jianwei Yu, Chao Weng, Shi-Xiong Zhang, Dan Su, Dong Yu, Yuexian Zou:
Consistent Training and Decoding for End-to-End Speech Recognition Using Lattice-Free MMI. ICASSP 2022: 7782-7786 - [c238]Chunlei Zhang, Jiatong Shi, Chao Weng, Meng Yu, Dong Yu:
Towards end-to-end Speaker Diarization with Generalized Neural Speaker Clustering. ICASSP 2022: 8372-8376 - [c237]Pei Chen, Wenlin Yao, Hongming Zhang, Xiaoman Pan, Dian Yu, Dong Yu, Jianshu Chen:
ZeroKBC: A Comprehensive Benchmark for Zero-Shot Knowledge Base Completion. ICDM (Workshops) 2022: 1-6 - [c236]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis. ICLR 2022 - [c235]Rongjie Huang, Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu, Yi Ren, Zhou Zhao:
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis. IJCAI 2022: 4157-4163 - [c234]Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu:
Joint Neural AEC and Beamforming with Double-Talk Detection. INTERSPEECH 2022: 2528-2532 - [c233]Jiachen Lian, Chunlei Zhang, Gopala Krishna Anumanchipalli, Dong Yu:
Towards Improved Zero-shot Voice Conversion with Conditional DSVAE. INTERSPEECH 2022: 2598-2602 - [c232]Jinchuan Tian, Jianwei Yu, Chunlei Zhang, Yuexian Zou, Dong Yu:
LAE: Language-Aware Encoder for Monolingual and Multilingual ASR. INTERSPEECH 2022: 3178-3182 - [c231]Ziqian Dai, Jianwei Yu, Yan Wang, Nuo Chen, Yanyao Bian, Guangzhi Li, Deng Cai, Dong Yu:
Automatic Prosody Annotation with Pre-Trained Text-Speech Model. INTERSPEECH 2022: 5513-5517 - [c230]Zhao You, Shulin Feng, Dan Su, Dong Yu:
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition. ISCSLP 2022: 170-174 - [c229]Jianhua Tao, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Liang, Pengyuan Zhang, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi:
DDAM '22: 1st International Workshop on Deepfake Detection for Audio Multimedia. ACM Multimedia 2022: 7405-7406 - [c228]Dian Yu, Ben Zhou, Dong Yu:
End-to-End Chinese Speaker Identification. NAACL-HLT 2022: 2274-2285 - [c227]Junyi Peng, Chunlei Zhang, Jan Honza Cernocký, Dong Yu:
Progressive Contrastive Learning for Self-Supervised Text-Independent Speaker Verification. Odyssey 2022: 17-24 - [c226]Jia Cui, Heng Lu, Wenjie Wang, Shiyin Kang, Liqiang He, Guangzhi Li, Dong Yu:
Efficient Text Analysis with Pre-Trained Neural Network Models. SLT 2022: 671-676 - [c225]Zhenyi Wang, Xiaoyang Wang, Li Shen, Qiuling Suo, Kaiqiang Song, Dong Yu, Yan Shen, Mingchen Gao:
Meta-learning without data via Wasserstein distributionally-robust model fusion. UAI 2022: 2045-2055 - [e1]Jianhua Tao, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Liang, Pengyuan Zhang:
DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, Lisboa, Portugal, 14 October 2022. ACM 2022, ISBN 978-1-4503-9496-3 - [i147]Jinchuan Tian, Jianwei Yu, Chao Weng, Yuexian Zou, Dong Yu:
Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model. CoRR abs/2201.01995 (2022) - [i146]Songxiang Liu, Dan Su, Dong Yu:
DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs. CoRR abs/2201.11972 (2022) - [i145]Jianqiao Zhao, Yanyang Li, Wanyu Du, Yangfeng Ji, Dong Yu, Michael R. Lyu, Liwei Wang:
FlowEval: A Consensus-Based Dialogue Evaluation Framework Using Segment Act Flows. CoRR abs/2202.06633 (2022) - [i144]Disong Wang, Shan Yang, Dan Su, Xunying Liu, Dong Yu, Helen Meng:
VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion. CoRR abs/2202.09081 (2022) - [i143]Xiang Yue, Xiaoman Pan, Wenlin Yao, Dian Yu, Dong Yu, Jianshu Chen:
C-MORE: Pretraining to Answer Open-Domain Questions by Consulting Millions of References. CoRR abs/2203.08928 (2022) - [i142]Chao Zhao, Wenlin Yao, Dian Yu, Kaiqiang Song, Dong Yu, Jianshu Chen:
Learning-by-Narrating: Narrative Pre-Training for Zero-Shot Dialogue Comprehension. CoRR abs/2203.10249 (2022) - [i141]Kaiqiang Song, Chen Li, Xiaoyang Wang, Dong Yu, Fei Liu:
Towards Abstractive Grounded Summarization of Podcast Transcripts. CoRR abs/2203.11425 (2022) - [i140]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis. CoRR abs/2203.13508 (2022) - [i139]Jinchuan Tian, Jianwei Yu, Chao Weng, Yuexian Zou, Dong Yu:
Integrate Lattice-Free MMI into End-to-End Speech Recognition. CoRR abs/2203.15614 (2022) - [i138]Jiachen Lian, Chunlei Zhang, Dong Yu:
Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion. CoRR abs/2203.16705 (2022) - [i137]Zhao You, Shulin Feng, Dan Su, Dong Yu:
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition. CoRR abs/2204.03178 (2022) - [i136]Rongjie Huang, Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu, Yi Ren, Zhou Zhao:
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis. CoRR abs/2204.09934 (2022) - [i135]Lifeng Jin, Kun Xu, Linfeng Song, Dong Yu:
Distant finetuning with discourse relations for stance classification. CoRR abs/2204.12693 (2022) - [i134]Jiachen Lian, Chunlei Zhang, Gopala Krishna Anumanchipalli, Dong Yu:
Towards Improved Zero-shot Voice Conversion with Conditional DSVAE. CoRR abs/2205.05227 (2022) - [i133]Meng Yu, Yong Xu, Chunlei Zhang, Shi-Xiong Zhang, Dong Yu:
NeuralEcho: A Self-Attentive Recurrent Neural Network For Unified Acoustic Echo Suppression And Speech Enhancement. CoRR abs/2205.10401 (2022) - [i132]Jinchuan Tian, Jianwei Yu, Chunlei Zhang, Chao Weng, Yuexian Zou, Dong Yu:
LAE: Language-Aware Encoder for Monolingual and Multilingual ASR. CoRR abs/2206.02093 (2022) - [i131]Jiachen Lian, Chunlei Zhang, Gopala Krishna Anumanchipalli, Dong Yu:
UTTS: Unsupervised TTS with Conditional Disentangled Sequential Variational Auto-encoder. CoRR abs/2206.02512 (2022) - [i130]Ziqian Dai, Jianwei Yu, Yan Wang, Nuo Chen, Yanyao Bian, Guangzhi Li, Deng Cai, Dong Yu:
Automatic Prosody Annotation with Pre-Trained Text-Speech Model. CoRR abs/2206.07956 (2022) - [i129]Lisa Jin, Linfeng Song, Lifeng Jin, Dong Yu, Daniel Gildea:
Hierarchical Context Tagging for Utterance Rewriting. CoRR abs/2206.11218 (2022) - [i128]Dongchao Yang, Jianwei Yu, Helin Wang, Wen Wang, Chao Weng, Yuexian Zou, Dong Yu:
Diffsound: Discrete Diffusion Model for Text-to-sound Generation. CoRR abs/2207.09983 (2022) - [i127]Chunlei Zhang, Dong Yu:
C3-DINO: Joint Contrastive and Non-contrastive Self-Supervised Learning for Speaker Verification. CoRR abs/2208.07446 (2022) - [i126]Zhenhailong Wang, Xiaoman Pan, Dian Yu, Dong Yu, Jianshu Chen, Heng Ji:
Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks. CoRR abs/2210.00185 (2022) - [i125]Ben Zhou, Dian Yu, Dong Yu, Dan Roth:
Cross-Lingual Speaker Identification Using Distant Supervision. CoRR abs/2210.05780 (2022) - [i124]Jinchuan Tian, Brian Yan, Jianwei Yu, Chao Weng, Dong Yu, Shinji Watanabe:
Bayes risk CTC: Controllable CTC alignment in Sequence-to-Sequence tasks. CoRR abs/2210.07499 (2022) - [i123]Yue Yang, Wenlin Yao, Hongming Zhang, Xiaoyang Wang, Dong Yu, Jianshu Chen:
Z-LaVI: Zero-Shot Language Solver Fueled by Visual Imagination. CoRR abs/2210.12261 (2022) - [i122]Songyang Zhang, Linfeng Song, Lifeng Jin, Haitao Mi, Kun Xu, Dong Yu, Jiebo Luo:
Learning a Grammar Inducer from Massive Uncurated Instructional Videos. CoRR abs/2210.12309 (2022) - [i121]Fei Wang, Kaiqiang Song, Hongming Zhang, Lifeng Jin, Sangwoo Cho, Wenlin Yao, Xiaoyang Wang, Muhao Chen, Dong Yu:
Salience Allocation as Guidance for Abstractive Summarization. CoRR abs/2210.12330 (2022) - [i120]Yinya Huang, Hongming Zhang, Ruixin Hong, Xiaodan Liang, Changshui Zhang, Dong Yu:
MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure. CoRR abs/2210.12487 (2022) - [i119]Sangwoo Cho, Kaiqiang Song, Xiaoyang Wang, Fei Liu, Dong Yu:
Toward Unifying Text Segmentation and Long Document Summarization. CoRR abs/2210.16422 (2022) - [i118]Xiaoman Pan, Wenlin Yao, Hongming Zhang, Dian Yu, Dong Yu, Jianshu Chen:
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models. CoRR abs/2210.16433 (2022) - [i117]Wenyue Hua, Lifeng Jin, Linfeng Song, Haitao Mi, Yongfeng Zhang, Dong Yu:
Discover, Explanation, Improvement: Automatic Slice Detection Framework for Natural Language Processing. CoRR abs/2211.04476 (2022) - [i116]Hongming Zhang, Wenlin Yao, Dong Yu:
Efficient Zero-shot Event Extraction with Context-Definition Alignment. CoRR abs/2211.05156 (2022) - [i115]Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu:
Deep Neural Mel-Subband Beamformer for In-car Speech Separation. CoRR abs/2211.12590 (2022) - [i114]Pei Chen, Wenlin Yao, Hongming Zhang, Xiaoman Pan, Dian Yu, Dong Yu, Jianshu Chen:
ZeroKBC: A Comprehensive Benchmark for Zero-Shot Knowledge Base Completion. CoRR abs/2212.03091 (2022) - [i113]Rongzhi Gu, Shi-Xiong Zhang, Yuexian Zou, Dong Yu:
Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation. CoRR abs/2212.08348 (2022) - [i112]Xianjun Yang, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Xiaoman Pan, Linda R. Petzold, Dong Yu:
OASum: Large-Scale Open Domain Aspect-based Summarization. CoRR abs/2212.09233 (2022)
- 2021
- [j53]Rongzhi Gu, Shi-Xiong Zhang, Yuexian Zou, Dong Yu:
Complex Neural Spatial Filter: Enhancing Multi-Channel Target Speech Separation in Complex Domain. IEEE Signal Process. Lett. 28: 1370-1374 (2021) - [j52]Daniel Michelsanti, Zheng-Hua Tan, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu, Jesper Jensen:
An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1368-1396 (2021) - [j51]Jianwei Yu, Shi-Xiong Zhang, Bo Wu, Shansong Liu, Shoukang Hu, Mengzhe Geng, Xunying Liu, Helen Meng, Dong Yu:
Audio-Visual Multi-Channel Integration and Recognition of Overlapped Speech. IEEE ACM Trans. Audio Speech Lang. Process. 29: 2067-2082 (2021) - [j50]Kun Xu, Han Wu, Linfeng Song, Haisong Zhang, Linqi Song, Dong Yu:
Conversational Semantic Role Labeling. IEEE ACM Trans. Audio Speech Lang. Process. 29: 2465-2475 (2021) - [j49]Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Donald S. Williamson, Dong Yu:
Multi-Channel Multi-Frame ADL-MVDR for Target Speech Separation. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3526-3540 (2021) - [c224]Jun Wang, Max W. Y. Lam, Dan Su, Dong Yu:
Tune-In: Training Under Negative Environments with Interference for Attention Networks Simulating Cocktail Party Effect. AAAI 2021: 13961-13969 - [c223]Xiaoyang Wang, Chen Li, Jianqiao Zhao, Dong Yu:
NaturalConv: A Chinese Dialogue Dataset Towards Multi-turn Topic-driven Conversation. AAAI 2021: 14006-14014 - [c222]Lemao Liu, Haisong Zhang, Haiyun Jiang, Yangming Li, Enbo Zhao, Kun Xu, Linfeng Song, Suncong Zheng, Botong Zhou, Dick Zhu, Xiao Feng, Tao Chen, Tao Yang, Dong Yu, Feng Zhang, Zhanhui Kang, Shuming Shi:
TexSmart: A System for Enhanced Natural Language Understanding. ACL (demo) 2021: 1-10 - [c221]Tianqing Fang, Haojie Pan, Hongming Zhang, Yangqiu Song, Kun Xu, Dong Yu:
Do Boat and Ocean Suggest Beach? Dialogue Summarization with External Knowledge. AKBC 2021 - [c220]Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng:
Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams. APSIPA ASC 2021: 1433-1437 - [c219]Liqiang He, Shulin Feng, Dan Su, Dong Yu:
Latency-Controlled Neural Architecture Search for Streaming Speech Recognition. ASRU 2021: 62-67 - [c218]Rongzhi Gu, Shi-Xiong Zhang, Meng Yu, Dong Yu:
3D Spatial Features for Multi-Channel Target Speech Separation. ASRU 2021: 996-1002 - [c217]Liwei Wang, Jing Huang, Yin Li, Kun Xu, Zhengyuan Yang, Dong Yu:
Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation. CVPR 2021: 14090-14100 - [c216]Dian Yu, Kai Sun, Dong Yu, Claire Cardie:
Self-Teaching Machines to Read and Comprehend with Large-Scale Multi-Subject Question-Answering Data. EMNLP (Findings) 2021: 56-68 - [c215]Xintong Yu, Hongming Zhang, Yangqiu Song, Changshui Zhang, Kun Xu, Dong Yu:
Exophoric Pronoun Resolution in Dialogues with Topic Regularization. EMNLP (1) 2021: 3832-3845 - [c214]Jie Hao, Linfeng Song, Liwei Wang, Kun Xu, Zhaopeng Tu, Dong Yu:
RAST: Domain-Robust Dialogue Rewriting as Sequence Tagging. EMNLP (1) 2021: 4913-4924 - [c213]Lifeng Jin, Linfeng Song, Kun Xu, Dong Yu:
Instance-adaptive training with noise-robust losses against noisy labels. EMNLP (1) 2021: 5647-5663 - [c212]Wenlin Yao, Xiaoman Pan, Lifeng Jin, Jianshu Chen, Dian Yu, Dong Yu:
Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories. EMNLP (1) 2021: 7741-7751 - [c211]Jun Wang, Max W. Y. Lam, Dan Su, Dong Yu:
Contrastive Separative Coding for Self-Supervised Representation Learning. ICASSP 2021: 3865-3869 - [c210]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
Sandglasset: A Light Multi-Granularity Self-Attentive Network for Time-Domain Speech Separation. ICASSP 2021: 5759-5763 - [c209]Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Dong Yu:
ADL-MVDR: All Deep Learning MVDR Beamformer for Target Speech Separation. ICASSP 2021: 6089-6093 - [c208]Xu Li, Na Li, Chao Weng, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Replay and Synthetic Speech Detection with Res2Net Architecture. ICASSP 2021: 6354-6358 - [c207]Chunlei Zhang, Meng Yu, Chao Weng, Dong Yu:
Towards Robust Speaker Verification with Target Speaker Enhancement. ICASSP 2021: 6693-6697 - [c206]Wei Xia, Chunlei Zhang, Chao Weng, Meng Yu, Dong Yu:
Self-Supervised Text-Independent Speaker Verification Using Prototypical Momentum Contrastive Learning. ICASSP 2021: 6723-6727 - [c205]Liqiang He, Dan Su, Dong Yu:
Learned Transferable Architectures Can Surpass Hand-Designed Architectures for Large Scale Speech Recognition. ICASSP 2021: 6788-6792 - [c204]Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu:
Improving RNN Transducer with Target Speaker Extraction and Neural Uncertainty Estimation. ICASSP 2021: 6908-6912 - [c203]Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu:
Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization. ICASSP 2021: 8433-8437 - [c202]Max W. Y. Lam, Jun Wang, Chao Weng, Dan Su, Dong Yu:
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition. Interspeech 2021: 316-320 - [c201]Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu:
TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation. Interspeech 2021: 1109-1113 - [c200]Xiyun Li, Yong Xu, Meng Yu, Shi-Xiong Zhang, Jiaming Xu, Bo Xu, Dong Yu:
MIMO Self-Attentive RNN Beamformer for Multi-Speaker Speech Separation. Interspeech 2021: 1119-1123 - [c199]Zhao You, Shulin Feng, Dan Su, Dong Yu:
SpeechMoE: Scaling to Large Acoustic Models with Dynamic Routing Mixture of Experts. Interspeech 2021: 2077-2081 - [c198]Meng Yu, Chunlei Zhang, Yong Xu, Shi-Xiong Zhang, Dong Yu:
MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment. Interspeech 2021: 2142-2146 - [c197]Yong Xu, Zhuohuang Zhang, Meng Yu, Shi-Xiong Zhang, Dong Yu:
Generalized Spatio-Temporal RNN Beamformer for Target Speech Separation. Interspeech 2021: 3076-3080 - [c196]Saurabh Kataria, Shi-Xiong Zhang, Dong Yu:
Multi-Channel Speaker Verification for Single and Multi-Talker Speech. Interspeech 2021: 4608-4612 - [c195]Yuewen Cao, Songxiang Liu, Shiyin Kang, Na Hu, Peng Liu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Exploring Cross-lingual Singing Voice Synthesis Using Speech Data. ISCSLP 2021: 1-5 - [c194]Songyang Zhang, Linfeng Song, Lifeng Jin, Kun Xu, Dong Yu, Jiebo Luo:
Video-aided Unsupervised Grammar Induction. NAACL-HLT 2021: 1513-1524 - [c193]Lifeng Jin, Kun Xu, Linfeng Song, Dong Yu:
Distant Finetuning with Discourse Relations for Stance Classification. NLPCC (2) 2021: 484-495 - [c192]Jianming Liu, Meng Yu, Yong Xu, Chao Weng, Shi-Xiong Zhang, Lianwu Chen, Dong Yu:
Neural Mask based Multi-channel Convolutional Beamforming for Joint Dereverberation, Echo Cancellation and Denoising. SLT 2021: 766-770 - [c191]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
Effective Low-Cost Time-Domain Audio Separation Using Globally Attentive Locally Recurrent Networks. SLT 2021: 801-808 - [c190]Zhaoheng Ni, Yong Xu, Meng Yu, Bo Wu, Shi-Xiong Zhang, Dong Yu, Michael I. Mandel:
WPD++: An Improved Neural Beamformer for Simultaneous Speech Separation and Dereverberation. SLT 2021: 817-824 - [i111]Yong Xu, Zhuohuang Zhang, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Dong Yu:
Generalized RNN beamformer for target speech separation. CoRR abs/2101.01280 (2021) - [i110]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
Effective Low-Cost Time-Domain Audio Separation Using Globally Attentive Locally Recurrent Networks. CoRR abs/2101.05014 (2021) - [i109]Dian Yu, Kai Sun, Dong Yu, Claire Cardie:
Self-Teaching Machines to Read and Comprehend with Large-Scale Multi-Subject Question Answering Data. CoRR abs/2102.01226 (2021) - [i108]Linfeng Song, Ante Wang, Jinsong Su, Yue Zhang, Kun Xu, Yubin Ge, Dong Yu:
Structural Information Preserving for Graph-to-Text Generation. CoRR abs/2102.06749 (2021) - [i107]Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu:
Deep Learning based Multi-Source Localization with Source Splitting and its Effectiveness in Multi-Talker Speech Recognition. CoRR abs/2102.07955 (2021) - [i106]Jun Wang, Max W. Y. Lam, Dan Su, Dong Yu:
Contrastive Separative Coding for Self-supervised Representation Learning. CoRR abs/2103.00816 (2021) - [i105]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation. CoRR abs/2103.00819 (2021) - [i104]Jun Wang, Max W. Y. Lam, Dan Su, Dong Yu:
Tune-In: Training Under Negative Environments with Interference for Attention Networks Simulating Cocktail Party Effect. CoRR abs/2103.01461 (2021) - [i103]Xiaoyang Wang, Chen Li, Jianqiao Zhao, Dong Yu:
NaturalConv: A Chinese Dialogue Dataset Towards Multi-turn Topic-driven Conversation. CoRR abs/2103.02548 (2021) - [i102]Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu:
TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation. CoRR abs/2103.16849 (2021) - [i101]Meng Yu, Chunlei Zhang, Yong Xu, Shi-Xiong Zhang, Dong Yu:
MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment. CoRR abs/2104.01227 (2021) - [i100]Songyang Zhang, Linfeng Song, Lifeng Jin, Kun Xu, Dong Yu, Jiebo Luo:
Video-aided Unsupervised Grammar Induction. CoRR abs/2104.04369 (2021) - [i99]Kun Xu, Han Wu, Linfeng Song, Haisong Zhang, Linqi Song, Dong Yu:
Conversational Semantic Role Labeling. CoRR abs/2104.04947 (2021) - [i98]Xiyun Li, Yong Xu, Meng Yu, Shi-Xiong Zhang, Jiaming Xu, Bo Xu, Dong Yu:
MIMO Self-attentive RNN Beamformer for Multi-speaker Speech Separation. CoRR abs/2104.08450 (2021) - [i97]Rongzhi Gu, Shi-Xiong Zhang, Yuexian Zou, Dong Yu:
Complex Neural Spatial Filter: Enhancing Multi-channel Target Speech Separation in Complex Domain. CoRR abs/2104.12359 (2021) - [i96]Zhao You, Shulin Feng, Dan Su, Dong Yu:
SpeechMoE: Scaling to Large Acoustic Models with Dynamic Routing Mixture of Experts. CoRR abs/2105.03036 (2021) - [i95]Liqiang He, Shulin Feng, Dan Su, Dong Yu:
Latency-Controlled Neural Architecture Search for Streaming Speech Recognition. CoRR abs/2105.03643 (2021) - [i94]Max W. Y. Lam, Jun Wang, Chao Weng, Dan Su, Dong Yu:
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition. CoRR abs/2106.04275 (2021) - [i93]Max W. Y. Lam, Jun Wang, Rongjie Huang, Dan Su, Dong Yu:
Bilateral Denoising Diffusion Models. CoRR abs/2108.11514 (2021) - [i92]Songxiang Liu, Shan Yang, Dan Su, Dong Yu:
Referee: Towards reference-free cross-speaker style transfer with low-quality data for expressive speech synthesis. CoRR abs/2109.03439 (2021) - [i91]Xintong Yu, Hongming Zhang, Yangqiu Song, Changshui Zhang, Kun Xu, Dong Yu:
Exophoric Pronoun Resolution in Dialogues with Topic Regularization. CoRR abs/2109.04787 (2021) - [i90]Anton Ratnarajah, Shi-Xiong Zhang, Meng Yu, Zhenyu Tang, Dinesh Manocha, Dong Yu:
FAST-RIR: Fast neural diffuse room impulse response generator. CoRR abs/2110.04057 (2021) - [i89]Wenlin Yao, Xiaoman Pan, Lifeng Jin, Jianshu Chen, Dian Yu, Dong Yu:
Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories. CoRR abs/2110.14091 (2021) - [i88]Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu:
Joint AEC and Beamforming with Double-Talk Detection using RNN-Transformer. CoRR abs/2111.04904 (2021) - [i87]Songxiang Liu, Dan Su, Dong Yu:
Meta-Voice: Fast few-shot style transfer for expressive voice cloning using meta learning. CoRR abs/2111.07218 (2021) - [i86]Yiwen Shao, Shi-Xiong Zhang, Dong Yu:
Multi-Channel Multi-Speaker ASR Using 3D Spatial Feature. CoRR abs/2111.11023 (2021) - [i85]Zhao You, Shulin Feng, Dan Su, Dong Yu:
SpeechMoE2: Mixture-of-Experts Model with Improved Routing. CoRR abs/2111.11831 (2021) - [i84]Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu:
Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization. CoRR abs/2111.15016 (2021) - [i83]Jinchuan Tian, Jianwei Yu, Chao Weng, Shi-Xiong Zhang, Dan Su, Dong Yu, Yuexian Zou:
Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI. CoRR abs/2112.02498 (2021)
- 2020
- [j48]Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Lianwu Chen, Yuexian Zou, Dong Yu:
Multi-Modal Multi-Channel Target Speech Separation. IEEE J. Sel. Top. Signal Process. 14(3): 530-541 (2020) - [j47]Ke Tan, Yong Xu, Shi-Xiong Zhang, Meng Yu, Dong Yu:
Audio-Visual Speech Separation and Dereverberation With a Two-Stage Multimodal Network. IEEE J. Sel. Top. Signal Process. 14(3): 542-553 (2020) - [j46]Shan Yang, Heng Lu, Shiyin Kang, Liumeng Xue, Jinba Xiao, Dan Su, Lei Xie, Dong Yu:
On the localness modeling for the self-attention based end-to-end speech synthesis. Neural Networks 125: 121-130 (2020) - [j45]Kai Sun, Dian Yu, Dong Yu, Claire Cardie:
Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension. Trans. Assoc. Comput. Linguistics 8: 141-155 (2020) - [j44]Weiwei Lin, Man-Wai Mak, Na Li, Dan Su, Dong Yu:
A Framework for Adapting DNN Speaker Embedding Across Languages. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2810-2822 (2020) - [c189]Lifeng Jin, Linfeng Song, Yue Zhang, Kun Xu, Wei-Yun Ma, Dong Yu:
Relation Extraction Exploiting Full Dependency Forests. AAAI 2020: 8034-8041 - [c188]Kaiqiang Song, Logan Lebanoff, Qipeng Guo, Xipeng Qiu, Xiangyang Xue, Chen Li, Dong Yu, Fei Liu:
Joint Parsing and Generation for Abstractive Summarization. AAAI 2020: 8894-8901 - [c187]Kun Xu, Linfeng Song, Yansong Feng, Yan Song, Dong Yu:
Coordinated Reasoning for Cross-Lingual Knowledge Graph Alignment. AAAI 2020: 9354-9361 - [c186]Zhenyi Wang, Xiaoyang Wang, Bang An, Dong Yu, Changyou Chen:
Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints. ACL 2020: 1072-1086 - [c185]Jie Lei, Liwei Wang, Yelong Shen, Dong Yu, Tamara L. Berg, Mohit Bansal:
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning. ACL 2020: 2603-2614 - [c184]Dian Yu, Kai Sun, Claire Cardie, Dong Yu:
Dialogue-Based Relation Extraction. ACL 2020: 4927-4940 - [c183]Linfeng Song, Kun Xu, Yue Zhang, Jianshu Chen, Dong Yu:
ZPR2: Joint Zero Pronoun Recovery and Resolution using Multi-Task Learning and BERT. ACL 2020: 5429-5434 - [c182]Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu:
Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension. ACL 2020: 6751-6761 - [c181]Linfeng Song, Ante Wang, Jinsong Su, Yue Zhang, Kun Xu, Yubin Ge, Dong Yu:
Structural Information Preserving for Graph-to-Text Generation. ACL 2020: 7987-7998 - [c180]Qiao Tian, Zewang Zhang, Ling-Hui Chen, Heng Lu, Chengzhu Yu, Chao Weng, Dong Yu:
The Tencent speech synthesis system for Blizzard Challenge 2020. Blizzard Challenge / Voice Conversion Challenge 2020 - [c179]Yiwu Zhong, Liwei Wang, Jianshu Chen, Dong Yu, Yin Li:
Comprehensive Image Captioning via Scene Graph Decomposition. ECCV (14) 2020: 211-229 - [c178]Sangwoo Cho, Kaiqiang Song, Chen Li, Dong Yu, Hassan Foroosh, Fei Liu:
Better Highlighting: Creating Sub-Sentence Summary Highlights. EMNLP (1) 2020: 6282-6300 - [c177]Kun Xu, Haochen Tan, Linfeng Song, Han Wu, Haisong Zhang, Linqi Song, Dong Yu:
Semantic Role Labeling Guided Multi-turn Dialogue ReWriter. EMNLP (1) 2020: 6632-6639 - [c176]Songxiang Liu, Disong Wang, Yuewen Cao, Lifa Sun, Xixin Wu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
End-To-End Accent Conversion Without Using Native Utterances. ICASSP 2020: 6289-6293 - [c175]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
Mixup-breakdown: A Consistency Training Method for Improving Generalization of Speech Separation Models. ICASSP 2020: 6374-6378 - [c174]Weiwei Lin, Man-Wai Mak, Na Li, Dan Su, Dong Yu:
Multi-Level Deep Neural Network Adaptation for Speaker Verification Using MMD and Consistency Regularization. ICASSP 2020: 6839-6843 - [c173]Zhenyu Tang, Lianwu Chen, Bo Wu, Dong Yu, Dinesh Manocha:
Improving Reverberant Speech Training Using Diffuse Acoustic Simulation. ICASSP 2020: 6969-6973 - [c172]Jianwei Yu, Shi-Xiong Zhang, Jian Wu, Shahram Ghorbani, Bo Wu, Shiyin Kang, Shansong Liu, Xunying Liu, Helen Meng, Dong Yu:
Audio-Visual Recognition of Overlapped Speech for the LRS2 Dataset. ICASSP 2020: 6984-6988 - [c171]Xuan Ji, Meng Yu, Chunlei Zhang, Dan Su, Tao Yu, Xiaoyu Liu, Dong Yu:
Speaker-Aware Target Speaker Enhancement by Jointly Learning with Speaker Embedding Extraction. ICASSP 2020: 7294-7298 - [c170]Aswin Shanmugam Subramanian, Chao Weng, Meng Yu, Shi-Xiong Zhang, Yong Xu, Shinji Watanabe, Dong Yu:
Far-Field Location Guided Target Speech Extraction Using End-to-End Speech Recognition Objectives. ICASSP 2020: 7299-7303 - [c169]Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu:
Enhancing End-to-End Multi-Channel Speech Separation Via Spatial Feature Learning. ICASSP 2020: 7319-7323 - [c168]Xuan Ji, Meng Yu, Jie Chen, Jimeng Zheng, Dan Su, Dong Yu:
Integration of Multi-Look Beamformers for Multi-Channel Keyword Spotting. ICASSP 2020: 7464-7468 - [c167]Yuewen Cao, Songxiang Liu, Xixin Wu, Shiyin Kang, Peng Liu, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Code-Switched Speech Synthesis Using Bilingual Phonetic Posteriorgram with Only Monolingual Corpora. ICASSP 2020: 7619-7623 - [c166]Zhao You, Dan Su, Jie Chen, Chao Weng, Dong Yu:
DFSMN-SAN with Persistent Memory Model for Automatic Speech Recognition. ICASSP 2020: 7704-7708 - [c165]Chengqi Deng, Chengzhu Yu, Heng Lu, Chao Weng, Dong Yu:
PitchNet: Unsupervised Singing Voice Conversion with Pitch Adversarial Network. ICASSP 2020: 7749-7753 - [c164]Yiheng Huang, Jinchuan Tian, Lei Han, Guangsen Wang, Xingcheng Song, Dan Su, Dong Yu:
A Random Gossip BMUF Process for Neural Language Modeling. ICASSP 2020: 7959-7963 - [c163]Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Chao Weng, Jianming Liu, Dong Yu:
Neural Spatio-Temporal Beamformer for Target Speech Separation. INTERSPEECH 2020: 56-60 - [c162]Meng Yu, Xuan Ji, Bo Wu, Dan Su, Dong Yu:
End-to-End Multi-Look Keyword Spotting. INTERSPEECH 2020: 66-70 - [c161]Chao Weng, Chengzhu Yu, Jia Cui, Chunlei Zhang, Dong Yu:
Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition. INTERSPEECH 2020: 966-970 - [c160]Yusong Wu, Shengchen Li, Chengzhu Yu, Heng Lu, Chao Weng, Liqiang Zhang, Dong Yu:
Peking Opera Synthesis via Duration Informed Attention Network. INTERSPEECH 2020: 1226-1230 - [c159]Liqiang Zhang, Chengzhu Yu, Heng Lu, Chao Weng, Chunlei Zhang, Yusong Wu, Xiang Xie, Zijin Li, Dong Yu:
DurIAN-SC: Duration Informed Attention Network Based Singing Voice Conversion System. INTERSPEECH 2020: 1231-1235 - [c158]Xu Li, Na Li, Jinghua Zhong, Xixin Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification. INTERSPEECH 2020: 1540-1544 - [c157]Chengzhu Yu, Heng Lu, Na Hu, Meng Yu, Chao Weng, Kun Xu, Peng Liu, Deyi Tuo, Shiyin Kang, Guangzhi Lei, Dan Su, Dong Yu:
DurIAN: Duration Informed Attention Network for Speech Synthesis. INTERSPEECH 2020: 2027-2031 - [c156]Jianwei Yu, Bo Wu, Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Dong Yu, Xunying Liu, Helen Meng:
Audio-Visual Multi-Channel Recognition of Overlapped Speech. INTERSPEECH 2020: 3496-3500 - [c155]Songxiang Liu, Yuewen Cao, Shiyin Kang, Na Hu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Transferring Source Style in Non-Parallel Voice Conversion. INTERSPEECH 2020: 4721-4725 - [c154]Dong Yu:
Building Digital Human. ACM Multimedia 2020: 4801 - [c153]Kaiqiang Song, Fei Liu, Chen Li, Xiaoyang Wang, Dong Yu:
Automatic Summarization of Open-Domain Podcast Episodes. TREC 2020 - [i82]Jianwei Yu, Shi-Xiong Zhang, Jian Wu, Shahram Ghorbani, Bo Wu, Shiyin Kang, Shansong Liu, Xunying Liu, Helen Meng, Dong Yu:
Audio-visual Recognition of Overlapped speech for the LRS2 dataset. CoRR abs/2001.01656 (2020) - [i81]Hongming Zhang, Jiaxin Bai, Yan Song, Kun Xu, Changlong Yu, Yangqiu Song, Wilfred Ng, Dong Yu:
Multiplex Word Embeddings for Selectional Preference Acquisition. CoRR abs/2001.02836 (2020) - [i80]Kun Xu, Linfeng Song, Yansong Feng, Yan Song, Dong Yu:
Coordinated Reasoning for Cross-Lingual Knowledge Graph Alignment. CoRR abs/2001.08728 (2020) - [i79]Mutian He, Yangqiu Song, Kun Xu, Dong Yu:
On the Role of Conceptualization in Commonsense Knowledge Graph Construction. CoRR abs/2003.03239 (2020) - [i78]Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu:
Enhancing End-to-End Multi-channel Speech Separation via Spatial Feature Learning. CoRR abs/2003.03927 (2020) - [i77]Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Lianwu Chen, Yuexian Zou, Dong Yu:
Multi-modal Multi-channel Target Speech Separation. CoRR abs/2003.07032 (2020) - [i76]Dian Yu, Kai Sun, Claire Cardie, Dong Yu:
Dialogue-Based Relation Extraction. CoRR abs/2004.08056 (2020) - [i75]Zhenyi Wang, Xiaoyang Wang, Bang An, Dong Yu, Changyou Chen:
Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints. CoRR abs/2005.00969 (2020) - [i74]Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Chao Weng, Jianming Liu, Dong Yu:
Neural Spatio-Temporal Beamformer for Target Speech Separation. CoRR abs/2005.03889 (2020) - [i73]Jie Lei, Liwei Wang, Yelong Shen, Dong Yu, Tamara L. Berg, Mohit Bansal:
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning. CoRR abs/2005.05402 (2020) - [i72]Hongyu Gong, Yelong Shen, Dian Yu, Jianshu Chen, Dong Yu:
Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension. CoRR abs/2005.08056 (2020) - [i71]Jianwei Yu, Bo Wu, Rongzhi Gu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Dong Yu, Xunying Liu, Helen Meng:
Audio-visual Multi-channel Recognition of Overlapped Speech. CoRR abs/2005.08571 (2020) - [i70]Meng Yu, Xuan Ji, Bo Wu, Dan Su, Dong Yu:
End-to-End Multi-Look Keyword Spotting. CoRR abs/2005.10386 (2020) - [i69]Xu Li, Na Li, Jinghua Zhong, Xixin Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification. CoRR abs/2006.06186 (2020) - [i68]Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng:
Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams. CoRR abs/2006.11610 (2020) - [i67]Liwei Wang, Jing Huang, Yin Li, Kun Xu, Zhengyuan Yang, Dong Yu:
Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation. CoRR abs/2007.01951 (2020) - [i66]Yiwu Zhong, Liwei Wang, Jianshu Chen, Dong Yu, Yin Li:
Comprehensive Image Captioning via Scene Graph Decomposition. CoRR abs/2007.11731 (2020) - [i65]Liqiang Zhang, Chengzhu Yu, Heng Lu, Chao Weng, Chunlei Zhang, Yusong Wu, Xiang Xie, Zijin Li, Dong Yu:
DurIAN-SC: Duration Informed Attention Network based Singing Voice Conversion System. CoRR abs/2008.03009 (2020) - [i64]Yusong Wu, Shengchen Li, Chengzhu Yu, Heng Lu, Chao Weng, Liqiang Zhang, Dong Yu:
Peking Opera Synthesis via Duration Informed Attention Network. CoRR abs/2008.03029 (2020) - [i63]Daniel Michelsanti, Zheng-Hua Tan, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu, Jesper Jensen:
An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation. CoRR abs/2008.09586 (2020) - [i62]Liqiang He, Dan Su, Dong Yu:
Learned Transferable Architectures Can Surpass Hand-Designed Architectures for Large Scale Speech Recognition. CoRR abs/2008.11589 (2020) - [i61]Kai Sun, Dian Yu, Jianshu Chen, Dong Yu, Claire Cardie:
Improving Machine Reading Comprehension with Contextualized Commonsense Knowledge. CoRR abs/2009.05831 (2020) - [i60]Kun Xu, Haochen Tan, Linfeng Song, Han Wu, Haisong Zhang, Linqi Song, Dong Yu:
Semantic Role Labeling Guided Multi-turn Dialogue ReWriter. CoRR abs/2010.01417 (2020) - [i59]Xiangkai Lin, Yajing Chen, Linchao Bao, Haoxian Zhang, Sheng Wang, Xuefei Zhe, Xinwei Jiang, Jue Wang, Dong Yu, Zhengyou Zhang:
High-Fidelity 3D Digital Human Creation from RGB-D Selfies. CoRR abs/2010.05562 (2020) - [i58]Sangwoo Cho, Kaiqiang Song, Chen Li, Dong Yu, Hassan Foroosh, Fei Liu:
Better Highlighting: Creating Sub-Sentence Summary Highlights. CoRR abs/2010.10566 (2020) - [i57]Xu Li, Na Li, Chao Weng, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Replay and Synthetic Speech Detection with Res2net Architecture. CoRR abs/2010.15006 (2020) - [i56]Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu:
Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization. CoRR abs/2011.00091 (2020) - [i55]Kaiqiang Song, Chen Li, Xiaoyang Wang, Dong Yu, Zhe Feng:
Automatic Summarization of Open-Domain Podcast Episodes. CoRR abs/2011.04132 (2020) - [i54]Zhaoheng Ni, Yong Xu, Meng Yu, Bo Wu, Shi-Xiong Zhang, Dong Yu, Michael I. Mandel:
WPD++: An Improved Neural Beamformer for Simultaneous Speech Separation and Dereverberation. CoRR abs/2011.09162 (2020) - [i53]Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu:
Improving RNN Transducer With Target Speaker Extraction and Neural Uncertainty Estimation. CoRR abs/2011.13393 (2020) - [i52]Haohan Guo, Heng Lu, Na Hu, Chunlei Zhang, Shan Yang, Lei Xie, Dan Su, Dong Yu:
Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training. CoRR abs/2012.01837 (2020) - [i51]Wei Xia, Chunlei Zhang, Chao Weng, Meng Yu, Dong Yu:
Self-supervised Text-independent Speaker Verification using Prototypical Momentum Contrastive Learning. CoRR abs/2012.07178 (2020) - [i50]Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Donald S. Williamson, Dong Yu:
Multi-channel Multi-frame ADL-MVDR for Target Speech Separation. CoRR abs/2012.13442 (2020) - [i49]Jie Hao, Linfeng Song, Liwei Wang, Kun Xu, Zhaopeng Tu, Dong Yu:
Robust Dialogue Utterance Rewriting as Sequence Tagging. CoRR abs/2012.14535 (2020) - [i48]Haisong Zhang, Lemao Liu, Haiyun Jiang, Yangming Li, Enbo Zhao, Kun Xu, Linfeng Song, Suncong Zheng, Botong Zhou, Jianchen Zhu, Xiao Feng, Tao Chen, Tao Yang, Dong Yu, Feng Zhang, Zhanhui Kang, Shuming Shi:
TexSmart: A Text Understanding System for Fine-Grained NER and Enhanced Semantic Analysis. CoRR abs/2012.15639 (2020)
2010 – 2019
- 2019
- [j43]Yanmin Qian, Chao Weng, Xuankai Chang, Shuai Wang, Dong Yu:
Erratum to: Past review, current progress, and challenges ahead on the cocktail party problem. Frontiers Inf. Technol. Electron. Eng. 20(3): 438 (2019) - [j42]Kai Sun, Dian Yu, Jianshu Chen, Dong Yu, Yejin Choi, Claire Cardie:
DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension. Trans. Assoc. Comput. Linguistics 7: 217-231 (2019) - [c152]Ying Lin, Liyuan Liu, Heng Ji, Dong Yu, Jiawei Han:
Reliability-aware Dynamic Feature Composition for Name Tagging. ACL (1) 2019: 165-174 - [c151]Hongming Zhang, Yan Song, Yangqiu Song, Dong Yu:
Knowledge-aware Pronoun Coreference Resolution. ACL (1) 2019: 867-876 - [c150]Kun Xu, Liwei Wang, Mo Yu, Yansong Feng, Yan Song, Zhiguo Wang, Dong Yu:
Cross-lingual Knowledge Graph Alignment via Graph Matching Neural Network. ACL (1) 2019: 3156-3161 - [c149]Xiaoman Pan, Kai Sun, Dian Yu, Jianshu Chen, Heng Ji, Claire Cardie, Dong Yu:
Improving Question Answering with External Knowledge. MRQA@EMNLP 2019: 27-37 - [c148]Yao Du, Zhiyong Wu, Shiyin Kang, Dan Su, Dong Yu, Helen Meng:
Prosodic Structure Prediction using Deep Self-attention Neural Network. APSIPA 2019: 320-324 - [c147]Rongzhi Gu, Junyi Peng, Yuexian Zou, Dong Yu:
Alleviate Cross-chunk Permutation through Chunk-level Speaker Embedding for Blind Speech Separation. APSIPA 2019: 325-331 - [c146]Yao Du, Zhiyong Wu, Shiyin Kang, Dan Su, Dong Yu, Helen Meng:
Automatic Prosodic Structure Labeling using DNN-BGRU-CRF Hybrid Neural Network. APSIPA 2019: 1234-1238 - [c145]Junyi Peng, Yuexian Zou, Na Li, Deyi Tuo, Dan Su, Meng Yu, Chunlei Zhang, Dong Yu:
Syllable-Dependent Discriminative Learning for Small Footprint Text-Dependent Speaker Verification. ASRU 2019: 350-357 - [c144]Bo Wu, Meng Yu, Lianwu Chen, Mingjie Jin, Dan Su, Dong Yu:
Improving Speech Enhancement with Phonetic Embedding Features. ASRU 2019: 645-651 - [c143]Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu:
Time Domain Audio Visual Speech Separation. ASRU 2019: 667-673 - [c142]Hai Wang, Dian Yu, Kai Sun, Jianshu Chen, Dong Yu:
Improving Pre-Trained Multilingual Model with Vocabulary Expansion. CoNLL 2019: 316-327 - [c141]Hai Wang, Dian Yu, Kai Sun, Jianshu Chen, Dong Yu, David A. McAllester, Dan Roth:
Evidence Sentence Extraction for Machine Reading Comprehension. CoNLL 2019: 696-707 - [c140]Hongming Zhang, Jiaxin Bai, Yan Song, Kun Xu, Changlong Yu, Yangqiu Song, Wilfred Ng, Dong Yu:
Multiplex Word Embeddings for Selectional Preference Acquisition. EMNLP/IJCNLP (1) 2019: 5246-5255 - [c139]Lianwu Chen, Meng Yu, Dan Su, Dong Yu:
Multi-band PIT and Model Integration for Improved Multi-channel Speech Separation. ICASSP 2019: 705-709 - [c138]Changhao Shan, Chao Weng, Guangsen Wang, Dan Su, Min Luo, Dong Yu, Lei Xie:
Component Fusion: Learning Replaceable Language Model Component for End-to-end Speech Recognition System. ICASSP 2019: 5631-5635 - [c137]Shi-Xiong Zhang, Yifan Gong, Dong Yu:
Encrypted Speech Recognition Using Deep Polynomial Networks. ICASSP 2019: 5691-5695 - [c136]Jun Wang, Dan Su, Jie Chen, Shulin Feng, Dongpeng Ma, Na Li, Dong Yu:
Learning Discriminative Features in Sequence Training without Requiring Framewise Labelled Data. ICASSP 2019: 5696-5700 - [c135]Changhao Shan, Chao Weng, Guangsen Wang, Dan Su, Min Luo, Dong Yu, Lei Xie:
Investigating End-to-end Speech Recognition for Mandarin-english Code-switching. ICASSP 2019: 6056-6060 - [c134]Yichi Zhang, Meng Yu, Na Li, Chengzhu Yu, Jia Cui, Dong Yu:
Seq2Seq Attentional Siamese Neural Networks for Text-dependent Speaker Verification. ICASSP 2019: 6131-6135 - [c133]Peidong Wang, Jia Cui, Chao Weng, Dong Yu:
Token-wise Training for Attention Based End-to-end Speech Recognition. ICASSP 2019: 6276-6280 - [c132]Rongjin Li, Na Li, Deyi Tuo, Meng Yu, Dan Su, Dong Yu:
Boundary Discriminative Large Margin Cosine Loss for Text-independent Speaker Verification. ICASSP 2019: 6321-6325 - [c131]Zhao You, Dan Su, Dong Yu:
Teach an All-rounder with Experts in Different Domains. ICASSP 2019: 6425-6429 - [c130]Chao Weng, Dong Yu:
A Comparison of Lattice-free Discriminative Training Criteria for Purely Sequence-trained Neural Network Acoustic Models. ICASSP 2019: 6430-6434 - [c129]Yong Xu, Chao Weng, Like Hui, Jianming Liu, Meng Yu, Dan Su, Dong Yu:
Joint Training of Complex Ratio Mask Based Beamformer and Acoustic Model for Noise Robust ASR. ICASSP 2019: 6745-6749 - [c128]Shan Yang, Heng Lu, Shiyin Kang, Lei Xie, Dong Yu:
Enhancing Hybrid Self-attention Structure with Relative-position-aware Bias for Speech Synthesis. ICASSP 2019: 6910-6914 - [c127]Mu Wang, Xixin Wu, Zhiyong Wu, Shiyin Kang, Deyi Tuo, Guangzhi Li, Dan Su, Dong Yu, Helen Meng:
Quasi-fully Convolutional Neural Network with Variational Inference for Speech Synthesis. ICASSP 2019: 7060-7064 - [c126]Zhengyuan Yang, Boqing Gong, Liwei Wang, Wenbing Huang, Dong Yu, Jiebo Luo:
A Fast and Accurate One-Stage Approach to Visual Grounding. ICCV 2019: 4682-4692 - [c125]Chih-Kuan Yeh, Jianshu Chen, Chengzhu Yu, Dong Yu:
Unsupervised Speech Recognition via Segmental Empirical Output Distribution Matching. ICLR (Poster) 2019 - [c124]Ling Luo, Xiang Ao, Yan Song, Jinyao Li, Xiaopeng Yang, Qing He, Dong Yu:
Unsupervised Neural Aspect Extraction with Sememes. IJCAI 2019: 5123-5129 - [c123]Peidong Wang, Jia Cui, Chao Weng, Dong Yu:
Large Margin Training for Attention Based End-to-End Speech Recognition. INTERSPEECH 2019: 246-250 - [c122]Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu:
Improved Speaker-Dependent Separation for CHiME-5 Challenge. INTERSPEECH 2019: 466-470 - [c121]Dongyang Dai, Zhiyong Wu, Shiyin Kang, Xixin Wu, Jia Jia, Dan Su, Dong Yu, Helen Meng:
Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-Trained BERT. INTERSPEECH 2019: 2090-2094 - [c120]Max W. Y. Lam, Jun Wang, Xunying Liu, Helen Meng, Dan Su, Dong Yu:
Extract, Adapt and Recognize: An End-to-End Neural Network for Corrupted Monaural Speech Recognition. INTERSPEECH 2019: 2778-2782 - [c119]Rongzhi Gu, Lianwu Chen, Shi-Xiong Zhang, Jimeng Zheng, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu:
Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information. INTERSPEECH 2019: 4290-4294 - [c118]Fahimeh Bahmaninezhad, Jian Wu, Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu:
A Comprehensive Study of Speech Separation: Spectrogram vs Waveform Separation. INTERSPEECH 2019: 4574-4578 - [c117]Kai Sun, Dian Yu, Dong Yu, Claire Cardie:
Improving Machine Reading Comprehension with General Reading Strategies. NAACL-HLT (1) 2019: 2633-2643 - [i47]Kai Sun, Dian Yu, Jianshu Chen, Dong Yu, Yejin Choi, Claire Cardie:
DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension. CoRR abs/1902.00164 (2019) - [i46]Xiaoman Pan, Kai Sun, Dian Yu, Heng Ji, Dong Yu:
Improving Question Answering with External Knowledge. CoRR abs/1902.00993 (2019) - [i45]Hai Wang, Dian Yu, Kai Sun, Jianshu Chen, Dong Yu, Dan Roth, David A. McAllester:
Evidence Sentence Extraction for Machine Reading Comprehension. CoRR abs/1902.08852 (2019) - [i44]Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu:
Time Domain Audio Visual Speech Separation. CoRR abs/1904.03760 (2019) - [i43]Jian Wu, Yong Xu, Shi-Xiong Zhang, Lianwu Chen, Meng Yu, Lei Xie, Dong Yu:
Improved Speaker-Dependent Separation for CHiME-5 Challenge. CoRR abs/1904.03792 (2019) - [i42]Kai Sun, Dian Yu, Dong Yu, Claire Cardie:
Probing Prior Knowledge Needed in Challenging Chinese Machine Reading Comprehension. CoRR abs/1904.09679 (2019) - [i41]Shi-Xiong Zhang, Yifan Gong, Dong Yu:
Encrypted Speech Recognition using Deep Polynomial Networks. CoRR abs/1905.05605 (2019) - [i40]Rongzhi Gu, Jian Wu, Shi-Xiong Zhang, Lianwu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu:
End-to-End Multi-Channel Speech Separation. CoRR abs/1905.06286 (2019) - [i39]Jun Wang, Dan Su, Jie Chen, Shulin Feng, Dongpeng Ma, Na Li, Dong Yu:
Learning discriminative features in sequence training without requiring framewise labelled data. CoRR abs/1905.06907 (2019) - [i38]Fahimeh Bahmaninezhad, Jian Wu, Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu:
A comprehensive study of speech separation: spectrogram vs waveform separation. CoRR abs/1905.07497 (2019) - [i37]Kun Xu, Liwei Wang, Mo Yu, Yansong Feng, Yan Song, Zhiguo Wang, Dong Yu:
Cross-lingual Knowledge Graph Alignment via Graph Matching Neural Network. CoRR abs/1905.11605 (2019) - [i36]Guoyin Wang, Yan Song, Yue Zhang, Dong Yu:
Learning Word Embeddings with Domain Awareness. CoRR abs/1906.03249 (2019) - [i35]Hongming Zhang, Yan Song, Yangqiu Song, Dong Yu:
Knowledge-aware Pronoun Coreference Resolution. CoRR abs/1907.03663 (2019) - [i34]Zhenyu Tang, Lianwu Chen, Bo Wu, Dong Yu, Dinesh Manocha:
Improving Reverberant Speech Training Using Diffuse Acoustic Simulation. CoRR abs/1907.03988 (2019) - [i33]Zhao You, Dan Su, Dong Yu:
Teach an all-rounder with experts in different domains. CoRR abs/1907.05698 (2019) - [i32]Zhengyuan Yang, Boqing Gong, Liwei Wang, Wenbing Huang, Dong Yu, Jiebo Luo:
A Fast and Accurate One-Stage Approach to Visual Grounding. CoRR abs/1908.06354 (2019) - [i31]Peng Liu, Xixin Wu, Shiyin Kang, Guangzhi Li, Dan Su, Dong Yu:
Maximizing Mutual Information for Tacotron. CoRR abs/1909.01145 (2019) - [i30]Chengzhu Yu, Heng Lu, Na Hu, Meng Yu, Chao Weng, Kun Xu, Peng Liu, Deyi Tuo, Shiyin Kang, Guangzhi Lei, Dan Su, Dong Yu:
DurIAN: Duration Informed Attention Network For Multimodal Synthesis. CoRR abs/1909.01700 (2019) - [i29]Ke Tan, Yong Xu, Shi-Xiong Zhang, Meng Yu, Dong Yu:
Audio-Visual Speech Separation and Dereverberation with a Two-Stage Multimodal Network. CoRR abs/1909.07352 (2019) - [i28]Yiheng Huang, Jinchuan Tian, Lei Han, Guangsen Wang, Xingcheng Song, Dan Su, Dong Yu:
A Random Gossip BMUF Process for Neural Language Modeling. CoRR abs/1909.09010 (2019) - [i27]Hai Wang, Dian Yu, Kai Sun, Jianshu Chen, Dong Yu:
Improving Pre-Trained Multilingual Models with Vocabulary Expansion. CoRR abs/1909.12440 (2019) - [i26]Xingcheng Song, Guangsen Wang, Zhiyong Wu, Yiheng Huang, Dan Su, Dong Yu, Helen Meng:
Speech-XLNet: Unsupervised Acoustic Model Pretraining For Self-Attention Networks. CoRR abs/1910.10387 (2019) - [i25]Sangwoo Cho, Chen Li, Dong Yu, Hassan Foroosh, Fei Liu:
Multi-Document Summarization with Determinantal Point Processes and Contextualized Representations. CoRR abs/1910.11411 (2019) - [i24]Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
Mixup-breakdown: a consistency training method for improving generalization of speech separation models. CoRR abs/1910.13253 (2019) - [i23]Zhao You, Dan Su, Jie Chen, Chao Weng, Dong Yu:
DFSMN-SAN with Persistent Memory Model for Automatic Speech Recognition. CoRR abs/1910.13282 (2019) - [i22]Kaiqiang Song, Logan Lebanoff, Qipeng Guo, Xipeng Qiu, Xiangyang Xue, Chen Li, Dong Yu, Fei Liu:
Joint Parsing and Generation for Abstractive Summarization. CoRR abs/1911.10389 (2019) - [i21]Chao Weng, Chengzhu Yu, Jia Cui, Chunlei Zhang, Dong Yu:
Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition. CoRR abs/1911.12487 (2019) - [i20]Chengqi Deng, Chengzhu Yu, Heng Lu, Chao Weng, Dong Yu:
PitchNet: Unsupervised Singing Voice Conversion with Pitch Adversarial Network. CoRR abs/1912.01852 (2019) - [i19]Fahimeh Bahmaninezhad, Shi-Xiong Zhang, Yong Xu, Meng Yu, John H. L. Hansen, Dong Yu:
A Unified Framework for Speech Separation. CoRR abs/1912.07814 (2019) - [i18]Liqiang Zhang, Chengzhu Yu, Heng Lu, Chao Weng, Yusong Wu, Xiang Xie, Zijin Li, Dong Yu:
Learning Singing From Speech. CoRR abs/1912.10128 (2019) - [i17]Yusong Wu, Shengchen Li, Chengzhu Yu, Heng Lu, Chao Weng, Liqiang Zhang, Dong Yu:
Synthesising Expressiveness in Peking Opera via Duration Informed Attention Network. CoRR abs/1912.12010 (2019)
- 2018
- [j41]Yanmin Qian, Chao Weng, Xuankai Chang, Shuai Wang, Dong Yu:
Past review, current progress, and challenges ahead on the cocktail party problem. Frontiers Inf. Technol. Electron. Eng. 19(1): 40-63 (2018) - [j40]Yanmin Qian, Chao Weng, Xuankai Chang, Shuai Wang, Dong Yu:
Erratum to: Past review, current progress, and challenges ahead on the cocktail party problem. Frontiers Inf. Technol. Electron. Eng. 19(4): 582 (2018) - [j39]Yanmin Qian, Xuankai Chang, Dong Yu:
Single-channel multi-talker speech recognition with permutation invariant training. Speech Commun. 104: 1-11 (2018) - [c116]Wenhu Chen, Jianshu Chen, Yu Su, Xin Wang, Dong Yu, Xifeng Yan, William Yang Wang:
XL-NBT: A Cross-lingual Neural Belief Tracking Framework. EMNLP 2018: 414-424 - [c115]Tian Tan, Yanmin Qian, Dong Yu:
Knowledge Transfer in Permutation Invariant Training for Single-Channel Multi-Talker Speech Recognition. ICASSP 2018: 5714-5718 - [c114]Xuankai Chang, Yanmin Qian, Dong Yu:
Adaptive Permutation Invariant Training with Auxiliary Information for Monaural Multi-Talker Speech Recognition. ICASSP 2018: 5974-5978 - [c113]Lianwu Chen, Meng Yu, Yanmin Qian, Dan Su, Dong Yu:
Permutation Invariant Training of Generative Adversarial Network for Monaural Speech Separation. INTERSPEECH 2018: 302-306 - [c112]Jun Wang, Jie Chen, Dan Su, Lianwu Chen, Meng Yu, Yanmin Qian, Dong Yu:
Deep Extractor Network for Target Speaker Recovery from Single Channel Speech Mixtures. INTERSPEECH 2018: 307-311 - [c111]Chao Weng, Jia Cui, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu:
Improving Attention Based Sequence-to-Sequence Models for End-to-End English Conversational Speech Recognition. INTERSPEECH 2018: 761-765 - [c110]Chengzhu Yu, Chunlei Zhang, Chao Weng, Jia Cui, Dong Yu:
A Multistage Training Framework for Acoustic-to-Word Model. INTERSPEECH 2018: 786-790 - [c109]Xuankai Chang, Yanmin Qian, Dong Yu:
Monaural Multi-Talker Speech Recognition with Attention Mechanism and Gated Convolutional Networks. INTERSPEECH 2018: 1586-1590 - [c108]Na Li, Deyi Tuo, Dan Su, Zhifeng Li, Dong Yu:
Deep Discriminative Embeddings for Duration Robust Speaker Verification. INTERSPEECH 2018: 2262-2266 - [c107]Meng Yu, Xuan Ji, Yi Gao, Lianwu Chen, Jie Chen, Jimeng Zheng, Dan Su, Dong Yu:
Text-Dependent Speech Enhancement for Small-Footprint Robust Keyword Detection. INTERSPEECH 2018: 2613-2617 - [c106]Xixin Wu, Yuewen Cao, Mu Wang, Songxiang Liu, Shiyin Kang, Zhiyong Wu, Xunying Liu, Dan Su, Dong Yu, Helen Meng:
Rapid Style Adaptation Using Residual Error Embedding for Expressive Speech Synthesis. INTERSPEECH 2018: 3072-3076 - [c105]Mu Wang, Zhiyong Wu, Shiyin Kang, Xixin Wu, Jia Jia, Dan Su, Dong Yu, Helen Meng:
Speech Super-Resolution Using Parallel WaveNet. ISCSLP 2018: 260-264 - [c104]Chunlei Zhang, Chengzhu Yu, Chao Weng, Jia Cui, Dong Yu:
An Exploration of Directly Using Word as Acoustic Modeling Unit for Speech Recognition. SLT 2018: 64-69 - [c103]Jia Cui, Chao Weng, Guangsen Wang, Jun Wang, Peidong Wang, Chengzhu Yu, Dan Su, Dong Yu:
Improving Attention-Based End-to-End ASR Systems with Sequence-Based Loss Functions. SLT 2018: 353-360 - [i16]Dong Yu, Jinyu Li:
Recent Progresses in Deep Learning based Acoustic Models (Updated). CoRR abs/1804.09298 (2018) - [i15]Jun Wang, Jie Chen, Dan Su, Lianwu Chen, Meng Yu, Yanmin Qian, Dong Yu:
Deep Extractor Network for Target Speaker Recovery From Single Channel Speech Mixtures. CoRR abs/1807.08974 (2018) - [i14]Wenhu Chen, Jianshu Chen, Yu Su, Xin Wang, Dong Yu, Xifeng Yan, William Yang Wang:
XL-NBT: A Cross-lingual Neural Belief Tracking Framework. CoRR abs/1808.06244 (2018) - [i13]Kai Sun, Dian Yu, Dong Yu, Claire Cardie:
Improving Machine Reading Comprehension with General Reading Strategies. CoRR abs/1810.13441 (2018) - [i12]Chao Weng, Dong Yu:
A Comparison of Lattice-free Discriminative Training Criteria for Purely Sequence-Trained Neural Network Acoustic Models. CoRR abs/1811.03700 (2018) - [i11]Chih-Kuan Yeh, Jianshu Chen, Chengzhu Yu, Dong Yu:
Unsupervised Speech Recognition via Segmental Empirical Output Distribution Matching. CoRR abs/1812.09323 (2018)
- 2017
- [j38]Dong Yu, Jinyu Li:
Recent progresses in deep learning based acoustic models. IEEE CAA J. Autom. Sinica 4(3): 396-409 (2017) - [j37]Morten Kolbaek, Dong Yu, Zheng-Hua Tan, Jesper Jensen:
Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks. IEEE ACM Trans. Audio Speech Lang. Process. 25(10): 1901-1913 (2017) - [j36]Wayne Xiong, Jasha Droppo, Xuedong Huang, Frank Seide, Michael L. Seltzer, Andreas Stolcke, Dong Yu, Geoffrey Zweig:
Toward Human Parity in Conversational Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 25(12): 2410-2423 (2017) - [c102]Dong Yu, Morten Kolbæk, Zheng-Hua Tan, Jesper Jensen:
Permutation invariant training of deep models for speaker-independent multi-talker speech separation. ICASSP 2017: 241-245 - [c101]Wayne Xiong, Jasha Droppo, Xuedong Huang, Frank Seide, Mike Seltzer, Andreas Stolcke, Dong Yu, Geoffrey Zweig:
The Microsoft 2016 conversational speech recognition system. ICASSP 2017: 5255-5259 - [c100]Wenpeng Li, Binbin Zhang, Lei Xie, Dong Yu:
Empirical Evaluation of Parallel Training Algorithms on Acoustic Modeling. INTERSPEECH 2017: 528-532 - [c99]Dong Yu, Xuankai Chang, Yanmin Qian:
Recognizing Multi-Talker Speech with Permutation Invariant Training. INTERSPEECH 2017: 2456-2460 - [c98]Morten Kolbaek, Dong Yu, Zheng-Hua Tan, Jesper Jensen:
Joint separation and denoising of noisy multi-talker speech using recurrent neural networks and permutation invariant training. MLSP 2017: 1-6 - [p3]Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Michael I. Mandel, Liang Lu, John R. Hershey, Michael L. Seltzer, Guoguo Chen, Yu Zhang, Dong Yu:
Discriminative Beamforming with Phase-Aware Neural Networks for Speech Enhancement and Recognition. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 79-104 - [p2]Yu Zhang, Dong Yu, Guoguo Chen:
Advanced Recurrent Neural Networks for Automatic Speech Recognition. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 261-279 - [p1]Guoguo Chen, Yu Zhang, Dong Yu:
Sequence-Discriminative Training of Neural Networks. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 281-297 - [i10]Wenpeng Li, Binbin Zhang, Lei Xie, Dong Yu:
Empirical Evaluation of Parallel Training Algorithms on Acoustic Modeling. CoRR abs/1703.05880 (2017) - [i9]Morten Kolbæk, Dong Yu, Zheng-Hua Tan, Jesper Jensen:
Multi-talker Speech Separation and Tracing with Permutation Invariant Training of Deep Recurrent Neural Networks. CoRR abs/1703.06284 (2017) - [i8]Dong Yu, Xuankai Chang, Yanmin Qian:
Recognizing Multi-talker Speech with Permutation Invariant Training. CoRR abs/1704.01985 (2017) - [i7]Yanmin Qian, Xuankai Chang, Dong Yu:
Single-Channel Multi-talker Speech Recognition with Permutation Invariant Training. CoRR abs/1707.06527 (2017) - [i6]Morten Kolbæk, Dong Yu, Zheng-Hua Tan, Jesper Jensen:
Joint Separation and Denoising of Noisy Multi-talker Speech using Recurrent Neural Networks and Permutation Invariant Training. CoRR abs/1708.09588 (2017)
- 2016
- [j35]Yanmin Qian, Tian Tan, Dong Yu:
Neural Network Based Multi-Factor Aware Joint Training for Robust Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 24(12): 2231-2240 (2016) - [c97]Tian Tan, Yanmin Qian, Dong Yu, Souvik Kundu, Liang Lu, Khe Chai Sim, Xiong Xiao, Yu Zhang:
Speaker-aware training of LSTM-RNNs for acoustic modelling. ICASSP 2016: 5280-5284 - [c96]Yu Zhang, Ekapol Chuangsuwanich, James R. Glass, Dong Yu:
Prediction-adaptation-correction recurrent neural networks for low-resource language speech recognition. ICASSP 2016: 5415-5419 - [c95]Yanmin Qian, Tian Tan, Dong Yu:
An investigation into using parallel data for far-field speech recognition. ICASSP 2016: 5725-5729 - [c94]Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Liang Lu, John R. Hershey, Michael L. Seltzer, Guoguo Chen, Yu Zhang, Michael I. Mandel, Dong Yu:
Deep beamforming networks for multi-channel speech recognition. ICASSP 2016: 5745-5749 - [c93]Yu Zhang, Guoguo Chen, Dong Yu, Kaisheng Yao, Sanjeev Khudanpur, James R. Glass:
Highway long short-term memory RNNs for distant speech recognition. ICASSP 2016: 5755-5759 - [c92]Yanmin Qian, Tian Tan, Dong Yu, Yu Zhang:
Integrated adaptation with multi-factor joint-learning for far-field speech recognition. ICASSP 2016: 5770-5774 - [c91]Dong Yu, Wayne Xiong, Jasha Droppo, Andreas Stolcke, Guoli Ye, Jinyu Li, Geoffrey Zweig:
Deep Convolutional Neural Networks with Layer-Wise Context Expansion and Attention. INTERSPEECH 2016: 17-21 - [c90]Yangyang Shi, Kaisheng Yao, Hu Chen, Dong Yu, Yi-Cheng Pan, Mei-Yuh Hwang:
Recurrent Support Vector Machines For Slot Tagging In Spoken Language Understanding. HLT-NAACL 2016: 393-399 - [i5]Dong Yu, Morten Kolbæk, Zheng-Hua Tan, Jesper Jensen:
Permutation Invariant Training of Deep Models for Speaker-Independent Multi-talker Speech Separation. CoRR abs/1607.00325 (2016) - [i4]Wayne Xiong, Jasha Droppo, Xuedong Huang, Frank Seide, Mike Seltzer, Andreas Stolcke, Dong Yu, Geoffrey Zweig:
The Microsoft 2016 Conversational Speech Recognition System. CoRR abs/1609.03528 (2016) - [i3]Wayne Xiong, Jasha Droppo, Xuedong Huang, Frank Seide, Mike Seltzer, Andreas Stolcke, Dong Yu, Geoffrey Zweig:
Achieving Human Parity in Conversational Speech Recognition. CoRR abs/1610.05256 (2016)
- 2015
- [j34]Dong Yu, Kaisheng Yao, Yu Zhang:
The Computational Network Toolkit [Best of the Web]. IEEE Signal Process. Mag. 32(6): 123-126 (2015) - [j33]Grégoire Mesnil, Yann N. Dauphin, Kaisheng Yao, Yoshua Bengio, Li Deng, Dilek Hakkani-Tür, Xiaodong He, Larry P. Heck, Gökhan Tür, Dong Yu, Geoffrey Zweig:
Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding. IEEE ACM Trans. Audio Speech Lang. Process. 23(3): 530-539 (2015) - [j32]Chao Weng, Dong Yu, Michael L. Seltzer, Jasha Droppo:
Deep Neural Networks for Single-Channel Multi-Talker Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 23(10): 1670-1679 (2015) - [c89]Abdel-rahman Mohamed, Frank Seide, Dong Yu, Jasha Droppo, Andreas Stolcke, Geoffrey Zweig, Gerald Penn:
Deep bi-directional recurrent networks over spectral windows. ASRU 2015: 78-83 - [c88]Yu Zhang, Dong Yu, Michael L. Seltzer, Jasha Droppo:
Speech recognition with prediction-adaptation-correction recurrent neural networks. ICASSP 2015: 5004-5008 - [c87]Ritwik Giri, Michael L. Seltzer, Jasha Droppo, Dong Yu:
Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning. ICASSP 2015: 5014-5018 - [i2]Yu Zhang, Guoguo Chen, Dong Yu, Kaisheng Yao, Sanjeev Khudanpur, James R. Glass:
Highway Long Short-Term Memory RNNs for Distant Speech Recognition. CoRR abs/1510.08983 (2015) - [i1]Yu Zhang, Ekapol Chuangsuwanich, James R. Glass, Dong Yu:
Prediction-Adaptation-Correction Recurrent Neural Networks for Low-Resource Language Speech Recognition. CoRR abs/1510.08985 (2015)
- 2014
- [j31]Li Deng, Dong Yu:
Deep Learning: Methods and Applications. Found. Trends Signal Process. 7(3-4): 197-387 (2014) - [j30]Kaisheng Yao, Dong Yu, Li Deng, Yifan Gong:
A fast maximum likelihood nonlinear feature transformation method for GMM-HMM speaker adaptation. Neurocomputing 128: 145-152 (2014) - [j29]Ossama Abdel-Hamid, Abdel-rahman Mohamed, Hui Jiang, Li Deng, Gerald Penn, Dong Yu:
Convolutional Neural Networks for Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 22(10): 1533-1545 (2014) - [c86]Frank Seide, Hao Fu, Jasha Droppo, Gang Li, Dong Yu:
On parallelizability of stochastic gradient descent for speech DNNS. ICASSP 2014: 235-239 - [c85]Kaisheng Yao, Baolin Peng, Geoffrey Zweig, Dong Yu, Xiaolong Li, Feng Gao:
Recurrent conditional random field for language understanding. ICASSP 2014: 4077-4081 - [c84]Nicolas Boulanger-Lewandowski, Jasha Droppo, Mike Seltzer, Dong Yu:
Phone sequence modeling with recurrent neural networks. ICASSP 2014: 5417-5421 - [c83]Chao Weng, Dong Yu, Shinji Watanabe, Biing-Hwang Fred Juang:
Recurrent deep neural networks for robust speech recognition. ICASSP 2014: 5532-5536 - [c82]Chao Weng, Dong Yu, Michael L. Seltzer, Jasha Droppo:
Single-channel mixed speech recognition using deep neural networks. ICASSP 2014: 5632-5636 - [c81]Jian Xue, Jinyu Li, Dong Yu, Mike Seltzer, Yifan Gong:
Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network. ICASSP 2014: 6359-6363 - [c80]Kun Han, Dong Yu, Ivan Tashev:
Speech emotion recognition using deep neural network and extreme learning machine. INTERSPEECH 2014: 223-227 - [c79]Frank Seide, Hao Fu, Jasha Droppo, Gang Li, Dong Yu:
1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs. INTERSPEECH 2014: 1058-1062 - [c78]Yan Huang, Dong Yu, Chaojun Liu, Yifan Gong:
A comparative analytic study on the Gaussian mixture and context dependent deep neural network hidden Markov models. INTERSPEECH 2014: 1895-1899 - [c77]Yan Huang, Dong Yu, Chaojun Liu, Yifan Gong:
Multi-accent deep neural network acoustic model with accent-specific top layer using the KLD-regularized model adaptation. INTERSPEECH 2014: 2977-2981 - [c76]Dong Yu, Adam Eversole, Michael L. Seltzer, Kaisheng Yao, Brian Guenter, Oleksii Kuchaiev, Frank Seide, Huaming Wang, Jasha Droppo, Zhiheng Huang, Geoffrey Zweig, Christopher J. Rossbach, Jon Currey:
An introduction to computational networks and the computational network toolkit (invited talk). INTERSPEECH 2014 - [c75]Kaisheng Yao, Baolin Peng, Yu Zhang, Dong Yu, Geoffrey Zweig, Yangyang Shi:
Spoken language understanding using long short-term memory neural networks. SLT 2014: 189-194 - 2013
- [j28]Sabato Marco Siniscalchi, Dong Yu, Li Deng, Chin-Hui Lee:
Exploiting deep neural networks for detection-based speech recognition. Neurocomputing 106: 148-157 (2013) - [j27]Brian Hutchinson, Li Deng, Dong Yu:
Tensor Deep Stacking Networks. IEEE Trans. Pattern Anal. Mach. Intell. 35(8): 1944-1957 (2013) - [j26]Sabato Marco Siniscalchi, Dong Yu, Li Deng, Chin-Hui Lee:
Speech Recognition Using Long-Span Temporal Patterns in a Deep Network Model. IEEE Signal Process. Lett. 20(3): 201-204 (2013) - [j25]Dong Yu, Li Deng, Frank Seide:
The Deep Tensor Neural Network With Applications to Large Vocabulary Speech Recognition. IEEE Trans. Speech Audio Process. 21(2): 388-396 (2013) - [j24]Zhen-Hua Ling, Li Deng, Dong Yu:
Modeling Spectral Envelopes Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis. IEEE Trans. Speech Audio Process. 21(10): 2129-2139 (2013) - [c74]George E. Dahl, Jack W. Stokes, Li Deng, Dong Yu:
Large-scale malware classification using random projections and neural networks. ICASSP 2013: 3422-3426 - [c73]Hang Su, Gang Li, Dong Yu, Frank Seide:
Error back propagation for sequence training of Context-Dependent Deep NetworkS for conversational speech transcription. ICASSP 2013: 6664-6668 - [c72]Li Deng, Ossama Abdel-Hamid, Dong Yu:
A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion. ICASSP 2013: 6669-6673 - [c71]Jui-Ting Huang, Jinyu Li, Dong Yu, Li Deng, Yifan Gong:
Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers. ICASSP 2013: 7304-7308 - [c70]Michael L. Seltzer, Dong Yu, Yongqiang Wang:
An investigation of deep neural networks for noise robust speech recognition. ICASSP 2013: 7398-7402 - [c69]Zhen-Hua Ling, Li Deng, Dong Yu:
Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis. ICASSP 2013: 7825-7829 - [c68]Dong Yu, Kaisheng Yao, Hang Su, Gang Li, Frank Seide:
KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition. ICASSP 2013: 7893-7897 - [c67]Li Deng, Jinyu Li, Jui-Ting Huang, Kaisheng Yao, Dong Yu, Frank Seide, Michael L. Seltzer, Geoffrey Zweig, Xiaodong He, Jason D. Williams, Yifan Gong, Alex Acero:
Recent advances in deep learning for speech research at Microsoft. ICASSP 2013: 8604-8608 - [c66]Ossama Abdel-Hamid, Li Deng, Dong Yu, Hui Jiang:
Deep segmental neural networks for speech recognition. INTERSPEECH 2013: 1849-1853 - [c65]Yan Huang, Dong Yu, Yifan Gong, Chaojun Liu:
Semi-supervised GMM and DNN acoustic model training with multi-system combination and confidence re-calibration. INTERSPEECH 2013: 2360-2364 - [c64]Kaisheng Yao, Geoffrey Zweig, Mei-Yuh Hwang, Yangyang Shi, Dong Yu:
Recurrent neural networks for language understanding. INTERSPEECH 2013: 2524-2528 - [c63]Ossama Abdel-Hamid, Li Deng, Dong Yu:
Exploring convolutional neural network structures and optimization techniques for speech recognition. INTERSPEECH 2013: 3366-3370 - [c62]Dong Yu, Michael L. Seltzer, Jinyu Li, Jui-Ting Huang, Frank Seide:
Feature Learning in Deep Neural Networks - A Study on Speech Recognition Tasks. ICLR 2013 - 2012
- [j23]Dong Yu, Li Deng:
Efficient and effective algorithms for training single-hidden-layer neural networks. Pattern Recognit. Lett. 33(5): 554-558 (2012) - [j22]Dong Yu, Geoffrey E. Hinton, Nelson Morgan, Jen-Tzung Chien, Shigeki Sagayama:
Introduction to the Special Section on Deep Learning for Speech and Language Processing. IEEE Trans. Speech Audio Process. 20(1): 4-6 (2012) - [j21]George E. Dahl, Dong Yu, Li Deng, Alex Acero:
Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition. IEEE Trans. Speech Audio Process. 20(1): 30-42 (2012) - [c61]Li Deng, Dong Yu, John C. Platt:
Scalable stacking and learning for building deep architectures. ICASSP 2012: 2133-2136 - [c60]Dong Yu, Sabato Marco Siniscalchi, Li Deng, Chin-Hui Lee:
Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition. ICASSP 2012: 4169-4172 - [c59]Dong Yu, Frank Seide, Gang Li, Li Deng:
Exploiting sparseness in deep neural networks for large vocabulary speech recognition. ICASSP 2012: 4409-4412 - [c58]Brian Hutchinson, Li Deng, Dong Yu:
A deep architecture with bilinear modeling of hidden representations: Applications to phonetic recognition. ICASSP 2012: 4805-4808 - [c57]Dong Yu, Frank Seide, Gang Li:
Conversational Speech Transcription Using Context-Dependent Deep Neural Networks. ICML 2012 - [c56]Dong Yu, Li Deng, Frank Seide:
Large Vocabulary Speech Recognition Using Deep Tensor Neural Networks. INTERSPEECH 2012: 6-9 - [c55]Xie Chen, Adam Eversole, Gang Li, Dong Yu, Frank Seide:
Pipelined Back-Propagation for Context-Dependent Deep Neural Networks. INTERSPEECH 2012: 26-29 - [c54]Li Deng, Brian Hutchinson, Dong Yu:
Parallel Training for Deep Stacking Networks. INTERSPEECH 2012: 2598-2601 - [c53]Jinyu Li, Dong Yu, Jui-Ting Huang, Yifan Gong:
Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM. SLT 2012: 131-136 - [c52]Gang Li, Huifeng Zhu, Gong Cheng, Kit Thambiratnam, Behrooz Chitsaz, Dong Yu, Frank Seide:
Context-dependent Deep Neural Networks for audio indexing of real-life data. SLT 2012: 143-148 - [c51]Kaisheng Yao, Dong Yu, Frank Seide, Hang Su, Li Deng, Yifan Gong:
Adaptation of context-dependent deep neural networks for automatic speech recognition. SLT 2012: 366-369 - 2011
- [j20]Dong Yu, Li Deng:
Deep Learning and Its Applications to Signal and Information Processing [Exploratory DSP]. IEEE Signal Process. Mag. 28(1): 145-154 (2011) - [j19]Michael L. Seltzer, Yun-Cheng Ju, Ivan Tashev, Ye-Yi Wang, Dong Yu:
In-Car Media Search. IEEE Signal Process. Mag. 28(4): 50-60 (2011) - [j18]Dong Yu, Jinyu Li, Li Deng:
Calibration of Confidence Measures in Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 19(8): 2461-2473 (2011) - [c50]Frank Seide, Gang Li, Xie Chen, Dong Yu:
Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription. ASRU 2011: 24-29 - [c49]George E. Dahl, Dong Yu, Li Deng, Alex Acero:
Large vocabulary continuous speech recognition with context-dependent DBN-HMMS. ICASSP 2011: 4688-4691 - [c48]Dong Yu, Michael L. Seltzer:
Improved Bottleneck Features Using Pretrained Deep Neural Networks. INTERSPEECH 2011: 237-240 - [c47]Frank Seide, Gang Li, Dong Yu:
Conversational Speech Transcription Using Context-Dependent Deep Neural Networks. INTERSPEECH 2011: 437-440 - [c46]Dong Yu, Li Deng:
Accelerated Parallelizable Neural Network Learning Algorithm for Speech Recognition. INTERSPEECH 2011: 2281-2284 - [c45]Dong Yu, Li Deng:
Deep Convex Net: A Scalable Architecture for Speech Pattern Classification. INTERSPEECH 2011: 2285-2288 - 2010
- [j17]Dong Yu, Balakrishnan Varadarajan, Li Deng, Alex Acero:
Active learning and semi-supervised learning for speech recognition: A unified framework using the global entropy reduction maximization criterion. Comput. Speech Lang. 24(3): 433-444 (2010) - [j16]Dong Yu, Shizhen Wang, Li Deng:
Sequential Labeling Using Deep-Structured Conditional Random Fields. IEEE J. Sel. Top. Signal Process. 4(6): 965-973 (2010) - [c44]Dong Yu, Shizhen Wang, Jinyu Li, Li Deng:
Word confidence calibration using a maximum entropy model with constraints on confidence and word distributions. ICASSP 2010: 4446-4449 - [c43]Dong Yu, Li Deng:
Semantic confidence calibration for spoken dialog applications. ICASSP 2010: 4450-4453 - [c42]Dong Yu, Shizhen Wang, Zahi N. Karam, Li Deng:
Language recognition using deep-structured conditional random fields. ICASSP 2010: 5030-5033 - [c41]Jinyu Li, Dong Yu, Yifan Gong, Li Deng:
Unscented transform with online distortion estimation for HMM adaptation. INTERSPEECH 2010: 1660-1663 - [c40]Li Deng, Michael L. Seltzer, Dong Yu, Alex Acero, Abdel-rahman Mohamed, Geoffrey E. Hinton:
Binary coding of speech spectrograms using a deep auto-encoder. INTERSPEECH 2010: 1692-1695 - [c39]Abdel-rahman Mohamed, Dong Yu, Li Deng:
Investigation of full-sequence training of deep belief networks for speech recognition. INTERSPEECH 2010: 2846-2849 - [c38]Dong Yu, Li Deng:
Deep-structured hidden conditional random fields for phonetic recognition. INTERSPEECH 2010: 2986-2989
2000 – 2009
- 2009
- [j15]Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alex Acero:
A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions. Comput. Speech Lang. 23(3): 389-405 (2009) - [j14]Dong Yu, Li Deng, Alex Acero:
Using continuous features in the maximum entropy model. Pattern Recognit. Lett. 30(14): 1295-1300 (2009) - [j13]Dong Yu, Li Deng:
Solving nonlinear estimation problems using splines [Lecture Notes]. IEEE Signal Process. Mag. 26(4): 86-90 (2009) - [j12]Dong Yu, Li Deng, Yifan Gong, Alex Acero:
A Novel Framework and Training Algorithm for Variable-Parameter Hidden Markov Models. IEEE Trans. Speech Audio Process. 17(7): 1348-1360 (2009) - [c37]Dong Yu, Li Deng, Peng Liu, Jian Wu, Yifan Gong, Alex Acero:
Cross-lingual speech recognition under runtime resource constraints. ICASSP 2009: 4193-4196 - [c36]Hui Lin, Li Deng, Dong Yu, Yifan Gong, Alex Acero, Chin-Hui Lee:
A study on multilingual acoustic modeling for large vocabulary ASR. ICASSP 2009: 4333-4336 - [c35]Oriol Vinyals, Li Deng, Dong Yu, Alex Acero:
Discriminative pronounciation learning using phonetic decoder and minimum-classification-error criterion. ICASSP 2009: 4445-4448 - [c34]Balakrishnan Varadarajan, Dong Yu, Li Deng, Alex Acero:
Using collective information in semi-supervised learning for speech recognition. ICASSP 2009: 4633-4636 - [c33]Balakrishnan Varadarajan, Dong Yu, Li Deng, Alex Acero:
Maximizing global entropy reduction for active learning in speech recognition. ICASSP 2009: 4721-4724 - [c32]Dong Yu, Li Deng, Alex Acero:
Hidden conditional random field with distribution constraints for phone classification. INTERSPEECH 2009: 676-679 - 2008
- [j11]Dong Yu, Li Deng, Xiaodong He, Alex Acero:
Large-margin minimum classification error training: A theoretical risk minimization perspective. Comput. Speech Lang. 22(4): 415-429 (2008) - [j10]Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Alex Acero:
An introduction to voice search. IEEE Signal Process. Mag. 25(3): 28-38 (2008) - [j9]Dong Yu, Li Deng, Jasha Droppo, Jian Wu, Yifan Gong, Alex Acero:
Robust Speech Recognition Using a Cepstral Minimum-Mean-Square-Error-Motivated Noise Suppressor. IEEE Trans. Speech Audio Process. 16(5): 1061-1070 (2008) - [j8]Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alex Acero:
An Integrative and Discriminative Technique for Spoken Utterance Classification. IEEE Trans. Speech Audio Process. 16(6): 1207-1214 (2008) - [c31]Dong Yu, Li Deng, Jasha Droppo, Jian Wu, Yifan Gong, Alex Acero:
A minimum-mean-square-error noise reduction algorithm on Mel-frequency cepstra for robust speech recognition. ICASSP 2008: 4041-4044 - [c30]Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alex Acero:
HMM adaptation using a phase-sensitive acoustic distortion model for environment-robust speech recognition. ICASSP 2008: 4069-4072 - [c29]Jinyu Li, Li Deng, Dong Yu, Jian Wu, Yifan Gong, Alex Acero:
Adaptation of compressed HMM parameters for resource-constrained speech recognition. ICASSP 2008: 4333-4336 - [c28]Dong Yu, Li Deng, Yifan Gong, Alex Acero:
Discriminative training of variable-parameter HMMs for noise robust speech recognition. INTERSPEECH 2008: 285-288 - [c27]Dong Yu, Li Deng, Yifan Gong, Alex Acero:
Parameter clustering and sharing in variable-parameter HMMs for noise robust speech recognition. INTERSPEECH 2008: 1253-1256 - [c26]Dong Yu, Li Deng, Jian Wu, Yifan Gong, Alex Acero:
Improvements on Mel-Frequency Cepstrum Minimum-Mean-Square-Error Noise Suppressor for Robust Speech Recognition. ISCSLP 2008: 69-72 - 2007
- [j7]Dong Yu, Deborah A. Frincke:
Improving the quality of alerts and predicting intruder's next goal with Hidden Colored Petri-Net. Comput. Networks 51(3): 632-654 (2007) - [j6]Dong Yu, Li Deng, Alex Acero:
Speaker-adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation. Comput. Speech Lang. 21(1): 72-87 (2007) - [c25]Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alex Acero:
High-performance hmm adaptation with joint compensation of additive and convolutive distortions via Vector Taylor Series. ASRU 2007: 65-70 - [c24]Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alex Acero:
A Discriminative Training Framework using N-Best Speech Recognition Transcriptions and Scores for Spoken Utterance Classification. ICASSP (4) 2007: 5-8 - [c23]Li Deng, Dong Yu:
Use of Differential Cepstra as Acoustic Features in Hidden Trajectory Modeling for Phonetic Recognition. ICASSP (4) 2007: 445-448 - [c22]Dong Yu, Li Deng, Xiaodong He, Alex Acero:
Large-Margin Minimum Classification Error Training for Large-Scale Speech Recognition Tasks. ICASSP (4) 2007: 1137-1140 - [c21]J. Sherwani, Dong Yu, Tim Paek, Mary Czerwinski, Yun-Cheng Ju, Alex Acero:
Voicepedia: towards speech-based access to unstructured information. INTERSPEECH 2007: 146-149 - [c20]Dong Yu, Li Deng, Alex Acero:
Handling phonetic context and speaker variation in a structure-based speech recognizer. INTERSPEECH 2007: 906-909 - [c19]Dong Yu, Yun-Cheng Ju, Ye-Yi Wang, Geoffrey Zweig, Alex Acero:
Automated directory assistance system - from theory to practice. INTERSPEECH 2007: 2709-2712 - [c18]Geoffrey Zweig, Patrick Nguyen, Yun-Cheng Ju, Ye-Yi Wang, Dong Yu, Alex Acero:
The voice-rate dialog system for consumer ratings. INTERSPEECH 2007: 2713-2716 - [c17]Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, Geoffrey Zweig, Alex Acero:
Confidence measures for voice search applications. INTERSPEECH 2007: 2721-2724 - [c16]Geoffrey Zweig, Yun-Cheng Ju, Patrick Nguyen, Dong Yu, Ye-Yi Wang, Alex Acero:
Voice-Rate: A Dialog System for Consumer Ratings. HLT-NAACL (Demonstrations) 2007: 31-32 - [c15]Dong Yu, Li Deng:
Large-Margin Discriminative Training of Hidden Markov Models for Speech Recognition. ICSC 2007: 429-438 - [c14]Ivan Tashev, Michael L. Seltzer, Yun-Cheng Ju, Dong Yu, Alex Acero:
Commute UX: Telephone Dialog System for Location-based Services. SIGdial 2007: 87-94 - 2006
- [j5]Dong Yu, Li Deng, Alex Acero:
A lattice search technique for a long-contextual-span hidden trajectory model of speech. Speech Commun. 48(9): 1214-1226 (2006) - [j4]Li Deng, Dong Yu, Alex Acero:
A bidirectional target-filtering model of speech coarticulation and reduction: two-stage implementation for phonetic recognition. IEEE Trans. Speech Audio Process. 14(1): 256-265 (2006) - [j3]Li Deng, Dong Yu, Alex Acero:
Structured speech modeling. IEEE Trans. Speech Audio Process. 14(5): 1492-1504 (2006) - [c13]Dong Yu, Yun-Cheng Ju, Ye-Yi Wang, Alex Acero:
N-Gram Based Filler Model for Robust Grammar Authoring. ICASSP (1) 2006: 565-568 - [c12]Xiaolong Li, Li Deng, Dong Yu, Alex Acero:
A time-synchronous phonetic decoder for a long-contextual-Span hidden trajectory model. INTERSPEECH 2006 - [c11]Dong Yu, Li Deng, Xiaodong He, Alex Acero:
Use of incrementally regulated discriminative margins in MCE training for speech recognition. INTERSPEECH 2006 - [c10]Dong Yu, Yun-Cheng Ju, Alex Acero:
An effective and efficient utterance verification technology using word n-gram filler models. INTERSPEECH 2006 - 2005
- [j2]Dong Yu, Alex Acero:
Semiautomatic Improvements of System-Initiative Spoken Dialog Applications Using Interactive Clustering. IEEE Trans. Speech Audio Process. 13(5-1): 661-671 (2005) - [j1]Li Deng, Dong Yu:
A Speech-Centric Perspective for Human-Computer Interface: A Case Study. J. VLSI Signal Process. 41(3): 255-269 (2005) - [c9]Dong Yu, Deborah A. Frincke:
Alert confidence fusion in intrusion detection systems with extended Dempster-Shafer theory. ACM Southeast Regional Conference (2) 2005: 142-147 - [c8]Li Deng, Xiang Li, Dong Yu, Alex Acero:
A Hidden Trajectory Model with Bi-directional Target-Filtering: Cascaded vs. Integrated Implementation for Phonetic Recognition. ICASSP (1) 2005: 337-340 - [c7]Dong Yu, Milind Mahajan, Peter Mau, Alex Acero:
Maximum Entropy Based Generic Filter for Language Model Adaptation. ICASSP (1) 2005: 597-600 - [c6]Dong Yu, Li Deng, Alex Acero:
Evaluation of a long-contextual-Span hidden trajectory model and phonetic recognizer using a* lattice search. INTERSPEECH 2005: 553-556 - [c5]Li Deng, Dong Yu, Alex Acero:
Learning statistically characterized resonance targets in a hidden trajectory model of speech coarticulation and reduction. INTERSPEECH 2005: 1097-1100 - 2004
- [c4]Dong Yu, Deborah A. Frincke:
A Novel Framework for Alert Correlation and Understanding. ACNS 2004: 452-466 - [c3]Dong Yu, Deborah A. Frincke:
Towards Survivable Intrusion Detection System. HICSS 2004 - [c2]Dong Yu, Mei-Yuh Hwang, Peter Mau, Alex Acero, Li Deng:
Unsupervised learning from users' error correction in speech dictation. INTERSPEECH 2004: 1969-1972 - 2003
- [c1]Dong Yu, Kuansan Wang, Milind Mahajan, Peter Mau, Alex Acero:
Improved name recognition with user modeling. INTERSPEECH 2003: 1229-1232
last updated on 2024-12-13 20:04 CET by the dblp team
all metadata released as open data under CC0 1.0 license