Haizhou Li 0001
Person information
- unicode name: 李海洲
- affiliation: Chinese University of Hong Kong (Shenzhen), China
- affiliation: National University of Singapore, Department of Electrical and Computer Engineering, Singapore
- affiliation (2006 - 2016): Nanyang Technological University, Singapore
- affiliation (2003 - 2016): Institute for Infocomm Research, A*STAR, Singapore
- affiliation (2011): University of New South Wales, Sydney, Australia
- affiliation (2009): University of Eastern Finland, Kuopio, Finland
- affiliation (PhD 1990): South China University of Technology, Guangzhou, China
Other persons with the same name
- Haizhou Li 0002 — Blaise Pascal University, Clermont-Ferrand, France
- Haizhou Li 0003 — City University of Hong Kong, Department of Computer Science, Hong Kong
2020 – today
- 2024
- [j184]Qianhui Liu, Meng Ge, Haizhou Li:
Intelligent event-based lip reading word classification with spiking neural networks using spatio-temporal attention features and triplet loss. Inf. Sci. 675: 120660 (2024) - [j183]Jiaqi Yan, Qianhui Liu, Malu Zhang, Lang Feng, De Ma, Haizhou Li, Gang Pan:
Efficient spiking neural network design via neural architecture search. Neural Networks 173: 106172 (2024) - [j182]Xinyi Chen, Qu Yang, Jibin Wu, Haizhou Li, Kay Chen Tan:
A Hybrid Neural Coding Approach for Pattern Recognition With Spiking Neural Networks. IEEE Trans. Pattern Anal. Mach. Intell. 46(5): 3064-3078 (2024) - [j181]Shuai Wang, Zhengyang Chen, Bing Han, Hongji Wang, Chengdong Liang, Binbin Zhang, Xu Xiang, Wen Ding, Johan Rohdin, Anna Silnova, Yanmin Qian, Haizhou Li:
Advancing speaker embedding learning: Wespeaker toolkit for research and production. Speech Commun. 162: 103104 (2024) - [j180]Jingru Lin, Meng Ge, Wupeng Wang, Haizhou Li, Mengling Feng:
Selective HuBERT: Self-Supervised Pre-Training for Target Speaker in Clean and Mixture Speech. IEEE Signal Process. Lett. 31: 1014-1018 (2024) - [j179]Duo Ma, Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li:
Text-Guided HuBERT: Self-Supervised Speech Pre-Training via Generative Adversarial Networks. IEEE Signal Process. Lett. 31: 2055-2059 (2024) - [j178]Xiaoxue Gao, Zexin Li, Yiming Chen, Cong Liu, Haizhou Li:
Transferable Adversarial Attacks Against ASR. IEEE Signal Process. Lett. 31: 2200-2204 (2024) - [j177]Qu Yang, Malu Zhang, Jibin Wu, Kay Chen Tan, Haizhou Li:
LC-TTFS: Toward Lossless Network Conversion for Spiking Neural Networks With TTFS Coding. IEEE Trans. Cogn. Dev. Syst. 16(5): 1626-1639 (2024) - [j176]Siqi Cai, Ran Zhang, Malu Zhang, Jibin Wu, Haizhou Li:
EEG-Based Auditory Attention Detection With Spiking Graph Convolutional Network. IEEE Trans. Cogn. Dev. Syst. 16(5): 1698-1706 (2024) - [j175]Koichiro Yoshino, Yun-Nung Chen, Paul A. Crook, Satwik Kottur, Jinchao Li, Behnam Hedayatnia, Seungwhan Moon, Zhengcong Fei, Zekang Li, Jinchao Zhang, Yang Feng, Jie Zhou, Seokhwan Kim, Yang Liu, Di Jin, Alexandros Papangelis, Karthik Gopalakrishnan, Dilek Hakkani-Tur, Babak Damavandi, Alborz Geramifard, Chiori Hori, Ankit Shah, Chen Zhang, Haizhou Li, João Sedoc, Luis F. D'Haro, Rafael E. Banchs, Alexander Rudnicky:
Overview of the Tenth Dialog System Technology Challenge: DSTC10. IEEE ACM Trans. Audio Speech Lang. Process. 32: 765-778 (2024) - [j174]Lei Liu, Li Liu, Haizhou Li:
Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1559-1572 (2024) - [j173]Xuehao Zhou, Mingyang Zhang, Yi Zhou, Zhizheng Wu, Haizhou Li:
Accented Text-to-Speech Synthesis With Limited Data. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1699-1711 (2024) - [j172]Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li:
Controllable Accented Text-to-Speech Synthesis With Fine and Coarse-Grained Intensity Rendering. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2188-2201 (2024) - [j171]Tianchi Liu, Kong Aik Lee, Qiongqiong Wang, Haizhou Li:
Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2324-2337 (2024) - [j170]Congcong Sun, Hui Tian, Peng Tian, Haizhou Li, Zhenxing Qian:
Multi-Agent Deep Learning for the Detection of Multiple Speech Steganography Methods. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2957-2972 (2024) - [j169]Mingyang Zhang, Yi Zhou, Yi Ren, Chen Zhang, Xiang Yin, Haizhou Li:
RefXVC: Cross-Lingual Voice Conversion With Enhanced Reference Leveraging. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4146-4156 (2024) - [j168]Wupeng Wang, Zexu Pan, Xinke Li, Shuai Wang, Haizhou Li:
Speech Separation With Pretrained Frontend to Minimize Domain Mismatch. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4184-4198 (2024) - [j167]Zexu Pan, Marvin Borsdorf, Siqi Cai, Tanja Schultz, Haizhou Li:
NeuroHeed: Neuro-Steered Speaker Extraction Using EEG Signals. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4456-4470 (2024) - [j166]Yicheng Gu, Xueyao Zhang, Liumeng Xue, Haizhou Li, Zhizheng Wu:
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoders. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4569-4579 (2024) - [j165]Siqi Cai, Tanja Schultz, Haizhou Li:
Brain Topology Modeling With EEG-Graphs for Auditory Spatial Attention Detection. IEEE Trans. Biomed. Eng. 71(1): 171-182 (2024) - [j164]Miao Liu, Jing Wang, Xinyuan Qian, Haizhou Li:
Audio-Visual Temporal Forgery Detection Using Embedding-Level Fusion and Multi-Dimensional Contrastive Loss. IEEE Trans. Circuits Syst. Video Technol. 34(8): 6937-6948 (2024) - [j163]Zhenyu Weng, Huiping Zhuang, Fulin Luo, Haizhou Li, Zhiping Lin:
Few-Shot Contrastive Transfer Learning With Pretrained Model for Masked Face Verification. IEEE Trans. Multim. 26: 3871-3883 (2024) - [j162]Xinyuan Qian, Wei Xue, Qiquan Zhang, Ruijie Tao, Haizhou Li:
Deep Cross-Modal Retrieval Between Spatial Image and Acoustic Speech. IEEE Trans. Multim. 26: 4480-4489 (2024) - [j161]Ruihang Ji, Shuzhi Sam Ge, Kai Zhao, Haizhou Li:
Event-Triggered Tracking Control for Nonlinear Systems With Prescribed Performance. IEEE Trans. Syst. Man Cybern. Syst. 54(6): 3547-3557 (2024) - [c718]Shimin Zhang, Qu Yang, Chenxiang Ma, Jibin Wu, Haizhou Li, Kay Chen Tan:
TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential Modelling. AAAI 2024: 16838-16847 - [c717]Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li:
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling. AAAI 2024: 18698-18706 - [c716]Jiadong Wang, Zexu Pan, Malu Zhang, Robby T. Tan, Haizhou Li:
Restoring Speaking Lips from Occlusion for Audio-Visual Speech Recognition. AAAI 2024: 19144-19152 - [c715]Chen Zhang, Luis Fernando D'Haro, Yiming Chen, Malu Zhang, Haizhou Li:
A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators. AAAI 2024: 19515-19524 - [c714]Yiming Chen, Chen Zhang, Danqing Luo, Luis Fernando D'Haro, Robby T. Tan, Haizhou Li:
Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models. ACL (Findings) 2024: 1359-1375 - [c713]Feng Jiang, Weihao Liu, Xiaomin Chu, Peifeng Li, Qiaoming Zhu, Haizhou Li:
Advancing Topic Segmentation and Outline Generation in Chinese Texts: The Paragraph-level Topic Representation, Corpus, and Benchmark. LREC/COLING 2024: 495-506 - [c712]Danqing Luo, Chen Zhang, Yan Zhang, Haizhou Li:
CrossTune: Black-Box Few-Shot Classification with Label Enhancement. LREC/COLING 2024: 4185-4197 - [c711]Yaxin Fan, Feng Jiang, Peifeng Li, Haizhou Li:
Uncovering the Potential of ChatGPT for Discourse Analysis in Dialogue: An Empirical Study. LREC/COLING 2024: 16998-17010 - [c710]Qu Yang, Qianhui Liu, Nan Li, Meng Ge, Zeyang Song, Haizhou Li:
SVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks. ICASSP 2024: 221-225 - [c709]Zeyang Song, Jibin Wu, Malu Zhang, Mike Zheng Shou, Haizhou Li:
Spiking-Leaf: A Learnable Auditory Front-End for Spiking Neural Networks. ICASSP 2024: 226-230 - [c708]Qiquan Zhang, Meng Ge, Hongxu Zhu, Eliathamby Ambikairajah, Qi Song, Zhaoheng Ni, Haizhou Li:
An Empirical Study on the Impact of Positional Encoding in Transformer-Based Monaural Speech Enhancement. ICASSP 2024: 1001-1005 - [c707]Siqi Cai, Ran Zhang, Haizhou Li:
Robust Decoding of the Auditory Attention from EEG Recordings Through Graph Convolutional Networks. ICASSP 2024: 2320-2324 - [c706]Yu Chen, Xinyuan Qian, Zexu Pan, Kainan Chen, Haizhou Li:
LOCSELECT: Target Speaker Localization with an Auditory Selective Hearing Mechanism. ICASSP 2024: 8696-8700 - [c705]Sho Inoue, Kun Zhou, Shuai Wang, Haizhou Li:
Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis. ICASSP 2024: 10601-10605 - [c704]Junjie Li, Ruijie Tao, Zexu Pan, Meng Ge, Shuai Wang, Haizhou Li:
Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-Talker Speech. ICASSP 2024: 10666-10670 - [c703]Shuai Wang, Qibing Bai, Qi Liu, Jianwei Yu, Zhengyang Chen, Bing Han, Yanmin Qian, Haizhou Li:
Leveraging in-the-wild Data for Effective Self-supervised Pretraining in Speaker Recognition. ICASSP 2024: 10901-10905 - [c702]Yidi Jiang, Zhengyang Chen, Ruijie Tao, Liqun Deng, Yanmin Qian, Haizhou Li:
Prompt-Driven Target Speech Diarization. ICASSP 2024: 11086-11090 - [c701]Yi Ma, Kong Aik Lee, Ville Hautamäki, Meng Ge, Haizhou Li:
Gradient Weighting for Speaker Verification in Extremely Low Signal-to-Noise Ratio. ICASSP 2024: 11311-11315 - [c700]Qianhui Liu, Jiaqi Yan, Malu Zhang, Gang Pan, Haizhou Li:
LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization. IJCAI 2024: 3097-3105 - [c699]Yang Wang, Haiyang Mei, Qirui Bao, Ziqi Wei, Mike Zheng Shou, Haizhou Li, Bo Dong, Xin Yang:
Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition. IJCAI 2024: 3160-3168 - [c698]Wenxuan Wu, Xueyuan Chen, Xixin Wu, Haizhou Li, Helen Meng:
Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy. IJCNN 2024: 1-8 - [c697]Xianghu Yue, Xueyi Zhang, Yiming Chen, Chengwei Zhang, Mingrui Lao, Huiping Zhuang, Xinyuan Qian, Haizhou Li:
MMAL: Multi-Modal Analytic Learning for Exemplar-Free Audio-Visual Class Incremental Tasks. ACM Multimedia 2024: 2428-2437 - [c696]Weizhi Liu, Yue Li, Dongdong Lin, Hui Tian, Haizhou Li:
GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis. ACM Multimedia 2024: 3294-3302 - [c695]Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li:
Generative Expressive Conversational Speech Synthesis. ACM Multimedia 2024: 4187-4196 - [c694]Miao Liu, Jing Wang, Xinyuan Qian, Haizhou Li:
ListenFormer: Responsive Listening Head Generation with Non-autoregressive Transformers. ACM Multimedia 2024: 7094-7103 - [c693]Ruijie Tao, Zhan Shi, Yidi Jiang, Duc-Tuan Truong, Eng Siong Chng, Massimo Alioto, Haizhou Li:
Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization. ACM Multimedia 2024: 11342-11347 - [c692]Chuang Li, Yan Zhang, Min-Yen Kan, Haizhou Li:
UNO-DST: Leveraging Unlabelled Data in Zero-Shot Dialogue State Tracking. NAACL-HLT (Findings) 2024: 2972-2983 - [c691]Xidong Wang, Guiming Chen, Dingjie Song, Zhiyi Zhang, Zhihong Chen, Qingying Xiao, Junying Chen, Feng Jiang, Jianquan Li, Xiang Wan, Benyou Wang, Haizhou Li:
CMB: A Comprehensive Medical Benchmark in Chinese. NAACL-HLT 2024: 6184-6205 - [c690]Huang Huang, Fei Yu, Jianqing Zhu, Xuening Sun, Hao Cheng, Dingjie Song, Zhihong Chen, Mosen Alharthi, Bang An, Juncai He, Ziche Liu, Junying Chen, Jianquan Li, Benyou Wang, Lian Zhang, Ruoyu Sun, Xiang Wan, Haizhou Li, Jinchao Xu:
AceGPT, Localizing Large Language Models in Arabic. NAACL-HLT 2024: 8139-8163 - [c689]Kun Zhou, Berrak Sisman, Carlos Busso, Bin Ma, Haizhou Li:
Mixed-EVC: Mixed Emotion Synthesis and Control in Voice Conversion. Odyssey 2024: 180-186 - [c688]Ganjun Liu, Xiaohui Hou, Meng Ge, Tao Zhang, Haizhou Li:
A Non-Intrusive Approach to Assessing Dysarthria Severity: Advancing Clinical Diagnosis. WWW (Companion Volume) 2024: 1134-1137 - [i211]Yi Ma, Kong Aik Lee, Ville Hautamäki, Meng Ge, Haizhou Li:
Gradient weighting for speaker verification in extremely low Signal-to-Noise Ratio. CoRR abs/2401.02626 (2024) - [i210]Feng Jiang, Kuang Wang, Haizhou Li:
Bridging Research and Readers: A Multi-Modal Automated Academic Papers Interpretation System. CoRR abs/2401.09150 (2024) - [i209]Qiquan Zhang, Meng Ge, Hongxu Zhu, Eliathamby Ambikairajah, Qi Song, Zhaoheng Ni, Haizhou Li:
An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement. CoRR abs/2401.09686 (2024) - [i208]Xianghu Yue, Xiaohai Tian, Malu Zhang, Zhizheng Wu, Haizhou Li:
CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing. CoRR abs/2401.12264 (2024) - [i207]Qianhui Liu, Jiaqi Yan, Malu Zhang, Gang Pan, Haizhou Li:
LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization. CoRR abs/2401.14652 (2024) - [i206]Lei Liu, Li Liu, Haizhou Li:
Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition. CoRR abs/2401.17604 (2024) - [i205]Wenjie Wei, Malu Zhang, Jilin Zhang, Ammar Belatreche, Jibin Wu, Zijing Xu, Xuerui Qiu, Hong Chen, Yang Yang, Haizhou Li:
Event-Driven Learning for Spiking Neural Networks. CoRR abs/2403.00270 (2024) - [i204]Sho Inoue, Kun Zhou, Shuai Wang, Haizhou Li:
Fine-Grained Quantitative Emotion Editing for Speech Generation. CoRR abs/2403.02002 (2024) - [i203]Xidong Wang, Nuo Chen, Junyin Chen, Yan Hu, Yidong Wang, Xiangbo Wu, Anningzhe Gao, Xiang Wan, Haizhou Li, Benyou Wang:
Apollo: An Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People. CoRR abs/2403.03640 (2024) - [i202]Qu Yang, Qianhui Liu, Nan Li, Meng Ge, Zeyang Song, Haizhou Li:
sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks. CoRR abs/2403.05772 (2024) - [i201]Danqing Luo, Chen Zhang, Yan Zhang, Haizhou Li:
CrossTune: Black-Box Few-Shot Classification with Label Enhancement. CoRR abs/2403.12468 (2024) - [i200]Wenxuan Wu, Xueyuan Chen, Xixin Wu, Haizhou Li, Helen Meng:
Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy. CoRR abs/2403.16078 (2024) - [i199]Yicheng Gu, Xueyao Zhang, Liumeng Xue, Haizhou Li, Zhizheng Wu:
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder. CoRR abs/2404.17161 (2024) - [i198]Ruijie Tao, Xinyuan Qian, Yidi Jiang, Junjie Li, Jiadong Wang, Haizhou Li:
Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention. CoRR abs/2404.18501 (2024) - [i197]Chuang Li, Yang Deng, Hengchang Hu, Min-Yen Kan, Haizhou Li:
Incorporating External Knowledge and Goal Guidance for LLM-based Conversational Recommender Systems. CoRR abs/2405.01868 (2024) - [i196]Sho Inoue, Kun Zhou, Shuai Wang, Haizhou Li:
Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis. CoRR abs/2405.09171 (2024) - [i195]Xiangyu Zhang, Qiquan Zhang, Hexin Liu, Tianyi Xiao, Xinyuan Qian, Beena Ahmed, Eliathamby Ambikairajah, Haizhou Li, Julien Epps:
Mamba in Speech: Towards an Alternative to Self-Attention. CoRR abs/2405.12609 (2024) - [i194]Yiming Chen, Chen Zhang, Danqing Luo, Luis Fernando D'Haro, Robby T. Tan, Haizhou Li:
Unveiling the Achilles' Heel of NLG Evaluators: A Unified Adversarial Framework Driven by Large Language Models. CoRR abs/2405.14646 (2024) - [i193]Jiahui Xu, Feng Jiang, Anningzhe Gao, Haizhou Li:
Unsupervised Mutual Learning of Dialogue Discourse Parsing and Topic Segmentation. CoRR abs/2405.19799 (2024) - [i192]Chen Zhang, Chengguang Tang, Dading Chong, Ke Shi, Guohua Tang, Feng Jiang, Haizhou Li:
TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models. CoRR abs/2405.20215 (2024) - [i191]Tianchi Liu, Lin Zhang, Rohan Kumar Das, Yi Ma, Ruijie Tao, Haizhou Li:
How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio? CoRR abs/2406.02483 (2024) - [i190]Zhijun Liu, Shuai Wang, Sho Inoue, Qibing Bai, Haizhou Li:
Autoregressive Diffusion Transformer for Text-to-Speech Synthesis. CoRR abs/2406.05551 (2024) - [i189]Yidi Jiang, Ruijie Tao, Zhengyang Chen, Yanmin Qian, Haizhou Li:
Target Speech Diarization with Multimodal Prompts. CoRR abs/2406.07198 (2024) - [i188]Xuehao Zhou, Mingyang Zhang, Yi Zhou, Zhiwu Li, Haizhou Li:
Multi-Scale Accent Modeling with Disentangling for Multi-Speaker Multi-Accent TTS Synthesis. CoRR abs/2406.10844 (2024) - [i187]Zeyang Song, Qianhui Liu, Qu Yang, Yizhou Peng, Haizhou Li:
ED-sKWS: Early-Decision Spiking Neural Networks for Rapid, and Energy-Efficient Keyword Spotting. CoRR abs/2406.12726 (2024) - [i186]Junyi Ao, Yuancheng Wang, Xiaohai Tian, Dekun Chen, Jun Zhang, Lu Lu, Yuxuan Wang, Haizhou Li, Zhizheng Wu:
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words. CoRR abs/2406.13340 (2024) - [i185]Ziche Liu, Rui Ke, Feng Jiang, Haizhou Li:
Take the essence and discard the dross: A Rethinking on Data Selection for Fine-Tuning Large Language Models. CoRR abs/2406.14115 (2024) - [i184]Jiabao Pan, Yan Zhang, Chen Zhang, Zuozhu Liu, Hongwei Wang, Haizhou Li:
DynaThink: Fast or Slow? A Dynamic Decision-Making Framework for Large Language Models. CoRR abs/2407.01009 (2024) - [i183]Rui Liu, Haolin Zuo, Zheng Lian, Xiaofen Xing, Björn W. Schuller, Haizhou Li:
Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset. CoRR abs/2407.02751 (2024) - [i182]Yang Wang, Haiyang Mei, Qirui Bao, Ziqi Wei, Mike Zheng Shou, Haizhou Li, Bo Dong, Xin Yang:
Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition. CoRR abs/2407.09521 (2024) - [i181]Weizhi Liu, Yue Li, Dongdong Lin, Hui Tian, Haizhou Li:
GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis. CoRR abs/2407.10471 (2024) - [i180]Shuai Wang, Zhengyang Chen, Kong Aik Lee, Yanmin Qian, Haizhou Li:
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning. CoRR abs/2407.15188 (2024) - [i179]Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li:
Generative Expressive Conversational Speech Synthesis. CoRR abs/2407.21491 (2024) - [i178]Qianhui Liu, Jiadong Wang, Yang Wang, Xin Yang, Gang Pan, Haizhou Li:
Human-Inspired Audio-Visual Speech Recognition: Spike Activity, Cueing Interaction and Causal Processing. CoRR abs/2408.16564 (2024) - [i177]Dashanka De Silva, Siqi Cai, Saurav Pahuja, Tanja Schultz, Haizhou Li:
NeuroSpex: Neuro-Guided Speaker Extraction with Cross-Modal Attention. CoRR abs/2409.02489 (2024) - [i176]Xinyuan Qian, Xianghu Yue, Jiadong Wang, Huiping Zhuang, Haizhou Li:
Analytic Class Incremental Learning for Sound Source Localization with Privacy Protection. CoRR abs/2409.07224 (2024) - [i175]Zhijun Liu, Shuai Wang, Pengcheng Zhu, Mengxiao Bi, Haizhou Li:
E1 TTS: Simple and Fast Non-Autoregressive TTS. CoRR abs/2409.09351 (2024) - [i174]Sho Inoue, Shuai Wang, Wanxing Wang, Pengcheng Zhu, Mengxiao Bi, Haizhou Li:
MacST: Multi-Accent Speech Synthesis via Text Transliteration for Accent Conversion. CoRR abs/2409.09352 (2024) - [i173]Junjie Li, Ke Zhang, Shuai Wang, Haizhou Li, Man-Wai Mak, Kong Aik Lee:
On the effectiveness of enrollment speech augmentation for Target Speaker Extraction. CoRR abs/2409.09589 (2024) - [i172]Chen Zhang, Dading Chong, Feng Jiang, Chengguang Tang, Anningzhe Gao, Guohua Tang, Haizhou Li:
Aligning Language Models Using Follow-up Likelihood as Reward Signal. CoRR abs/2409.13948 (2024) - [i171]Shuai Wang, Pengcheng Zhu, Haizhou Li:
M-Vec: Matryoshka Speaker Embeddings with Flexible Dimensions. CoRR abs/2409.15782 (2024) - [i170]Shuai Wang, Ke Zhang, Shaoxiong Lin, Junjie Li, Xuefei Wang, Meng Ge, Jianwei Yu, Yanmin Qian, Haizhou Li:
WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction. CoRR abs/2409.15799 (2024) - [i169]Yiming Chen, Xianghu Yue, Xiaoxue Gao, Chen Zhang, Luis Fernando D'Haro, Robby T. Tan, Haizhou Li:
Beyond Single-Audio: Advancing Multi-Audio Processing in Audio Large Language Models. CoRR abs/2409.18680 (2024)
- 2023
- [j160]Tao Luo, Weng-Fai Wong, Rick Siow Mong Goh, Anh Tuan Do, Zhixian Chen, Haizhou Li, Wenyu Jiang, Weiyun Yau:
Achieving Green AI with Energy-Efficient Deep Learning Using Neuromorphic Computing. Commun. ACM 66(7): 52-57 (2023) - [j159]Buddhi Wickramasinghe, Eliathamby Ambikairajah, Vidhyasaharan Sethu, Julien Epps, Haizhou Li, Ting Dang:
DNN controlled adaptive front-end for replay attack detection systems. Speech Commun. 154: 102973 (2023) - [j158]Tingting Wang, Zexu Pan, Meng Ge, Zhen Yang, Haizhou Li:
Time-Domain Speech Separation Networks With Graph Encoding Auxiliary. IEEE Signal Process. Lett. 30: 110-114 (2023) - [j157]Yi Zhou, Zhizheng Wu, Mingyang Zhang, Xiaohai Tian, Haizhou Li:
TTS-Guided Training for Accent Conversion Without Parallel Data. IEEE Signal Process. Lett. 30: 533-537 (2023) - [j156]Mingyang Zhang, Xuehao Zhou, Zhizheng Wu, Haizhou Li:
Towards Zero-Shot Multi-Speaker Multi-Accent Text-to-Speech Synthesis. IEEE Signal Process. Lett. 30: 947-951 (2023) - [j155]Kun Zhou, Berrak Sisman, Rajib Rana, Björn W. Schuller, Haizhou Li:
Emotion Intensity and its Control for Emotional Voice Conversion. IEEE Trans. Affect. Comput. 14(1): 31-48 (2023) - [j154]Kun Zhou, Berrak Sisman, Rajib Rana, Björn W. Schuller, Haizhou Li:
Speech Synthesis With Mixed Emotions. IEEE Trans. Affect. Comput. 14(4): 3120-3134 (2023) - [j153]Hui Tian, Yiqin Qiu, Wojciech Mazurczyk, Haizhou Li, Zhenxing Qian:
STFF-SM: Steganalysis Model Based on Spatial and Temporal Feature Fusion for Speech Streams. IEEE ACM Trans. Audio Speech Lang. Process. 31: 277-289 (2023) - [j152]Qiquan Zhang, Xinyuan Qian, Zhaoheng Ni, Aaron Nicolson, Eliathamby Ambikairajah, Haizhou Li:
A Time-Frequency Attention Module for Neural Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 31: 462-475 (2023) - [j151]Xinyuan Qian, Zhengdong Wang, Jiadong Wang, Guohui Guan, Haizhou Li:
Audio-Visual Cross-Attention Network for Robotic Speaker Tracking. IEEE ACM Trans. Audio Speech Lang. Process. 31: 550-562 (2023) - [j150]Chen Zhang, Luis Fernando D'Haro, Qiquan Zhang, Thomas Friedrichs, Haizhou Li:
PoE: A Panel of Experts for Generalized Automatic Dialogue Assessment. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1234-1250 (2023) - [j149]Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamäki, Haizhou Li:
Self-Supervised Training of Speaker Encoder With Multi-Modal Diverse Positive Pairs. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1706-1719 (2023) - [j148]Yi Zhou, Zhizheng Wu, Xiaohai Tian, Haizhou Li:
Optimization of Cross-Lingual Voice Conversion With Linguistics Losses to Reduce Foreign Accents. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1916-1926 (2023) - [j147]Xiaoxue Gao, Chitralekha Gupta, Haizhou Li:
PoLyScriber: Integrated Fine-Tuning of Extractor and Lyrics Transcriber for Polyphonic Music. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1968-1981 (2023) - [j146]Zhenyu Weng, Huiping Zhuang, Haizhou Li, Balakrishnan Ramalingam, Rajesh Elara Mohan, Zhiping Lin:
Online Multi-Face Tracking With Multi-Modality Cascaded Matching. IEEE Trans. Circuits Syst. Video Technol. 33(6): 2738-2752 (2023) - [j145]Yiqin Qiu, Hui Tian, Haizhou Li, Chin-Chen Chang, Athanasios V. Vasilakos:
Separable Convolution Network With Dual-Stream Pyramid Enhanced Strategy for Speech Steganalysis. IEEE Trans. Inf. Forensics Secur. 18: 2737-2750 (2023) - [j144]Jibin Wu, Yansong Chua, Malu Zhang, Guoqi Li, Haizhou Li, Kay Chen Tan:
A Tandem Learning Rule for Effective Training and Rapid Inference of Deep Spiking Neural Networks. IEEE Trans. Neural Networks Learn. Syst. 34(1): 446-460 (2023) - [c687]Yiming Chen, Simin Chen, Zexin Li, Wei Yang, Cong Liu, Robby T. Tan, Haizhou Li:
Dynamic Transformers Provide a False Sense of Efficiency. ACL (1) 2023: 7164-7180 - [c686]Mingyang Zhang, Yi Zhou, Zhizheng Wu, Haizhou Li:
Zero-shot multi-speaker accent TTS with limited accent data. APSIPA ASC 2023: 1931-1936 - [c685]Jiawei Du, Yidi Jiang, Vincent Y. F. Tan, Joey Tianyi Zhou, Haizhou Li:
Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation. CVPR 2023: 3749-3758 - [c684]Jiadong Wang, Xinyuan Qian, Malu Zhang, Robby T. Tan, Haizhou Li:
Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert. CVPR 2023: 14653-14662 - [c683]Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li:
ADD 2023: the Second Audio Deepfake Detection Challenge. DADA@IJCAI 2023: 125-130 - [c682]Siqi Cai, Jia Li, Hongmeng Yang, Haizhou Li:
RGCnet: An Efficient Recursive Gated Convolutional Network for EEG-based Auditory Attention Detection. EMBC 2023: 1-4 - [c681]Chen Zhang, Luis F. D'Haro, Chengguang Tang, Ke Shi, Guohua Tang, Haizhou Li:
xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark. EMNLP (Findings) 2023: 5579-5601 - [c680]Yan Zhang, Zhaopeng Feng, Zhiyang Teng, Zuozhu Liu, Haizhou Li:
How Well Do Text Embedding Models Understand Syntax? EMNLP (Findings) 2023: 9717-9728 - [c679]Hongbo Zhang, Junying Chen, Feng Jiang, Fei Yu, Zhihong Chen, Guiming Chen, Jianquan Li, Xiangbo Wu, Zhiyi Zhang, Qingying Xiao, Xiang Wan, Benyou Wang, Haizhou Li:
HuatuoGPT, Towards Taming Language Model to Be a Doctor. EMNLP (Findings) 2023: 10859-10885 - [c678]Marvin Borsdorf, Saurav Pahuja, Gabriel Ivucic, Siqi Cai, Haizhou Li, Tanja Schultz:
Multi-Head Attention and GRU for Improved Match-Mismatch Classification of Speech Stimulus and EEG Response. ICASSP 2023: 1-2 - [c677]Xiaoxue Gao, Xianghu Yue, Haizhou Li:
Self-Transriber: Few-Shot Lyrics Transcription With Self-Training. ICASSP 2023: 1-5 - [c676]Zexu Pan, Wupeng Wang, Marvin Borsdorf, Haizhou Li:
ImagineNet: Target Speaker Extraction with Intermittent Visual Cue Through Embedding Inpainting. ICASSP 2023: 1-5 - [c675]Ruijie Tao, Kong Aik Lee, Zhan Shi, Haizhou Li:
Speaker Recognition with Two-Step Multi-Modal Deep Cleansing. ICASSP 2023: 1-5 - [c674]Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li:
Token2vec: A Joint Self-Supervised Pre-Training Framework Using Unpaired Speech and Text. ICASSP 2023: 1-5 - [c673]Qiquan Zhang, Hongxu Zhu, Qi Song, Xinyuan Qian, Zhaoheng Ni, Haizhou Li:
Ripple Sparse Self-Attention for Monaural Speech Enhancement. ICASSP 2023: 1-5 - [c672]Haolin Zuo, Rui Liu, Jinming Zhao, Guanglai Gao, Haizhou Li:
Exploiting Modality-Invariant Feature for Robust Multimodal Emotion Recognition with Missing Modalities. ICASSP 2023: 1-5 - [c671]Yuke Si, Yan Zhang, Yuhang Li, Xiaobao Wang, Longbiao Wang, Jianwu Dang, Eng Siong Chng, Haizhou Li:
Local and Global Context Modeling with Relation Matching Task for Dialog Act Recognition. IJCNN 2023: 1-8 - [c670]Rui Liu, Haolin Zuo, De Hu, Guanglai Gao, Haizhou Li:
Explicit Intensity Control for Accented Text-to-speech. INTERSPEECH 2023: 22-26 - [c669]Ruicong Wang, Siqi Cai, Haizhou Li:
EEG-based Auditory Attention Detection with Spatiotemporal Graph and Graph Convolutional Network. INTERSPEECH 2023: 1144-1148 - [c668]Chutong Meng, Junyi Ao, Tom Ko, Mingxuan Wang, Haizhou Li:
CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning. INTERSPEECH 2023: 2978-2982 - [c667]Jingru Lin, Xianghu Yue, Junyi Ao, Haizhou Li:
Self-Supervised Acoustic Word Embedding Learning via Correspondence Transformer Encoder. INTERSPEECH 2023: 2988-2992 - [c666]Yidi Jiang, Ruijie Tao, Zexu Pan, Haizhou Li:
Target Active Speaker Detection with Audio-visual Cues. INTERSPEECH 2023: 3152-3156 - [c665]Ke Zhang, Marvin Borsdorf, Zexu Pan, Haizhou Li, Yangjie Wei, Yi Wang:
Speaker Extraction with Detection of Presence and Absence of Target Speakers. INTERSPEECH 2023: 3714-3718 - [c664]Qinghua Liu, Meng Ge, Zhizheng Wu, Haizhou Li:
PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network. INTERSPEECH 2023: 3719-3723 - [c663]Rui Liu, Jinhua Zhang, Guanglai Gao, Haizhou Li:
Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion. INTERSPEECH 2023: 3999-4003 - [c662]Junchen Lu, Berrak Sisman, Mingyang Zhang, Haizhou Li:
High-Quality Automatic Voice Over with Accurate Alignment: Supervision through Self-Supervised Discrete Speech Units. INTERSPEECH 2023: 5536-5540 - [c661]Chuang Li, Hengchang Hu, Yan Zhang, Min-Yen Kan, Haizhou Li:
A Conversation is Worth A Thousand Recommendations: A Survey of Holistic Conversational Recommendation Systems. KaRS@RecSys 2023: 7-20 - [c660]Xueyi Zhang, Chengwei Zhang, Tao Wang, Jun Tang, Songyang Lao, Haizhou Li:
Slow-Fast Time Parameter Aggregation Network for Class-Incremental Lip Reading. ACM Multimedia 2023: 747-756 - [c659]Saurav Pahuja, Siqi Cai, Tanja Schultz, Haizhou Li:
XAnet: Cross-Attention Between EEG of Left and Right Brain for Auditory Attention Decoding. NER 2023: 1-4 - [c658]Tianchi Liu, Kong Aik Lee, Qiongqiong Wang, Haizhou Li:
Disentangling Voice and Content with Self-Supervision for Speaker Recognition. NeurIPS 2023 - [c657]Yaxin Fan, Feng Jiang, Peifeng Li, Haizhou Li:
GrammarGPT: Exploring Open-Source LLMs for Native Chinese Grammatical Error Correction with Supervised Fine-Tuning. NLPCC (3) 2023: 69-80 - [c656]Bin Wang, Haizhou Li:
Relational Sentence Embedding for Flexible Semantic Matching. RepL4NLP@ACL 2023: 238-252 - [c655]Saurav Pahuja, Gabriel Ivucic, Felix Putze, Siqi Cai, Haizhou Li, Tanja Schultz:
Enhancing Subject-Independent EEG-Based Auditory Attention Decoding with WGAN and Pearson Correlation Coefficient. SMC 2023: 3715-3720 - [e24]Jianhua Tao, Haizhou Li, Jiangyan Yi, Cunhang Fan:
Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32th International Joint Conference on Artificial Intelligence (IJCAI 2023), Macao, China, August 19, 2023. CEUR Workshop Proceedings 3597, CEUR-WS.org 2023 - [i168]Jiadong Wang, Xinyuan Qian, Malu Zhang, Robby T. Tan, Haizhou Li:
Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert. CoRR abs/2303.17480 (2023) - [i167]Zhihong Chen, Feng Jiang, Junying Chen, Tiannan Wang, Fei Yu, Guiming Chen, Hongbo Zhang, Juhao Liang, Chen Zhang, Zhiyi Zhang, Jianquan Li, Xiang Wan, Benyou Wang, Haizhou Li:
Phoenix: Democratizing ChatGPT across Languages. CoRR abs/2304.10453 (2023) - [i166]Xuehao Zhou, Mingyang Zhang, Yi Zhou, Zhizheng Wu, Haizhou Li:
Accented Text-to-Speech Synthesis with Limited Data. CoRR abs/2305.04816 (2023) - [i165]Qiquan Zhang, Hongxu Zhu, Qi Song, Xinyuan Qian, Zhaoheng Ni, Haizhou Li:
Ripple sparse self-attention for monaural speech enhancement. CoRR abs/2305.08541 (2023) - [i164]Yiming Chen, Simin Chen, Zexin Li, Wei Yang, Cong Liu, Robby T. Tan, Haizhou Li:
Dynamic Transformers Provide a False Sense of Efficiency. CoRR abs/2305.12228 (2023) - [i163]Yidi Jiang, Ruijie Tao, Zexu Pan, Haizhou Li:
Target Active Speaker Detection with Audio-visual Cues. CoRR abs/2305.12831 (2023) - [i162]Feng Jiang, Longwang He, Peifeng Li, Qiaoming Zhu, Haizhou Li:
Topic-driven Distant Supervision Framework for Macro-level Discourse Parsing. CoRR abs/2305.13755 (2023) - [i161]Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li:
ADD 2023: the Second Audio Deepfake Detection Challenge. CoRR abs/2305.13774 (2023) - [i160]Danqing Luo, Chen Zhang, Jiahui Xu, Bin Wang, Yiming Chen, Yan Zhang, Haizhou Li:
Enhancing Black-Box Few-Shot Text Classification with Prompt-Based Data Augmentation. CoRR abs/2305.13785 (2023) - [i159]Feng Jiang, Weihao Liu, Xiaomin Chu, Peifeng Li, Qiaoming Zhu, Haizhou Li:
Advancing Topic Segmentation and Outline Generation in Chinese Texts: The Paragraph-level Topic Representation, Corpus, and Benchmark. CoRR abs/2305.14790 (2023) - [i158]Hongbo Zhang, Junying Chen, Feng Jiang, Fei Yu, Zhihong Chen, Jianquan Li, Guiming Chen, Xiangbo Wu, Zhiyi Zhang, Qingying Xiao, Xiang Wan, Benyou Wang, Haizhou Li:
HuatuoGPT, towards Taming Language Model to Be a Doctor. CoRR abs/2305.15075 (2023) - [i157]Rui Liu, Jinhua Zhang, Guanglai Gao, Haizhou Li:
Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion. CoRR abs/2305.16353 (2023) - [i156]Xinyi Chen, Qu Yang, Jibin Wu, Haizhou Li, Kay Chen Tan:
A Hybrid Neural Coding Approach for Pattern Recognition with Spiking Neural Networks. CoRR abs/2305.16594 (2023) - [i155]Zhenyu Weng, Huiping Zhuang, Haizhou Li, Zhiping Lin:
Constant Sequence Extension for Fast Search Using Weighted Hamming Distance. CoRR abs/2306.03612 (2023) - [i154]Junchen Lu, Berrak Sisman, Mingyang Zhang, Haizhou Li:
High-Quality Automatic Voice Over with Accurate Alignment: Supervision through Self-Supervised Discrete Speech Units. CoRR abs/2306.17005 (2023) - [i153]Shimin Zhang, Qu Yang, Chenxiang Ma, Jibin Wu, Haizhou Li, Kay Chen Tan:
Long Short-term Memory with Two-Compartment Spiking Neuron. CoRR abs/2307.07231 (2023) - [i152]Lingyi Yang, Feng Jiang, Haizhou Li:
Is ChatGPT Involved in Texts? Measure the Polish Ratio to Detect ChatGPT-Generated Text. CoRR abs/2307.11380 (2023) - [i151]Yaxin Fan, Feng Jiang, Peifeng Li, Haizhou Li:
GrammarGPT: Exploring Open-Source LLMs for Native Chinese Grammatical Error Correction with Supervised Fine-Tuning. CoRR abs/2307.13923 (2023) - [i150]Xidong Wang, Guiming Hardy Chen, Dingjie Song, Zhiyi Zhang, Zhihong Chen, Qingying Xiao, Feng Jiang, Jianquan Li, Xiang Wan, Benyou Wang, Haizhou Li:
CMB: A Comprehensive Medical Benchmark in Chinese. CoRR abs/2308.08833 (2023) - [i149]Shimin Zhang, Qu Yang, Chenxiang Ma, Jibin Wu, Haizhou Li, Kay Chen Tan:
TC-LIF: A Two-Compartment Spiking Neuron Model for Long-term Sequential Modelling. CoRR abs/2308.13250 (2023) - [i148]Hongxu Zhu, Siqi Cai, Yidi Jiang, Qiquan Zhang, Haizhou Li:
EEG-Derived Voice Signature for Attended Speaker Detection. CoRR abs/2308.14774 (2023) - [i147]Qinghua Liu, Meng Ge, Zhizheng Wu, Haizhou Li:
PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network. CoRR abs/2309.06723 (2023) - [i146]Chuang Li, Hengchang Hu, Yan Zhang, Min-Yen Kan, Haizhou Li:
A Conversation is Worth A Thousand Recommendations: A Survey of Holistic Conversational Recommender Systems. CoRR abs/2309.07682 (2023) - [i145]Junjie Li, Ruijie Tao, Zexu Pan, Meng Ge, Shuai Wang, Haizhou Li:
Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-talker Speech. CoRR abs/2309.08408 (2023) - [i144]Zeyang Song, Jibin Wu, Malu Zhang, Mike Zheng Shou, Haizhou Li:
Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks. CoRR abs/2309.09469 (2023) - [i143]Junyi Ao, Mehmet Sinan Yildirim, Meng Ge, Shuai Wang, Ruijie Tao, Yanmin Qian, Liqun Deng, Longshuai Xiao, Haizhou Li:
USED: Universal Speaker Extraction and Diarization. CoRR abs/2309.10674 (2023) - [i142]Rui Liu, Bin Liu, Haizhou Li:
Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech. CoRR abs/2309.11724 (2023) - [i141]Rui Liu, Jiatian Xi, Ziyue Jiang, Haizhou Li:
FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency. CoRR abs/2309.11725 (2023) - [i140]Shuai Wang, Qibing Bai, Qi Liu, Jianwei Yu, Zhengyang Chen, Bing Han, Yanmin Qian, Haizhou Li:
Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition. CoRR abs/2309.11730 (2023) - [i139]Huang Huang, Fei Yu, Jianqing Zhu, Xuening Sun, Hao Cheng, Dingjie Song, Zhihong Chen, Abdulmohsen Alharthi, Bang An, Ziche Liu, Zhiyi Zhang, Junying Chen, Jianquan Li, Benyou Wang, Lian Zhang, Ruoyu Sun, Xiang Wan, Haizhou Li, Jinchao Xu:
AceGPT, Localizing Large Language Models in Arabic. CoRR abs/2309.12053 (2023) - [i138]Tianchi Liu, Kong Aik Lee, Qiongqiong Wang, Haizhou Li:
Disentangling Voice and Content with Self-Supervision for Speaker Recognition. CoRR abs/2310.01128 (2023) - [i137]Chen Zhang, Luis Fernando D'Haro, Chengguang Tang, Ke Shi, Guohua Tang, Haizhou Li:
xDial-Eval: A Multilingual Open-Domain Dialogue Evaluation Benchmark. CoRR abs/2310.08958 (2023) - [i136]Chuang Li, Yan Zhang, Min-Yen Kan, Haizhou Li:
UNO-DST: Leveraging Unlabelled Data in Zero-Shot Dialogue State Tracking. CoRR abs/2310.10492 (2023) - [i135]Yu Chen, Xinyuan Qian, Zexu Pan, Kainan Chen, Haizhou Li:
LocSelect: Target Speaker Localization with an Auditory Selective Hearing Mechanism. CoRR abs/2310.10497 (2023) - [i134]Yaxin Fan, Feng Jiang, Peifeng Li, Haizhou Li:
Quantify Health-Related Atomic Knowledge in Chinese Medical Large Language Models: A Computational Analysis. CoRR abs/2310.11722 (2023) - [i133]Qu Yang, Malu Zhang, Jibin Wu, Kay Chen Tan, Haizhou Li:
LC-TTFS: Towards Lossless Network Conversion for Spiking Neural Networks with TTFS Coding. CoRR abs/2310.14978 (2023) - [i132]Yan Zhang, Zhaopeng Feng, Zhiyang Teng, Zuozhu Liu, Haizhou Li:
How Well Do Text Embedding Models Understand Syntax? CoRR abs/2311.07996 (2023) - [i131]Junying Chen, Xidong Wang, Anningzhe Gao, Feng Jiang, Shunian Chen, Hongbo Zhang, Dingjie Song, Wenya Xie, Chuyi Kong, Jianquan Li, Xiang Wan, Haizhou Li, Benyou Wang:
HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs. CoRR abs/2311.09774 (2023) - [i130]Tianchi Liu, Kong Aik Lee, Qiongqiong Wang, Haizhou Li:
Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification. CoRR abs/2312.03620 (2023) - [i129]Xueyao Zhang, Liumeng Xue, Yuancheng Wang, Yicheng Gu, Xi Chen, Zihao Fang, Haopeng Chen, Lexiao Zou, Chaoren Wang, Jun Han, Kai Chen, Haizhou Li, Zhizheng Wu:
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit. CoRR abs/2312.09911 (2023) - [i128]Rui Liu, Yifan Hu, Yi Ren, Xiang Yin, Haizhou Li:
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling. CoRR abs/2312.11947 (2023) - [i127]Chen Zhang, Luis Fernando D'Haro, Yiming Chen, Malu Zhang, Haizhou Li:
A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators. CoRR abs/2312.15407 (2023) - [i126]Meng Ge, Yizhou Peng, Yidi Jiang, Jingru Lin, Junyi Ao, Mehmet Sinan Yildirim, Shuai Wang, Haizhou Li, Mengling Feng:
The NUS-HLT System for ICASSP2024 ICMC-ASR Grand Challenge. CoRR abs/2312.16002 (2023)
- 2022
- [j143]Xianghu Yue, Jingru Lin, Fabian Ritter Gutierrez, Haizhou Li:
Self-Supervised Learning With Segmental Masking for Speech Representation. IEEE J. Sel. Top. Signal Process. 16(6): 1367-1379 (2022) - [j142]Hongqiang Du, Lei Xie, Haizhou Li:
Noise-robust voice conversion with domain adversarial training. Neural Networks 148: 74-84 (2022) - [j141]Jibin Wu, Chenglin Xu, Xiao Han, Daquan Zhou, Malu Zhang, Haizhou Li, Kay Chen Tan:
Progressive Tandem Learning for Pattern Recognition With Deep Spiking Neural Networks. IEEE Trans. Pattern Anal. Mach. Intell. 44(11): 7824-7840 (2022) - [j140]Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li:
Emotional voice conversion: Theory, databases and ESD. Speech Commun. 137: 1-18 (2022) - [j139]Hongning Zhu, Kong Aik Lee, Haizhou Li:
Discriminative speaker embedding with serialized multi-layer multi-head attention. Speech Commun. 144: 89-100 (2022) - [j138]Tianchi Liu, Rohan Kumar Das, Kong Aik Lee, Haizhou Li:
Neural Acoustic-Phonetic Approach for Speaker Verification With Phonetic Attention Mask. IEEE Signal Process. Lett. 29: 782-786 (2022) - [j137]Zexu Pan, Xinyuan Qian, Haizhou Li:
Speaker Extraction With Co-Speech Gestures Cue. IEEE Signal Process. Lett. 29: 1467-1471 (2022) - [j136]Haizhou Li:
A Unique ICASSP 2022: During an Unusual Time [Conference Highlights]. IEEE Signal Process. Mag. 39(2): 159-160 (2022) - [j135]Zexu Pan, Ruijie Tao, Chenglin Xu, Haizhou Li:
Selective Listening by Synchronizing Speech With Lips. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1650-1664 (2022) - [j134]Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li:
Decoding Knowledge Transfer for Neural Text-to-Speech Training. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1789-1802 (2022) - [j133]Xiaoxue Gao, Chitralekha Gupta, Haizhou Li:
Automatic Lyrics Transcription of Polyphonic Music With Lyrics-Chord Multi-Task Learning. IEEE ACM Trans. Audio Speech Lang. Process. 30: 2280-2294 (2022) - [j132]Chitralekha Gupta, Haizhou Li, Masataka Goto:
Deep Learning Approaches in Topics of Singing Information Processing. IEEE ACM Trans. Audio Speech Lang. Process. 30: 2422-2451 (2022) - [j131]Zexu Pan, Meng Ge, Haizhou Li:
USEV: Universal Speaker Extraction With Visual Cue. IEEE ACM Trans. Audio Speech Lang. Process. 30: 3032-3045 (2022) - [j130]Enze Su, Siqi Cai, Longhan Xie, Haizhou Li, Tanja Schultz:
STAnet: A Spatiotemporal Attention Network for Decoding Auditory Spatial Attention From EEG. IEEE Trans. Biomed. Eng. 69(7): 2233-2242 (2022) - [j129]Siqi Cai, Enze Su, Longhan Xie, Haizhou Li:
EEG-Based Auditory Attention Detection via Frequency and Channel Neural Attention. IEEE Trans. Hum. Mach. Syst. 52(2): 256-266 (2022) - [j128]Malu Zhang, Jiadong Wang, Jibin Wu, Ammar Belatreche, Burin Amornpaisannon, Zhixuan Zhang, Venkata Pavan Kumar Miriyala, Hong Qu, Yansong Chua, Trevor E. Carlson, Haizhou Li:
Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks. IEEE Trans. Neural Networks Learn. Syst. 33(5): 1947-1958 (2022) - [c654]Chen Zhang, Luis Fernando D'Haro, Thomas Friedrichs, Haizhou Li:
MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation. AAAI 2022: 11657-11666 - [c653]Jinming Zhao, Tenggan Zhang, Jingwen Hu, Yuchen Liu, Qin Jin, Xinchao Wang, Haizhou Li:
M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database. ACL (1) 2022: 5699-5710 - [c652]Bin Wang, C.-C. Jay Kuo, Haizhou Li:
Just Rank: Rethinking Evaluation with Word and Sentence Similarities. ACL (1) 2022: 6060-6077 - [c651]Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina González, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jáchym Kolár, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbeláez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard A. Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik:
Ego4D: Around the World in 3,000 Hours of Egocentric Video. CVPR 2022: 18973-18990 - [c650]Chen Zhang, Luis Fernando D'Haro, Qiquan Zhang, Thomas Friedrichs, Haizhou Li:
FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation. EMNLP 2022: 3336-3355 - [c649]Bin Wang, Chen Zhang, Yan Zhang, Yiming Chen, Haizhou Li:
Analyzing and Evaluating Faithfulness in Dialogue Summarization. EMNLP 2022: 4897-4908 - [c648]Yiming Chen, Yan Zhang, Bin Wang, Zuozhu Liu, Haizhou Li:
Generate, Discriminate and Contrast: A Semi-Supervised Sentence Representation Learning Framework. EMNLP 2022: 8150-8161 - [c647]Xiaoxue Gao, Chitralekha Gupta, Haizhou Li:
Genre-Conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music. ICASSP 2022: 791-795 - [c646]Marvin Borsdorf, Kevin Scheck, Haizhou Li, Tanja Schultz:
Experts Versus All-Rounders: Target Language Extraction for Multiple Target Languages. ICASSP 2022: 846-850 - [c645]Jinming Zhao, Ruichen Li, Qin Jin, Xinchao Wang, Haizhou Li:
Memobert: Pre-Training Model with Prompt-Based Learning for Multimodal Emotion Recognition. ICASSP 2022: 4703-4707 - [c644]Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamäki, Haizhou Li:
Self-Supervised Speaker Recognition with Loss-Gated Learning. ICASSP 2022: 6142-6146 - [c643]Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li:
L-SpEx: Localized Target Speaker Extraction. ICASSP 2022: 7287-7291 - [c642]Tianchi Liu, Rohan Kumar Das, Kong Aik Lee, Haizhou Li:
MFA: TDNN with Multi-Scale Frequency-Channel Attention for Text-Independent Speaker Verification with Short Utterances. ICASSP 2022: 7517-7521 - [c641]Qiquan Zhang, Qi Song, Zhaoheng Ni, Aaron Nicolson, Haizhou Li:
Time-Frequency Attention for Monaural Speech Enhancement. ICASSP 2022: 7852-7856 - [c640]Junchen Lu, Berrak Sisman, Rui Liu, Mingyang Zhang, Haizhou Li:
Visualtts: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over. ICASSP 2022: 8032-8036 - [c639]Jiadong Wang, Jibin Wu, Malu Zhang, Qi Liu, Haizhou Li:
A Hybrid Learning Framework for Deep Spiking Neural Networks with One-Spike Temporal Coding. ICASSP 2022: 8942-8946 - [c638]Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li:
ADD 2022: the first Audio Deep Synthesis Detection Challenge. ICASSP 2022: 9216-9220 - [c637]Marvin Borsdorf, Kevin Scheck, Haizhou Li, Tanja Schultz:
Blind Language Separation: Disentangling Multilingual Cocktail Party Voices by Language. INTERSPEECH 2022: 256-260 - [c636]Rui Wang, Qibing Bai, Junyi Ao, Long Zhou, Zhixiang Xiong, Zhihua Wei, Yu Zhang, Tom Ko, Haizhou Li:
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT. INTERSPEECH 2022: 1686-1690 - [c635]Zexu Pan, Meng Ge, Haizhou Li:
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction. INTERSPEECH 2022: 1786-1790 - [c634]Zongyang Du, Berrak Sisman, Kun Zhou, Haizhou Li:
Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion. INTERSPEECH 2022: 2603-2607 - [c633]Junyi Ao, Ziqiang Zhang, Long Zhou, Shujie Liu, Haizhou Li, Tom Ko, Lirong Dai, Jinyu Li, Yao Qian, Furu Wei:
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data. INTERSPEECH 2022: 2658-2662 - [c632]Qu Yang, Qi Liu, Haizhou Li:
Deep residual spiking neural network for keyword spotting in low-resource settings. INTERSPEECH 2022: 3023-3027 - [c631]Zeyang Song, Qi Liu, Qu Yang, Haizhou Li:
Knowledge distillation for In-memory keyword spotting model. INTERSPEECH 2022: 4128-4132 - [c630]Rui Liu, Berrak Sisman, Björn W. Schuller, Guanglai Gao, Haizhou Li:
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning. INTERSPEECH 2022: 5493-5497 - [c629]Jianhua Tao, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Liang, Pengyuan Zhang, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi:
DDAM '22: 1st International Workshop on Deepfake Detection for Audio Multimedia. ACM Multimedia 2022: 7405-7406 - [c628]Qu Yang, Jibin Wu, Malu Zhang, Yansong Chua, Xinchao Wang, Haizhou Li:
Training Spiking Neural Networks with Local Tandem Learning. NeurIPS 2022 - [c627]Peiwen Li, Enze Su, Jia Li, Siqi Cai, Longhan Xie, Haizhou Li:
Esaa: An Eeg-Speech Auditory Attention Detection Database. O-COCOSDA 2022: 1-6 - [e23]Rong Tong, Yanfeng Lu, Minghui Dong, Wengao Gong, Haizhou Li:
International Conference on Asian Language Processing, IALP 2022, Singapore, October 27-28, 2022. IEEE 2022, ISBN 978-1-6654-7674-4 [contents] - [e22]Svetlana Stoyanchev, Stefan Ultes, Haizhou Li:
Conversational AI for Natural Human-Centric Interaction - 12th International Workshop on Spoken Dialogue System Technology, IWSDS 2021, Singapore. Lecture Notes in Electrical Engineering 943, Springer 2022, ISBN 978-981-19-5537-2 [contents] - [e21]Jianhua Tao, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Lian, Pengyuan Zhang:
DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, Lisboa, Portugal, 14 October 2022. ACM 2022, ISBN 978-1-4503-9496-3 [contents] - [i125]Kun Zhou, Berrak Sisman, Rajib Rana, Björn W. Schuller, Haizhou Li:
Emotion Intensity and its Control for Emotional Voice Conversion. CoRR abs/2201.03967 (2022) - [i124]Hongqiang Du, Lei Xie, Haizhou Li:
Noise-robust voice conversion with domain adversarial training. CoRR abs/2201.10693 (2022) - [i123]Tianchi Liu, Rohan Kumar Das, Kong Aik Lee, Haizhou Li:
MFA: TDNN with Multi-scale Frequency-channel Attention for Text-independent Speaker Verification with Short Utterances. CoRR abs/2202.01624 (2022) - [i122]Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li, Zheng Lian, Bin Liu:
ADD 2022: the First Audio Deep Synthesis Detection Challenge. CoRR abs/2202.08433 (2022) - [i121]Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li:
L-SpEx: Localized Target Speaker Extraction. CoRR abs/2202.09995 (2022) - [i120]Bin Wang, C.-C. Jay Kuo, Haizhou Li:
Just Rank: Rethinking Evaluation with Word and Sentence Similarities. CoRR abs/2203.02679 (2022) - [i119]Rui Wang, Qibing Bai, Junyi Ao, Long Zhou, Zhixiang Xiong, Zhihua Wei, Yu Zhang, Tom Ko, Haizhou Li:
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT. CoRR abs/2203.15610 (2022) - [i118]Zexu Pan, Xinyuan Qian, Haizhou Li:
Speaker Extraction with Co-Speech Gestures Cue. CoRR abs/2203.16840 (2022) - [i117]Zexu Pan, Meng Ge, Haizhou Li:
A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction. CoRR abs/2203.16843 (2022) - [i116]Junyi Ao, Ziqiang Zhang, Long Zhou, Shujie Liu, Haizhou Li, Tom Ko, Lirong Dai, Jinyu Li, Yao Qian, Furu Wei:
Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data. CoRR abs/2203.17113 (2022) - [i115]Xiaoxue Gao, Chitralekha Gupta, Haizhou Li:
Genre-conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music. CoRR abs/2204.03307 (2022) - [i114]Jinming Zhao, Tenggan Zhang, Jingwen Hu, Yuchen Liu, Qin Jin, Xinchao Wang, Haizhou Li:
M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database. CoRR abs/2205.10237 (2022) - [i113]Rui Liu, Berrak Sisman, Björn W. Schuller, Guanglai Gao, Haizhou Li:
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning. CoRR abs/2206.07229 (2022) - [i112]Xiaoxue Gao, Chitralekha Gupta, Haizhou Li:
PoLyScribers: Joint Training of Vocal Extractor and Lyrics Transcriber for Polyphonic Music. CoRR abs/2207.07336 (2022) - [i111]Kun Zhou, Berrak Sisman, Rajib Rana, Björn W. Schuller, Haizhou Li:
Speech Synthesis with Mixed Emotions. CoRR abs/2208.05890 (2022) - [i110]Jiadong Wang, Xinyuan Qian, Haizhou Li:
Predict-and-Update Network: Audio-Visual Speech Recognition Inspired by Human Speech Perception. CoRR abs/2209.01768 (2022) - [i109]Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li:
Controllable Accented Text-to-Speech Synthesis. CoRR abs/2209.10804 (2022) - [i108]Qutang Cai, Guoqiang Hong, Zhijian Ye, Ximin Li, Haizhou Li:
The Kriston AI System for the VoxCeleb Speaker Recognition Challenge 2022. CoRR abs/2209.11433 (2022) - [i107]Bin Wang, Chen Zhang, Chengwei Wei, Haizhou Li:
A Focused Study on Sequence Length for Dialogue Summarization. CoRR abs/2209.11910 (2022) - [i106]Chutong Meng, Junyi Ao, Tom Ko, Mingxuan Wang, Haizhou Li:
CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning. CoRR abs/2210.04062 (2022) - [i105]Qu Yang, Jibin Wu, Malu Zhang, Yansong Chua, Xinchao Wang, Haizhou Li:
Training Spiking Neural Networks with Local Tandem Learning. CoRR abs/2210.04532 (2022) - [i104]Bin Wang, Chen Zhang, Yan Zhang, Yiming Chen, Haizhou Li:
Analyzing and Evaluating Faithfulness in Dialogue Summarization. CoRR abs/2210.11777 (2022) - [i103]Kun Zhou, Berrak Sisman, Carlos Busso, Haizhou Li:
Mixed Emotion Modelling for Emotional Voice Conversion. CoRR abs/2210.13756 (2022) - [i102]Chen Zhang, Luis Fernando D'Haro, Qiquan Zhang, Thomas Friedrichs, Haizhou Li:
FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation. CoRR abs/2210.13832 (2022) - [i101]Haolin Zuo, Rui Liu, Jinming Zhao, Guanglai Gao, Haizhou Li:
Exploiting modality-invariant feature for robust multimodal emotion recognition with missing modalities. CoRR abs/2210.15359 (2022) - [i100]Yifan Hu, Rui Liu, Guanglai Gao, Haizhou Li:
FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis. CoRR abs/2210.15360 (2022) - [i99]Rui Liu, Haolin Zuo, De Hu, Guanglai Gao, Haizhou Li:
Explicit Intensity Control for Accented Text-to-speech. CoRR abs/2210.15364 (2022) - [i98]Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamäki, Haizhou Li:
Self-Supervised Training of Speaker Encoder with Multi-Modal Diverse Positive Pairs. CoRR abs/2210.15385 (2022) - [i97]Ruijie Tao, Kong Aik Lee, Zhan Shi, Haizhou Li:
Speaker recognition with two-step multi-modal deep cleansing. CoRR abs/2210.15903 (2022) - [i96]Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li:
token2vec: A Joint Self-Supervised Pre-training Framework Using Unpaired Speech and Text. CoRR abs/2210.16755 (2022) - [i95]Yiming Chen, Yan Zhang, Bin Wang, Zuozhu Liu, Haizhou Li:
Generate, Discriminate and Contrast: A Semi-Supervised Sentence Representation Learning Framework. CoRR abs/2210.16798 (2022) - [i94]Zexu Pan, Wupeng Wang, Marvin Borsdorf, Haizhou Li:
ImagineNET: Target Speaker Extraction with Intermittent Visual Cue through Embedding Inpainting. CoRR abs/2211.00109 (2022) - [i93]Kong Aik Lee, Tomi Kinnunen, Daniele Colibro, Claudio Vair, Andreas Nautsch, Hanwu Sun, Liang He, Tianyu Liang, Qiongqiong Wang, Mickael Rouvier, Pierre-Michel Bousquet, Rohan Kumar Das, Ignacio Viñals Bailo, Meng Liu, Héctor Deldago, Xuechen Liu, Md. Sahidullah, Sandro Cumani, Boning Zhang, Koji Okabe, Hitoshi Yamamoto, Ruijie Tao, Haizhou Li, Alfonso Ortega Giménez, Longbiao Wang, Luis Buera:
I4U System Description for NIST SRE'20 CTS Challenge. CoRR abs/2211.01091 (2022) - [i92]Xiaoxue Gao, Xianghu Yue, Haizhou Li:
Self-Transriber: Few-shot Lyrics Transcription with Self-training. CoRR abs/2211.10152 (2022) - [i91]Jiawei Du, Yidi Jiang, Vincent Y. F. Tan, Joey Tianyi Zhou, Haizhou Li:
Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation. CoRR abs/2211.11004 (2022) - [i90]Bin Wang, Haizhou Li:
Relational Sentence Embedding for Flexible Semantic Matching. CoRR abs/2212.08802 (2022) - [i89]Chen Zhang, Luis Fernando D'Haro, Qiquan Zhang, Thomas Friedrichs, Haizhou Li:
PoE: a Panel of Experts for Generalized Automatic Dialogue Assessment. CoRR abs/2212.08992 (2022)
- 2021
- [j127]Jibin Wu, Qi Liu, Malu Zhang, Zihan Pan, Haizhou Li, Kay Chen Tan:
HuRAI: A brain-inspired computational model for human-robot auditory interface. Neurocomputing 465: 103-113 (2021) - [j126]Rui Liu, Berrak Sisman, Yixing Lin, Haizhou Li:
FastTalker: A neural text-to-speech architecture with shallow and group autoregression. Neural Networks 141: 306-314 (2021) - [j125]Hongqiang Du, Xiaohai Tian, Lei Xie, Haizhou Li:
Factorized WaveNet for voice conversion with limited data. Speech Commun. 130: 45-54 (2021) - [j124]Tharshini Gunendradasan, Eliathamby Ambikairajah, Julien Epps, Vidhyasaharan Sethu, Haizhou Li:
An adaptive transmission line cochlear model based front-end for replay attack detection. Speech Commun. 132: 114-122 (2021) - [j123]Bidisha Sharma, Xiaoxue Gao, Karthika Vijayan, Xiaohai Tian, Haizhou Li:
NHSS: A speech and singing parallel database. Speech Commun. 133: 9-22 (2021) - [j122]Xinyuan Qian, Qi Liu, Jiadong Wang, Haizhou Li:
Three-Dimensional Speaker Localization: Audio-Refined Visual Scaling Factor Estimation. IEEE Signal Process. Lett. 28: 1405-1409 (2021) - [j121]Berrak Sisman, Junichi Yamagishi, Simon King, Haizhou Li:
An Overview of Voice Conversion and Its Challenges: From Statistical Modeling to Deep Learning. IEEE ACM Trans. Audio Speech Lang. Process. 29: 132-157 (2021) - [j120]Rui Liu, Berrak Sisman, Feilong Bao, Jichen Yang, Guanglai Gao, Haizhou Li:
Exploiting Morphological and Phonological Features to Improve Prosodic Phrasing for Mongolian Speech Synthesis. IEEE ACM Trans. Audio Speech Lang. Process. 29: 274-285 (2021) - [j119]Mingyang Zhang, Yi Zhou, Li Zhao, Haizhou Li:
Transfer Learning From Speech Synthesis to Voice Conversion With Non-Parallel Training Data. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1290-1302 (2021) - [j118]Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li:
Expressive TTS Training With Frame and Style Reconstruction Loss. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1806-1818 (2021) - [j117]Chen Zhang, Grandee Lee, Luis Fernando D'Haro, Haizhou Li:
D-Score: Holistic Dialogue Evaluation Without Reference. IEEE ACM Trans. Audio Speech Lang. Process. 29: 2502-2516 (2021) - [j116]Zihan Pan, Malu Zhang, Jibin Wu, Jiadong Wang, Haizhou Li:
Multi-Tone Phase Coding of Interaural Time Difference for Sound Source Localization With Spiking Neural Networks. IEEE ACM Trans. Audio Speech Lang. Process. 29: 2656-2670 (2021) - [j115]Chenglin Xu, Wei Rao, Jibin Wu, Haizhou Li:
Target Speaker Verification With Selective Auditory Attention for Single and Multi-Talker Speech. IEEE ACM Trans. Audio Speech Lang. Process. 29: 2696-2709 (2021) - [j114]Yi Zhou, Xiaohai Tian, Haizhou Li:
Language Agnostic Speaker Embedding for Cross-Lingual Personalized Speech Generation. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3427-3439 (2021) - [c626]Yan Zhang, Ruidan He, Zuozhu Liu, Lidong Bing, Haizhou Li:
Bootstrapped Unsupervised Sentence Representation Learning. ACL/IJCNLP (1) 2021: 5168-5180 - [c625]Chen Zhang, Yiming Chen, Luis Fernando D'Haro, Yan Zhang, Thomas Friedrichs, Grandee Lee, Haizhou Li:
DynaEval: Unifying Turn and Dialogue Level Evaluation. ACL/IJCNLP (1) 2021: 5676-5689 - [c624]Jinhu Li, Chitralekha Gupta, Haizhou Li:
Training Explainable Singing Quality Assessment Network with Augmented Data. APSIPA ASC 2021: 904-911 - [c623]Chitralekha Gupta, Jinhu Li, Haizhou Li:
Towards Reference-Independent Rhythm Assessment of Solo Singing. APSIPA ASC 2021: 912-919 - [c622]Yi Ma, Kong Aik Lee, Ville Hautamäki, Haizhou Li:
PL-EESR: Perceptual Loss Based End-to-End Robust Speaker Representation Extraction. ASRU 2021: 106-113 - [c621]Bidisha Sharma, Maulik C. Madhavi, Xuehao Zhou, Haizhou Li:
Exploring Teacher-Student Learning Approach for Multi-Lingual Speech-to-Intent Classification. ASRU 2021: 419-426 - [c620]Zongyang Du, Berrak Sisman, Kun Zhou, Haizhou Li:
Expressive Voice Conversion: A Joint Framework for Speaker Identity and Emotional Style Transfer. ASRU 2021: 594-601 - [c619]Sergey Nikonorov, Berrak Sisman, Mingyang Zhang, Haizhou Li:
DEEPA: A Deep Neural Analyzer for Speech and Singing Vocoding. ASRU 2021: 618-625 - [c618]Marvin Borsdorf, Haizhou Li, Tanja Schultz:
Target Language Extraction at Multilingual Cocktail Parties. ASRU 2021: 717-724 - [c617]Mingyang Zhang, Xuehao Zhou, Kun Zhou, Rui Liu, Perry Lam, Berrak Sisman, Haizhou Li:
SUTD-NUS System for Blizzard Challenge 2021. Blizzard Challenge 2021 - [c616]Enze Su, Siqi Cai, Peiwen Li, Longhan Xie, Haizhou Li:
Auditory Attention Detection with EEG Channel Attention. EMBC 2021: 5804-5807 - [c615]Siqi Cai, Pengcheng Sun, Tanja Schultz, Haizhou Li:
Low-Latency Auditory Spatial Attention Detection Based on Spectro-Spatial Features from EEG. EMBC 2021: 5812-5815 - [c614]Yiming Chen, Yan Zhang, Chen Zhang, Grandee Lee, Ran Cheng, Haizhou Li:
Revisiting Self-training for Few-shot Learning of Language Model. EMNLP (1) 2021: 9125-9135 - [c613]Nana Hou, Chenglin Xu, Eng Siong Chng, Haizhou Li:
Learning Disentangled Feature Representations for Speech Enhancement Via Adversarial Training. ICASSP 2021: 666-670 - [c612]Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li:
Seen and Unseen Emotional Style Transfer for Voice Conversion with A New Emotional Speech Dataset. ICASSP 2021: 920-924 - [c611]Xinyuan Qian, Maulik C. Madhavi, Zexu Pan, Jiadong Wang, Haizhou Li:
Multi-Target DoA Estimation with an Audio-Visual Fusion Mechanism. ICASSP 2021: 4280-4284 - [c610]Rui Liu, Berrak Sisman, Haizhou Li:
Graphspeech: Syntax-Aware Graph Attention Network for Neural Speech Synthesis. ICASSP 2021: 6059-6063 - [c609]Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li:
Multi-Stage Speaker Extraction with Utterance and Frame-Level Reference Signals. ICASSP 2021: 6109-6113 - [c608]Lili Guo, Longbiao Wang, Chenglin Xu, Jianwu Dang, Eng Siong Chng, Haizhou Li:
Representation Learning with Spectro-Temporal-Channel Attention for Speech Emotion Recognition. ICASSP 2021: 6304-6308 - [c607]Rohan Kumar Das, Jichen Yang, Haizhou Li:
Data Augmentation with Signal Companding for Detection of Logical Access Attacks. ICASSP 2021: 6349-6353 - [c606]Zexu Pan, Ruijie Tao, Chenglin Xu, Haizhou Li:
Muse: Multi-Modal Target Speaker Extraction with Visual Cues. ICASSP 2021: 6678-6682 - [c605]Bidisha Sharma, Maulik C. Madhavi, Haizhou Li:
Leveraging Acoustic and Linguistic Embeddings from Pretrained Speech and Language Models for Intent Classification. ICASSP 2021: 7498-7502 - [c604]Qicong Xie, Xiaohai Tian, Guanghou Liu, Kun Song, Lei Xie, Zhiyong Wu, Hai Li, Song Shi, Haizhou Li, Fen Hong, Hui Bu, Xin Xu:
The Multi-Speaker Multi-Style Voice Cloning Challenge 2021. ICASSP 2021: 8613-8617 - [c603]Huiping Zhuang, Zhenyu Weng, Fulin Luo, Kar-Ann Toj, Haizhou Li, Zhiping Lin:
Accumulated Decoupled Learning with Gradient Staleness Mitigation for Convolutional Neural Networks. ICML 2021: 12935-12944 - [c602]Jiadong Wang, Xinyuan Qian, Zihan Pan, Malu Zhang, Haizhou Li:
GCC-PHAT with Speech-oriented Attention for Robotic Sound Source Localization. ICRA 2021: 5876-5883 - [c601]Qu Yang, Jibin Wu, Haizhou Li:
Rethinking Benchmarks for Neuromorphic Learning Algorithms. IJCNN 2021: 1-8 - [c600]Hongning Zhu, Kong Aik Lee, Haizhou Li:
Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding. Interspeech 2021: 106-110 - [c599]Qiquan Zhang, Qi Song, Aaron Nicolson, Tian Lan, Haizhou Li:
Temporal Convolutional Network with Frequency Dimension Adaptive Attention for Speech Enhancement. Interspeech 2021: 166-170 - [c598]Xianghu Yue, Haizhou Li:
Phonetically Motivated Self-Supervised Speech Representation Learning. Interspeech 2021: 746-750 - [c597]Kun Zhou, Berrak Sisman, Haizhou Li:
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-Stage Sequence-to-Sequence Training. Interspeech 2021: 811-815 - [c596]Rohan Kumar Das, Maulik C. Madhavi, Haizhou Li:
Diagnosis of COVID-19 Using Auditory Acoustic Cues. Interspeech 2021: 921-925 - [c595]Li Zhang, Qing Wang, Kong Aik Lee, Lei Xie, Haizhou Li:
Multi-Level Transfer Learning from Near-Field to Far-Field Speaker Verification. Interspeech 2021: 1094-1098 - [c594]Yi Zhou, Xiaohai Tian, Zhizheng Wu, Haizhou Li:
Cross-Lingual Voice Conversion with a Cycle Consistency Loss on Linguistic Representation. Interspeech 2021: 1374-1378 - [c593]Marvin Borsdorf, Chenglin Xu, Haizhou Li, Tanja Schultz:
Universal Speaker Extraction in the Presence and Absence of Target Speakers for Speech of One and Two Talkers. Interspeech 2021: 1469-1473 - [c592]Wupeng Wang, Chenglin Xu, Meng Ge, Haizhou Li:
Neural Speaker Extraction with Speaker-Speech Cross-Attention Network. Interspeech 2021: 3535-3539 - [c591]Marvin Borsdorf, Chenglin Xu, Haizhou Li, Tanja Schultz:
GlobalPhone Mix-To-Separate Out of 2: A Multilingual 2000 Speakers Mixtures Database for Speech Separation. Interspeech 2021: 3905-3909 - [c590]Rui Liu, Berrak Sisman, Haizhou Li:
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability. Interspeech 2021: 4648-4652 - [c589]Yidi Jiang, Bidisha Sharma, Maulik C. Madhavi, Haizhou Li:
Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification. Interspeech 2021: 4713-4717 - [c588]Meidan Ouyang, Rohan Kumar Das, Jichen Yang, Haizhou Li:
Capsule Network based End-to-end System for Detection of Replay Attacks. ISCSLP 2021: 1-5 - [c587]Chen Zhang, Luis Fernando D'Haro, Yiming Chen, Thomas Friedrichs, Haizhou Li:
Investigating the Impact of Pre-trained Language Models on Dialog Evaluation. IWSDS 2021: 291-306 - [c586]Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou, Haizhou Li:
Is Someone Speaking?: Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection. ACM Multimedia 2021: 3927-3935 - [c585]Xinyuan Qian, Bidisha Sharma, Amine El Abridi, Haizhou Li:
SLoClas: A Database for Joint Sound Localization and Classification. O-COCOSDA 2021: 128-133 - [c584]Haizhou Li, Gina-Anne Levow, Zhou Yu, Chitralekha Gupta, Berrak Sisman, Siqi Cai, David Vandyke, Nina Dethlefs, Yan Wu, Junyi Jessy Li:
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue. SIGDIAL 2021 - [c583]Kun Zhou, Berrak Sisman, Haizhou Li:
VAW-GAN for Disentanglement and Recomposition of Emotional Elements in Speech. SLT 2021: 415-422 - [c582]Hongqiang Du, Xiaohai Tian, Lei Xie, Haizhou Li:
Optimizing Voice Conversion Network with Cycle Consistency Loss of Speaker Identity. SLT 2021: 507-513 - [e20]Deyi Xiong, Ridong Jiang, Yanfeng Lu, Minghui Dong, Haizhou Li:
International Conference on Asian Language Processing, IALP 2021, Singapore, December 11-13, 2021. IEEE 2021, ISBN 978-1-6654-8311-7 [contents] - [e19]Erik Marchi, Sabato Marco Siniscalchi, Sandro Cumani, Valerio Mario Salerno, Haizhou Li:
Increasing Naturalness and Flexibility in Spoken Dialogue Interaction - 10th International Workshop on Spoken Dialogue Systems, IWSDS 2019, Syracuse, Sicily, Italy, 24-26 April 2019. Lecture Notes in Electrical Engineering 714, Springer 2021, ISBN 978-981-15-9322-2 [contents] - [e18]Haizhou Li, Gina-Anne Levow, Zhou Yu, Chitralekha Gupta, Berrak Sisman, Siqi Cai, David Vandyke, Nina Dethlefs, Yan Wu, Junyi Jessy Li:
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGdial 2021, Singapore and Online, July 29-31, 2021. Association for Computational Linguistics 2021, ISBN 978-1-954085-81-7 [contents] - [e17]Haizhou Li, Shuzhi Sam Ge, Yan Wu, Agnieszka Wykowska, Hongsheng He, Xiaorui Liu, Dongyu Li, Jairo Pérez-Osorio:
Social Robotics - 13th International Conference, ICSR 2021, Singapore, November 10-13, 2021, Proceedings. Lecture Notes in Computer Science 13086, Springer 2021, ISBN 978-3-030-90524-8 [contents] - [i88]Bidisha Sharma, Maulik C. Madhavi, Haizhou Li:
Leveraging Acoustic and Linguistic Embeddings from Pretrained Speech and Language Models for Intent Classification. CoRR abs/2102.07370 (2021) - [i87]Siqi Cai, Pengcheng Sun, Tanja Schultz, Haizhou Li:
Low-latency auditory spatial attention detection based on spectro-spatial features from EEG. CoRR abs/2103.03621 (2021) - [i86]Chenglin Xu, Wei Rao, Jibin Wu, Haizhou Li:
Target Speaker Verification with Selective Auditory Attention for Single and Multi-talker Speech. CoRR abs/2103.16269 (2021) - [i85]Kun Zhou, Berrak Sisman, Haizhou Li:
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training. CoRR abs/2103.16809 (2021) - [i84]Rui Liu, Berrak Sisman, Haizhou Li:
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability. CoRR abs/2104.01408 (2021) - [i83]Xinyuan Qian, Maulik C. Madhavi, Zexu Pan, Jiadong Wang, Haizhou Li:
Multi-target DoA Estimation with an Audio-visual Fusion Mechanism. CoRR abs/2105.06107 (2021) - [i82]Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li:
Emotional Voice Conversion: Theory, Databases and ESD. CoRR abs/2105.14762 (2021) - [i81]Chen Zhang, Yiming Chen, Luis Fernando D'Haro, Yan Zhang, Thomas Friedrichs, Grandee Lee, Haizhou Li:
DynaEval: Unifying Turn and Dialogue Level Evaluation. CoRR abs/2106.01112 (2021) - [i80]Li Zhang, Qing Wang, Kong Aik Lee, Lei Xie, Haizhou Li:
Multi-Level Transfer Learning from Near-Field to Far-Field Speaker Verification. CoRR abs/2106.09320 (2021) - [i79]Zongyang Du, Berrak Sisman, Kun Zhou, Haizhou Li:
Expressive Voice Conversion: A Joint Framework for Speaker Identity and Emotional Style Transfer. CoRR abs/2107.03748 (2021) - [i78]Hongning Zhu, Kong Aik Lee, Haizhou Li:
Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding. CoRR abs/2107.06493 (2021) - [i77]Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou, Haizhou Li:
Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection. CoRR abs/2107.06592 (2021) - [i76]Xinyuan Qian, Bidisha Sharma, Amine El Abridi, Haizhou Li:
SLoClas: A Database for Joint Sound Localization and Classification. CoRR abs/2108.02539 (2021) - [i75]Yidi Jiang, Bidisha Sharma, Maulik C. Madhavi, Haizhou Li:
Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification. CoRR abs/2108.02598 (2021) - [i74]Bidisha Sharma, Maulik C. Madhavi, Xuehao Zhou, Haizhou Li:
Exploring Teacher-Student Learning Approach for Multi-lingual Speech-to-Intent Classification. CoRR abs/2109.13486 (2021) - [i73]Zexu Pan, Meng Ge, Haizhou Li:
USEV: Universal Speaker Extraction with Visual Cue. CoRR abs/2109.14831 (2021) - [i72]Yi Ma, Kong Aik Lee, Ville Hautamäki, Haizhou Li:
PL-EESR: Perceptual Loss Based END-TO-END Robust Speaker Representation Extraction. CoRR abs/2110.00940 (2021) - [i71]Yiming Chen, Yan Zhang, Chen Zhang, Grandee Lee, Ran Cheng, Haizhou Li:
Revisiting Self-Training for Few-Shot Learning of Language Model. CoRR abs/2110.01256 (2021) - [i70]Chen Zhang, Luis Fernando D'Haro, Yiming Chen, Thomas Friedrichs, Haizhou Li:
Investigating the Impact of Pre-trained Language Models on Dialog Evaluation. CoRR abs/2110.01895 (2021) - [i69]Rui Liu, Berrak Sisman, Haizhou Li:
StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis. CoRR abs/2110.03156 (2021) - [i68]Junchen Lu, Berrak Sisman, Rui Liu, Mingyang Zhang, Haizhou Li:
VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over. CoRR abs/2110.03342 (2021) - [i67]Sergey Nikonorov, Berrak Sisman, Mingyang Zhang, Haizhou Li:
DeepA: A Deep Neural Analyzer For Speech And Singing Vocoding. CoRR abs/2110.06434 (2021) - [i66]Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Christian Fuegen, Abrham Gebreselasie, Cristina González, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jáchym Kolár, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Yunyi Zhu, Pablo Arbeláez, David Crandall, Dima Damen, Giovanni Maria Farinella, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard A. Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik:
Ego4D: Around the World in 3,000 Hours of Egocentric Video. CoRR abs/2110.07058 (2021) - [i65]Zongyang Du, Berrak Sisman, Kun Zhou, Haizhou Li:
Identity Conversion for Emotional Speakers: A Study for Disentanglement of Emotion Style and Speaker Identity. CoRR abs/2110.10326 (2021) - [i64]Jinming Zhao, Ruichen Li, Qin Jin, Xinchao Wang, Haizhou Li:
MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition. CoRR abs/2111.00865 (2021) - [i63]Qiquan Zhang, Qi Song, Zhaoheng Ni, Aaron Nicolson, Haizhou Li:
Time-Frequency Attention for Monaural Speech Enhancement. CoRR abs/2111.07518 (2021) - [i62]Chen Zhang, Luis Fernando D'Haro, Thomas Friedrichs, Haizhou Li:
MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation. CoRR abs/2112.07194 (2021)
- 2020
- [j113]Kong Aik Lee, Seyed Omid Sadjadi, Haizhou Li, Douglas A. Reynolds:
Two decades into Speaker Recognition Evaluation - are we there yet? Comput. Speech Lang. 61: 101058 (2020) - [j112]Malu Zhang, Jibin Wu, Ammar Belatreche, Zihan Pan, Xiurui Xie, Yansong Chua, Guoqi Li, Hong Qu, Haizhou Li:
Supervised learning in spiking neural networks with synaptic delay-weight plasticity. Neurocomputing 409: 103-118 (2020) - [j111]Malu Zhang, Xiaoling Luo, Yi Chen, Jibin Wu, Ammar Belatreche, Zihan Pan, Hong Qu, Haizhou Li:
An Efficient Threshold-Driven Aggregate-Label Learning Algorithm for Multimodal Information Processing. IEEE J. Sel. Top. Signal Process. 14(3): 592-602 (2020) - [j110]Mingyang Zhang, Berrak Sisman, Li Zhao, Haizhou Li:
DeepConversion: Voice conversion with limited parallel training data. Speech Commun. 122: 31-43 (2020) - [j109]Yi Zhou, Xiaohai Tian, Haizhou Li:
Multi-Task WaveRNN With an Integrated Architecture for Cross-Lingual Voice Conversion. IEEE Signal Process. Lett. 27: 1310-1314 (2020) - [j108]Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li:
Modeling Prosodic Phrasing With Multi-Task Learning in Tacotron-Based TTS. IEEE Signal Process. Lett. 27: 1470-1474 (2020) - [j107]Chitralekha Gupta, Haizhou Li, Ye Wang:
Automatic Leaderboard: Evaluation of Singing Quality Without a Standard Reference. IEEE ACM Trans. Audio Speech Lang. Process. 28: 13-26 (2020) - [j106]Chenglin Xu, Wei Rao, Eng Siong Chng, Haizhou Li:
SpEx: Multi-Scale Time Domain Speaker Extraction Network. IEEE ACM Trans. Audio Speech Lang. Process. 28: 1370-1384 (2020) - [j105]Jichen Yang, Rohan Kumar Das, Haizhou Li:
Significance of Subband Features for Synthetic Speech Detection. IEEE Trans. Inf. Forensics Secur. 15: 2160-2170 (2020) - [c581]Grandee Lee, Haizhou Li:
Modeling Code-Switch Languages Using Bilingual Parallel Corpus. ACL 2020: 860-870 - [c580]Lin Huang, Chitralekha Gupta, Haizhou Li:
Spectral Features and Pitch Histogram for Automatic Singing Quality Evaluation with CRNN. APSIPA 2020: 492-499 - [c579]Zongyang Du, Kun Zhou, Berrak Sisman, Haizhou Li:
Spectrum and Prosody Conversion for Cross-lingual Voice Conversion with CycleGAN. APSIPA 2020: 507-513 - [c578]Junchen Lu, Kun Zhou, Berrak Sisman, Haizhou Li:
VAW-GAN for Singing Voice Conversion with Non-parallel Training Data. APSIPA 2020: 514-519 - [c577]Rohan Kumar Das, Ruijie Tao, Jichen Yang, Wei Rao, Cheng Yu, Haizhou Li:
HLT-NUS Submission for 2019 NIST Multimedia Speaker Recognition Evaluation. APSIPA 2020: 605-609 - [c576]Rohan Kumar Das, Haizhou Li:
Classification of Speech with and without Face Mask using Acoustic Features. APSIPA 2020: 747-752 - [c575]Yi Zhou, Xiaohai Tian, Xuehao Zhou, Mingyang Zhang, Grandee Lee, Rui Liu, Berrak Sisman, Haizhou Li:
NUS-HLT System for Blizzard Challenge 2020. Blizzard Challenge / Voice Conversion Challenge 2020 - [c574]Xiaohai Tian, Zhichao Wang, Shan Yang, Xinyong Zhou, Hongqiang Du, Yi Zhou, Mingyang Zhang, Kun Zhou, Berrak Sisman, Lei Xie, Haizhou Li:
The NUS & NWPU system for Voice Conversion Challenge 2020. Blizzard Challenge / Voice Conversion Challenge 2020 - [c573]Wanqiu Lin, Maulik C. Madhavi, Rohan Kumar Das, Haizhou Li:
Transformer-based Arabic Dialect Identification. IALP 2020: 192-196 - [c572]Zhenyu Weng, Yuesheng Zhu, Zhiping Lin, Haizhou Li:
Real-Time Multiple Object Tracking with Discriminative Features. ICARCV 2020: 309-314 - [c571]Xinggan Peng, Huiping Zhuang, Guang-Bin Huang, Haizhou Li, Zhiping Lin:
Robust Real-time Face Tracking for People Wearing Face Masks. ICARCV 2020: 779-783 - [c570]Chitralekha Gupta, Emre Yilmaz, Haizhou Li:
Automatic Lyrics Alignment and Transcription in Polyphonic Music: Does Background Music Help? ICASSP 2020: 496-500 - [c569]Xiang Hao, Chenglin Xu, Nana Hou, Lei Xie, Eng Siong Chng, Haizhou Li:
Time-Domain Neural Network Approach for Speech Bandwidth Extension. ICASSP 2020: 866-870 - [c568]Rui Liu, Berrak Sisman, Jingdong Li, Feilong Bao, Guanglai Gao, Haizhou Li:
Teacher-Student Training For Robust Tacotron-Based TTS. ICASSP 2020: 6274-6278 - [c567]Rohan Kumar Das, Jichen Yang, Haizhou Li:
Assessing the Scope of Generalized Countermeasures for Anti-Spoofing. ICASSP 2020: 6589-6593 - [c566]Van Tung Pham, Haihua Xu, Yerbolat Khassanov, Zhiping Zeng, Eng Siong Chng, Chongjia Ni, Bin Ma, Haizhou Li:
Independent Language Modeling Architecture for End-To-End ASR. ICASSP 2020: 7059-7063 - [c565]Rohan Kumar Das, Haizhou Li:
On the Importance of Vocal Tract Constriction for Speaker Characterization: The Whispered Speech Study. ICASSP 2020: 7119-7123 - [c564]Xuehao Zhou, Xiaohai Tian, Grandee Lee, Rohan Kumar Das, Haizhou Li:
End-to-End Code-Switching TTS with Cross-Lingual Language Model. ICASSP 2020: 7614-7618 - [c563]Hongqiang Du, Xiaohai Tian, Lei Xie, Haizhou Li:
Effective Wavenet Adaptation for Voice Conversion with Limited Data. ICASSP 2020: 7779-7783 - [c562]Zexu Pan, Zhaojie Luo, Jichen Yang, Haizhou Li:
Multi-Modal Attention for Speech Emotion Recognition. INTERSPEECH 2020: 364-368 - [c561]Xinyuan Zhou, Emre Yilmaz, Yanhua Long, Yijie Li, Haizhou Li:
Multi-Encoder-Decoder Transformer for Code-Switching Speech Recognition. INTERSPEECH 2020: 1042-1046 - [c560]Zhenzong Wu, Rohan Kumar Das, Jichen Yang, Haizhou Li:
Light Convolutional Neural Network with Feature Genuinization for Detection of Synthetic Speech Attacks. INTERSPEECH 2020: 1101-1105 - [c559]Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li:
SpEx+: A Complete Time Domain Speaker Extraction Network. INTERSPEECH 2020: 1406-1410 - [c558]Ruijie Tao, Rohan Kumar Das, Haizhou Li:
Audio-Visual Speaker Recognition with a Cross-Modal Discriminative Network. INTERSPEECH 2020: 2242-2246 - [c557]Emre Yilmaz, Özgür Bora Gevrek, Jibin Wu, Yuxiang Chen, Xuanbo Meng, Haizhou Li:
Deep Convolutional Spiking Neural Networks for Keyword Spotting. INTERSPEECH 2020: 2557-2561 - [c556]Siqi Cai, Enze Su, Yonghao Song, Longhan Xie, Haizhou Li:
Low Latency Auditory Attention Detection with Common Spatial Pattern Analysis of EEG Signals. INTERSPEECH 2020: 2772-2776 - [c555]Kun Zhou, Berrak Sisman, Mingyang Zhang, Haizhou Li:
Converting Anyone's Emotion: Towards Speaker-Independent Emotional Voice Conversion. INTERSPEECH 2020: 3416-3420 - [c554]Xiaoyi Qin, Ming Li, Hui Bu, Wei Rao, Rohan Kumar Das, Shrikanth Narayanan, Haizhou Li:
The INTERSPEECH 2020 Far-Field Speaker Verification Challenge. INTERSPEECH 2020: 3456-3460 - [c553]Nana Hou, Chenglin Xu, Van Tung Pham, Joey Tianyi Zhou, Eng Siong Chng, Haizhou Li:
Speaker and Phoneme-Aware Speech Bandwidth Extension with Residual Dual-Path Network. INTERSPEECH 2020: 4064-4068 - [c552]Nana Hou, Chenglin Xu, Joey Tianyi Zhou, Eng Siong Chng, Haizhou Li:
Multi-Task Learning for End-to-End Noise-Robust Bandwidth Extension. INTERSPEECH 2020: 4069-4073 - [c551]Rohan Kumar Das, Xiaohai Tian, Tomi Kinnunen, Haizhou Li:
The Attacker's Perspective on Automatic Speaker Verification: An Overview. INTERSPEECH 2020: 4213-4217 - [c550]Tianchi Liu, Rohan Kumar Das, Maulik C. Madhavi, Shengmei Shen, Haizhou Li:
Speaker-Utterance Dual Attention for Speaker and Utterance Verification. INTERSPEECH 2020: 4293-4297 - [c549]Xinyuan Zhou, Grandee Lee, Emre Yilmaz, Yanhua Long, Jiaen Liang, Haizhou Li:
Self-and-Mixed Attention Decoder with Deep Acoustic Structure for Transformer-Based LVCSR. INTERSPEECH 2020: 5016-5020 - [c548]Chitralekha Gupta, Lin Huang, Haizhou Li:
Automatic Rank-Ordering of Singing Vocals with Twin-Neural Network. ISMIR 2020: 416-423 - [c547]Chen Zhang, Luis Fernando D'Haro, Rafael E. Banchs, Thomas Friedrichs, Haizhou Li:
Deep AM-FM: Toolkit for Automatic Dialogue Evaluation. IWSDS 2020: 53-69 - [c546]Xiaohai Tian, Rohan Kumar Das, Haizhou Li:
Black-box Attacks on Automatic Speaker Verification using Feedback-controlled Voice Conversion. Odyssey 2020: 159-164 - [c545]Kun Zhou, Berrak Sisman, Haizhou Li:
Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data. Odyssey 2020: 230-237 - [c544]Berrak Sisman, Haizhou Li:
Generative Adversarial Networks for Singing Voice Conversion with and without Parallel Data. Odyssey 2020: 238-244 - [c543]Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li:
WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss. Odyssey 2020: 245-251 - [c542]Xiaoxue Gao, Xiaohai Tian, Yi Zhou, Rohan Kumar Das, Haizhou Li:
Personalized Singing Voice Generation Using WaveRNN. Odyssey 2020: 252-258 - [i61]Kun Zhou, Berrak Sisman, Haizhou Li:
Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data. CoRR abs/2002.00198 (2020) - [i60]Xiaoyi Qin, Ming Li, Hui Bu, Rohan Kumar Das, Wei Rao, Shrikanth Narayanan, Haizhou Li:
The FFSVC 2020 Evaluation Plan. CoRR abs/2002.00387 (2020) - [i59]Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li:
WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss. CoRR abs/2002.00417 (2020) - [i58]Malu Zhang, Jiadong Wang, Zhixuan Zhang, Ammar Belatreche, Jibin Wu, Yansong Chua, Hong Qu, Haizhou Li:
Spike-Timing-Dependent Back Propagation in Deep Spiking Neural Networks. CoRR abs/2003.11837 (2020) - [i57]Chenglin Xu, Wei Rao, Eng Siong Chng, Haizhou Li:
SpEx: Multi-Scale Time Domain Speaker Extraction Network. CoRR abs/2004.08326 (2020) - [i56]Rohan Kumar Das, Xiaohai Tian, Tomi Kinnunen, Haizhou Li:
The Attacker's Perspective on Automatic Speaker Verification: An Overview. CoRR abs/2004.08849 (2020) - [i55]Chenglin Xu, Wei Rao, Eng Siong Chng, Haizhou Li:
Time-domain speaker extraction network. CoRR abs/2004.14762 (2020) - [i54]Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li:
SpEx+: A Complete Time Domain Speaker Extraction Network. CoRR abs/2005.04686 (2020) - [i53]Kun Zhou, Berrak Sisman, Mingyang Zhang, Haizhou Li:
Converting Anyone's Emotion: Towards Speaker-Independent Emotional Voice Conversion. CoRR abs/2005.07025 (2020) - [i52]Xiaoyi Qin, Ming Li, Hui Bu, Wei Rao, Rohan Kumar Das, Shrikanth Narayanan, Haizhou Li:
The INTERSPEECH 2020 Far-Field Speaker Verification Challenge. CoRR abs/2005.08046 (2020) - [i51]Srivatsa P, Kyle Timothy Ng Chu, Yaswanth Tavva, Jibin Wu, Malu Zhang, Haizhou Li, Trevor E. Carlson:
You Only Spike Once: Improving Energy-Efficient Neuromorphic Inference to ANN-Level Accuracy. CoRR abs/2006.09982 (2020) - [i50]Xinyuan Zhou, Grandee Lee, Emre Yilmaz, Yanhua Long, Jiaen Liang, Haizhou Li:
Self-and-Mixed Attention Decoder with Deep Acoustic Structure for Transformer-based LVCSR. CoRR abs/2006.10407 (2020) - [i49]Xinyuan Zhou, Emre Yilmaz, Yanhua Long, Yijie Li, Haizhou Li:
Multi-Encoder-Decoder Transformer for Code-Switching Speech Recognition. CoRR abs/2006.10414 (2020) - [i48]Jibin Wu, Chenglin Xu, Daquan Zhou, Haizhou Li, Kay Chen Tan:
Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks. CoRR abs/2007.01204 (2020) - [i47]Zihan Pan, Malu Zhang, Jibin Wu, Haizhou Li:
Multi-Tones' Phase Coding (MTPC) of Interaural Time Difference by Spiking Neural Network. CoRR abs/2007.03274 (2020) - [i46]Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li:
Expressive TTS Training with Frame and Style Reconstruction Loss. CoRR abs/2008.01490 (2020) - [i45]Berrak Sisman, Junichi Yamagishi, Simon King, Haizhou Li:
An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning. CoRR abs/2008.03648 (2020) - [i44]Junchen Lu, Kun Zhou, Berrak Sisman, Haizhou Li:
VAW-GAN for Singing Voice Conversion with Non-parallel Training Data. CoRR abs/2008.03992 (2020) - [i43]Zongyang Du, Kun Zhou, Berrak Sisman, Haizhou Li:
Spectrum and Prosody Conversion for Cross-lingual Voice Conversion with CycleGAN. CoRR abs/2008.04562 (2020) - [i42]Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li:
Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS. CoRR abs/2008.05284 (2020) - [i41]Tianchi Liu, Rohan Kumar Das, Maulik C. Madhavi, Shengmei Shen, Haizhou Li:
Speaker-Utterance Dual Attention for Speaker and Utterance Verification. CoRR abs/2008.08901 (2020) - [i40]Zexu Pan, Zhaojie Luo, Jichen Yang, Haizhou Li:
Multi-modal Attention for Speech Emotion Recognition. CoRR abs/2009.04107 (2020) - [i39]Mingyang Zhang, Yi Zhou, Li Zhao, Haizhou Li:
Transfer Learning from Speech Synthesis to Voice Conversion with Non-Parallel Training Data. CoRR abs/2009.14399 (2020) - [i38]Rohan Kumar Das, Ruijie Tao, Jichen Yang, Wei Rao, Cheng Yu, Haizhou Li:
HLT-NUS Submission for NIST 2019 Multimedia Speaker Recognition Evaluation. CoRR abs/2010.03905 (2020) - [i37]Rohan Kumar Das, Haizhou Li:
Classification of Speech with and without Face Mask using Acoustic Features. CoRR abs/2010.03907 (2020) - [i36]Zexu Pan, Ruijie Tao, Chenglin Xu, Haizhou Li:
Muse: Multi-modal target speaker extraction with visual cues. CoRR abs/2010.07775 (2020) - [i35]Rui Liu, Berrak Sisman, Haizhou Li:
GraphSpeech: Syntax-Aware Graph Attention Network For Neural Speech Synthesis. CoRR abs/2010.12423 (2020) - [i34]Kun Zhou, Berrak Sisman, Rui Liu, Haizhou Li:
Seen and Unseen emotional style transfer for voice conversion with a new emotional speech dataset. CoRR abs/2010.14794 (2020) - [i33]Kun Zhou, Berrak Sisman, Haizhou Li:
VAW-GAN for Disentanglement and Recomposition of Emotional Elements in Speech. CoRR abs/2011.02314 (2020) - [i32]Hongqiang Du, Xiaohai Tian, Lei Xie, Haizhou Li:
Optimizing voice conversion network with cycle consistency loss of speaker identity. CoRR abs/2011.08548 (2020) - [i31]Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li:
Multi-stage Speaker Extraction with Utterance and Frame-Level Reference Signals. CoRR abs/2011.09624 (2020) - [i30]Bidisha Sharma, Xiaoxue Gao, Karthika Vijayan, Xiaohai Tian, Haizhou Li:
NHSS: A Speech and Singing Parallel Database. CoRR abs/2012.00337 (2020)
2010 – 2019
- 2019
- [j104]Luis Fernando D'Haro, Rafael E. Banchs, Chiori Hori, Haizhou Li:
Automatic evaluation of end-to-end dialog systems with adequacy-fluency metrics. Comput. Speech Lang. 55: 200-215 (2019) - [j103]Karthika Vijayan, Haizhou Li, Tomoki Toda:
Speech-to-Singing Voice Conversion: The Challenges and Strategies for Improving Vocal Conversion Processes. IEEE Signal Process. Mag. 36(1): 95-102 (2019) - [j102]Berrak Sisman, Mingyang Zhang, Haizhou Li:
Group Sparse Representation With WaveNet Vocoder Adaptation for Spectrum and Prosody Conversion. IEEE ACM Trans. Audio Speech Lang. Process. 27(6): 1085-1097 (2019) - [j101]Qiang Yu, Haizhou Li, Kay Chen Tan:
Spike Timing or Rate? Neurons Learn to Make Decisions for Both Through Threshold-Driven Plasticity. IEEE Trans. Cybern. 49(6): 2178-2189 (2019) - [j100]Chong Zhang, Kay Chen Tan, Haizhou Li, Geok Soon Hong:
A Cost-Sensitive Deep Belief Network for Imbalanced Classification. IEEE Trans. Neural Networks Learn. Syst. 30(1): 109-122 (2019) - [c541]Malu Zhang, Jibin Wu, Yansong Chua, Xiaoling Luo, Zihan Pan, Dan Liu, Haizhou Li:
MPD-AL: An Efficient Membrane Potential Driven Aggregate-Label Learning Algorithm for Spiking Neurons. AAAI 2019: 1327-1334 - [c540]Berrak Sisman, Karthika Vijayan, Minghui Dong, Haizhou Li:
SINGAN: Singing Voice Conversion with Generative Adversarial Networks. APSIPA 2019: 112-118 - [c539]Xiaoxue Gao, Xiaohai Tian, Rohan Kumar Das, Yi Zhou, Haizhou Li:
Speaker-independent Spectral Mapping for Speech-to-Singing Conversion. APSIPA 2019: 159-164 - [c538]Nana Hou, Chenglin Xu, Eng Siong Chng, Haizhou Li:
Domain Adversarial Training for Speech Enhancement. APSIPA 2019: 667-672 - [c537]Yitong Liu, Rohan Kumar Das, Haizhou Li:
Multi-band Spectral Entropy Information for Detection of Replay Attacks. APSIPA 2019: 838-843 - [c536]Karthika Vijayan, K. Sri Rama Murty, Haizhou Li:
Allpass Modeling of Phase Spectrum of Speech Signals for Formant Tracking. APSIPA 2019: 1190-1196 - [c535]Yi Zhou, Xiaohai Tian, Rohan Kumar Das, Haizhou Li:
Many-to-many Cross-lingual Voice Conversion with a Jointly Trained Speaker Embedding Network. APSIPA 2019: 1282-1287 - [c534]Rohan Kumar Das, Jichen Yang, Haizhou Li:
Speaker Clustering with Penalty Distance for Speaker Verification with Multi-Speaker Speech. APSIPA 2019: 1630-1635 - [c533]Berrak Sisman, Mingyang Zhang, Minghui Dong, Haizhou Li:
On the Study of Generative Adversarial Networks for Cross-Lingual Voice Conversion. ASRU 2019: 144-151 - [c532]Hongqiang Du, Xiaohai Tian, Lei Xie, Haizhou Li:
WaveNet Factorization with Singular Value Decomposition for Voice Conversion. ASRU 2019: 152-159 - [c531]Yi Zhou, Xiaohai Tian, Emre Yilmaz, Rohan Kumar Das, Haizhou Li:
A Modularized Neural Network with Language-Specific Output Layers for Cross-Lingual Voice Conversion. ASRU 2019: 160-167 - [c530]Chenglin Xu, Wei Rao, Eng Siong Chng, Haizhou Li:
Time-Domain Speaker Extraction Network. ASRU 2019: 327-334 - [c529]Xianghu Yue, Grandee Lee, Emre Yilmaz, Fang Deng, Haizhou Li:
End-to-End Code-Switching ASR for Low-Resourced Language Pairs. ASRU 2019: 972-979 - [c528]Rohan Kumar Das, Jichen Yang, Haizhou Li:
Long Range Acoustic and Deep Features Perspective on ASVspoof 2019. ASRU 2019: 1018-1025 - [c527]Bidisha Sharma, Chitralekha Gupta, Haizhou Li, Ye Wang:
Automatic Lyrics-to-audio Alignment on Polyphonic Music Using Singing-adapted Acoustic Models. ICASSP 2019: 396-400 - [c526]Buddhi Wickramasinghe, Eliathamby Ambikairajah, Julien Epps, Vidhyasaharan Sethu, Haizhou Li:
Auditory Inspired Spatial Differentiation for Replay Spoofing Attack Detection. ICASSP 2019: 6011-6015 - [c525]Grandee Lee, Haizhou Li:
Word and Class Common Space Embedding for Code-switch Language Modelling. ICASSP 2019: 6086-6090 - [c524]Yi Zhou, Xiaohai Tian, Haihua Xu, Rohan Kumar Das, Haizhou Li:
Cross-lingual Voice Conversion with Bilingual Phonetic Posteriorgram and Average Modeling. ICASSP 2019: 6790-6794 - [c523]Chenglin Xu, Wei Rao, Eng Siong Chng, Haizhou Li:
Optimization of Speaker Extraction Neural Network with Magnitude and Temporal Spectrum Approximation Loss. ICASSP 2019: 6990-6994 - [c522]Zihan Pan, Jibin Wu, Malu Zhang, Haizhou Li, Yansong Chua:
Neural Population Coding for Effective Temporal Classification. IJCNN 2019: 1-8 - [c521]Jibin Wu, Yansong Chua, Malu Zhang, Qu Yang, Guoqi Li, Haizhou Li:
Deep Spiking Neural Network with Spike Count based Learning Rule. IJCNN 2019: 1-6 - [c520]Jibin Wu, Malu Zhang, Haizhou Li, Yansong Chua:
Competitive STDP-based Feature Representation Learning for Sound Event Classification. IJCNN 2019: 1-8 - [c519]Xiaohai Tian, Eng Siong Chng, Haizhou Li:
A Speaker-Dependent WaveNet for Voice Conversion with Non-Parallel Data. INTERSPEECH 2019: 201-205 - [c518]Emre Yilmaz, Adem Derinel, Kun Zhou, Henk van den Heuvel, Niko Brummer, Haizhou Li, David A. van Leeuwen:
Large-Scale Speaker Diarization of Radio Broadcast Archives. INTERSPEECH 2019: 411-415 - [c517]Bidisha Sharma, Haizhou Li:
A Combination of Model-Based and Feature-Based Strategy for Speech-to-Singing Alignment. INTERSPEECH 2019: 624-628 - [c516]Rohan Kumar Das, Jichen Yang, Haizhou Li:
Long Range Acoustic Features for Spoofed Speech Detection. INTERSPEECH 2019: 1058-1062 - [c515]Andros Tjandra, Berrak Sisman, Mingyang Zhang, Sakriani Sakti, Haizhou Li, Satoshi Nakamura:
VQVAE Unsupervised Unit Discovery and Multi-Scale Code2Spec Inverter for Zerospeech Challenge 2019. INTERSPEECH 2019: 1118-1122 - [c514]Wei Rao, Chenglin Xu, Eng Siong Chng, Haizhou Li:
Target Speaker Extraction for Multi-Talker Speaker Verification. INTERSPEECH 2019: 1273-1277 - [c513]Mingyang Zhang, Xin Wang, Fuming Fang, Haizhou Li, Junichi Yamagishi:
Joint Training Framework for Text-to-Speech and Voice Conversion Using Multi-Source Tacotron and WaveNet. INTERSPEECH 2019: 1298-1302 - [c512]Kong Aik Lee, Ville Hautamäki, Tomi H. Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickael Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Héctor Delgado, Massimiliano Todisco:
I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences. INTERSPEECH 2019: 1497-1501 - [c511]Bidisha Sharma, Rohan Kumar Das, Haizhou Li:
Multi-Level Adaptive Speech Activity Detector for Speech in Naturalistic Environments. INTERSPEECH 2019: 2015-2019 - [c510]Bidisha Sharma, Rohan Kumar Das, Haizhou Li:
On the Importance of Audio-Source Separation for Singer Identification in Polyphonic Music. INTERSPEECH 2019: 2020-2024 - [c509]Chitralekha Gupta, Emre Yilmaz, Haizhou Li:
Acoustic Modeling for Automatic Lyrics-to-Audio Alignment. INTERSPEECH 2019: 2040-2044 - [c508]Zhiping Zeng, Yerbolat Khassanov, Van Tung Pham, Haihua Xu, Eng Siong Chng, Haizhou Li:
On the End-to-End Solution to Mandarin-English Code-Switching Speech Recognition. INTERSPEECH 2019: 2165-2169 - [c507]Chitralekha Gupta, Karthika Vijayan, Bidisha Sharma, Xiaoxue Gao, Haizhou Li:
NUS Speak-to-Sing: A Web Platform for Personalized Speech-to-Singing Conversion. INTERSPEECH 2019: 2376-2377 - [c506]Rohan Kumar Das, Haizhou Li:
Instantaneous Phase and Long-Term Acoustic Cues for Orca Activity Detection. INTERSPEECH 2019: 2418-2422 - [c505]Tharshini Gunendradasan, Eliathamby Ambikairajah, Julien Epps, Haizhou Li:
An Adaptive-Q Cochlear Model for Replay Spoofing Detection. INTERSPEECH 2019: 2918-2922 - [c504]Jibin Wu, Zihan Pan, Malu Zhang, Rohan Kumar Das, Yansong Chua, Haizhou Li:
Robust Sound Recognition: A Neuromorphic Approach. INTERSPEECH 2019: 3667-3668 - [c503]Grandee Lee, Xianghu Yue, Haizhou Li:
Linguistically Motivated Parallel Data Augmentation for Code-Switch Language Modeling. INTERSPEECH 2019: 3730-3734 - [c502]Qinyi Wang, Emre Yilmaz, Adem Derinel, Haizhou Li:
Code-Switching Detection Using ASR-Generated Language Posteriors. INTERSPEECH 2019: 3740-3744 - [c501]Emre Yilmaz, Samuel Cohen, Xianghu Yue, David A. van Leeuwen, Haizhou Li:
Multi-Graph Decoding for Code-Switching ASR. INTERSPEECH 2019: 3750-3754 - [c500]Tianchi Liu, Maulik C. Madhavi, Rohan Kumar Das, Haizhou Li:
A Unified Framework for Speaker and Utterance Verification. INTERSPEECH 2019: 4320-4324 - [c499]Maulik C. Madhavi, Tong Zhan, Haizhou Li, Min Yuan:
First Leap Towards Development of Dialogue System for Autonomous Bus. IWSDS 2019: 393-400 - [c498]Haizhou Li:
Country Report - Singapore. O-COCOSDA 2019: 1-6 - [c497]Rohan Sheelvant, Bidisha Sharma, Maulik C. Madhavi, Rohan Kumar Das, S. R. M. Prasanna, Haizhou Li:
RSL2019: A Realistic Speech Localization Corpus. O-COCOSDA 2019: 1-6 - [e16]Luis Fernando D'Haro, Rafael E. Banchs, Haizhou Li:
9th International Workshop on Spoken Dialogue System Technology, IWSDS 2018, Singapore, April 18-20, 2018. Lecture Notes in Electrical Engineering 579, Springer 2019, ISBN 978-981-13-9442-3 [contents] - [i29]Wei Rao, Chenglin Xu, Eng Siong Chng, Haizhou Li:
Target Speaker Extraction for Overlapped Multi-Talker Speaker Verification. CoRR abs/1902.02546 (2019) - [i28]Xiaohai Tian, Eng Siong Chng, Haizhou Li:
A Vocoder-free WaveNet Voice Conversion with Non-Parallel Data. CoRR abs/1902.03705 (2019) - [i27]Jibin Wu, Yansong Chua, Malu Zhang, Qu Yang, Guoqi Li, Haizhou Li:
Deep Spiking Neural Network with Spike Count based Learning Rule. CoRR abs/1902.05705 (2019) - [i26]Chenglin Xu, Wei Rao, Eng Siong Chng, Haizhou Li:
Optimization of Speaker Extraction Neural Network with Magnitude and Temporal Spectrum Approximation Loss. CoRR abs/1903.09952 (2019) - [i25]Mingyang Zhang, Xin Wang, Fuming Fang, Haizhou Li, Junichi Yamagishi:
Joint training framework for text-to-speech and voice conversion using multi-source Tacotron and WaveNet. CoRR abs/1903.12389 (2019) - [i24]Kong Aik Lee, Ville Hautamäki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickael Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Héctor Delgado, Jose Patino, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda, Trung Ngo Trong, Md. Sahidullah, Fan Lu, Yun Tang, Ming Tu, Kah Kuan Teh, Tran Huy Dat, Kuruvachan K. George, Ivan Kukanov, Florent Desnous, Jichen Yang, Emre Yilmaz, Longting Xu, Jean-François Bonastre, Chenglin Xu, Zhi Hao Lim, Eng Siong Chng, Shivesh Ranjan, John H. L. Hansen, Massimiliano Todisco, Nicholas W. D. Evans:
I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences. CoRR abs/1904.07386 (2019) - [i23]Andros Tjandra, Berrak Sisman, Mingyang Zhang, Sakriani Sakti, Haizhou Li, Satoshi Nakamura:
VQVAE Unsupervised Unit Discovery and Multi-scale Code2Spec Inverter for Zerospeech Challenge 2019. CoRR abs/1905.11449 (2019) - [i22]Emre Yilmaz, Samuel Cohen, Xianghu Yue, David A. van Leeuwen, Haizhou Li:
Multi-Graph Decoding for Code-Switching ASR. CoRR abs/1906.07523 (2019) - [i21]Emre Yilmaz, Adem Derinel, Kun Zhou, Henk van den Heuvel, Niko Brummer, Haizhou Li, David A. van Leeuwen:
Large-Scale Speaker Diarization of Radio Broadcast Archives. CoRR abs/1906.07955 (2019) - [i20]Qinyi Wang, Emre Yilmaz, Adem Derinel, Haizhou Li:
Code-Switching Detection Using ASR-Generated Language Posteriors. CoRR abs/1906.08003 (2019) - [i19]Chitralekha Gupta, Emre Yilmaz, Haizhou Li:
Acoustic Modeling for Automatic Lyrics-to-Audio Alignment. CoRR abs/1906.10369 (2019) - [i18]Jibin Wu, Yansong Chua, Malu Zhang, Guoqi Li, Haizhou Li, Kay Chen Tan:
A Hybrid Learning Rule for Efficient and Rapid Inference with Spiking Neural Networks. CoRR abs/1907.01167 (2019) - [i17]Zihan Pan, Yansong Chua, Jibin Wu, Malu Zhang, Haizhou Li, Eliathamby Ambikairajah:
An efficient and perceptually motivated auditory neural encoding and decoding algorithm for spiking neural networks. CoRR abs/1909.01302 (2019) - [i16]Zihan Pan, Jibin Wu, Yansong Chua, Malu Zhang, Haizhou Li:
Neural Population Coding for Effective Temporal Classification. CoRR abs/1909.08018 (2019) - [i15]Chitralekha Gupta, Emre Yilmaz, Haizhou Li:
Automatic Lyrics Transcription in Polyphonic Music: Does Background Music Help? CoRR abs/1909.10200 (2019) - [i14]Xianghu Yue, Grandee Lee, Emre Yilmaz, Fang Deng, Haizhou Li:
End-to-End Code-Switching ASR for Low-Resourced Language Pairs. CoRR abs/1909.12681 (2019) - [i13]Rui Liu, Berrak Sisman, Jingdong Li, Feilong Bao, Guanglai Gao, Haizhou Li:
Teacher-Student Training for Robust Tacotron-based TTS. CoRR abs/1911.02839 (2019) - [i12]Jibin Wu, Emre Yilmaz, Malu Zhang, Haizhou Li, Kay Chen Tan:
Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition. CoRR abs/1911.08373 (2019) - [i11]Van Tung Pham, Haihua Xu, Yerbolat Khassanov, Zhiping Zeng, Eng Siong Chng, Chongjia Ni, Bin Ma, Haizhou Li:
Independent language modeling architecture for end-to-end ASR. CoRR abs/1912.00863 (2019)
- 2018
- [j99]Saad Irtza, Vidhyasaharan Sethu, Eliathamby Ambikairajah, Haizhou Li:
Using language cluster models in hierarchical language identification. Speech Commun. 100: 30-40 (2018) - [j98]Van Tung Pham, Haihua Xu, Xiong Xiao, Nancy F. Chen, Eng Siong Chng, Haizhou Li:
Re-ranking spoken term detection with acoustic exemplars of keywords. Speech Commun. 104: 12-23 (2018) - [j97]Longting Xu, Kong-Aik Lee, Haizhou Li, Zhen Yang:
Generalizing I-Vector Estimation for Rapid Speaker Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 26(4): 749-759 (2018) - [j96]Haizhou Li:
Farewell Editorial. IEEE ACM Trans. Audio Speech Lang. Process. 26(12): 2489 (2018) - [c496]Zhongwei Li, Xuancong Wang, AiTi Aw, Eng Siong Chng, Haizhou Li:
Named-Entity Tagging and Domain adaptation for Better Customized Translation. NEWS@ACL 2018: 41-46 - [c495]Nancy F. Chen, Xiangyu Duan, Min Zhang, Rafael E. Banchs, Haizhou Li:
NEWS 2018 Whitepaper. NEWS@ACL 2018: 47-54 - [c494]Nancy F. Chen, Rafael E. Banchs, Min Zhang, Xiangyu Duan, Haizhou Li:
Report of NEWS 2018 Named Entity Transliteration Shared Task. NEWS@ACL 2018: 55-73 - [c493]Mingyang Zhang, Berrak Sisman, Sai Sirisha Rallabandi, Haizhou Li, Li Zhao:
Error Reduction Network for DBLSTM-based Voice Conversion. APSIPA 2018: 823-828 - [c492]Yanping Li, Kong-Aik Lee, Yougen Yuan, Haizhou Li, Zhen Yang:
Many-to-Many Voice Conversion based on Bottleneck Features with Variational Autoencoder for Non-parallel Training Data. APSIPA 2018: 829-833 - [c491]Chitralekha Gupta, Haizhou Li, Ye Wang:
Automatic Evaluation of Singing Quality without a Reference. APSIPA 2018: 990-997 - [c490]Jichen Yang, Rohan Kumar Das, Haizhou Li:
Extended Constant-Q Cepstral Coefficients for Detection of Spoofing Attacks. APSIPA 2018: 1024-1029 - [c489]Rohan Kumar Das, Haizhou Li:
Instantaneous Phase and Excitation Source Features for Detection of Replay Attacks. APSIPA 2018: 1030-1037 - [c488]Gajan Suthokumar, Kaavya Sriskandaraja, Vidhyasaharan Sethu, Chamith Wijenayake, Eliathamby Ambikairajah, Haizhou Li:
Use of Claimed Speaker Models for Replay Detection. APSIPA 2018: 1038-1046 - [c487]Sarith Fernando, Vidhyasaharan Sethu, Eliathamby Ambikairajah, Haizhou Li:
Second Order Factorized Model Adaptation for Short Duration Language Identification. APSIPA 2018: 1440-1447 - [c486]Rohan Kumar Das, Maulik C. Madhavi, Haizhou Li:
Compensating Utterance Information in Fixed Phrase Speaker Verification. APSIPA 2018: 1708-1712 - [c485]Karthika Vijayan, Xiaoxue Gao, Haizhou Li:
Analysis of Speech and Singing Signals for Temporal Alignment. APSIPA 2018: 1893-1898 - [c484]Jinba Xiao, Shan Yang, Mingyang Zhang, Berrak Sisman, Dongyan Huang, Lei Xie, Minghui Dong, Haizhou Li:
The I2R-NWPU-NUS Text-to-Speech System for Blizzard Challenge 2018. Blizzard Challenge 2018 - [c483]Chenglin Xu, Wei Rao, Xiong Xiao, Eng Siong Chng, Haizhou Li:
Single Channel Speech Separation with Constrained Utterance Level Permutation Invariant Training Using Grid LSTM. ICASSP 2018: 6-10 - [c482]Qing Wang, Wei Rao, Sining Sun, Lei Xie, Eng Siong Chng, Haizhou Li:
Unsupervised Domain Adaptation via Domain Adversarial Training for Speaker Recognition. ICASSP 2018: 4889-4893 - [c481]Karthika Vijayan, Haizhou Li, Hanwu Sun, Kong-Aik Lee:
On the Importance of Analytic Phase of Speech Signals in Spoken Language Recognition. ICASSP 2018: 5194-5198 - [c480]Saad Irtza, Vidhyasaharan Sethu, Eliathamby Ambikairajah, Haizhou Li:
End-to-End Hierarchical Language Identification System. ICASSP 2018: 5199-5203 - [c479]Zihan Pan, Haizhou Li, Jibin Wu, Yansong Chua:
An Event-Based Cochlear Filter Temporal Encoding Scheme for Speech Signals. IJCNN 2018: 1-8 - [c478]Jibin Wu, Yansong Chua, Haizhou Li:
A Biologically Plausible Speech Recognition Framework Based on Spiking Neural Networks. IJCNN 2018: 1-8 - [c477]Berrak Sisman, Haizhou Li:
Wavelet Analysis of Speaker Dependent and Independent Prosody for Voice Conversion. INTERSPEECH 2018: 52-56 - [c476]Yougen Yuan, Cheung-Chi Leung, Lei Xie, Hongjie Chen, Bin Ma, Haizhou Li:
Learning Acoustic Word Embeddings with Temporal Context for Query-by-Example Speech Search. INTERSPEECH 2018: 97-101 - [c475]Haihua Xu, Van Tung Pham, Zin Tun Kyaw, Zhi Hao Lim, Eng Siong Chng, Haizhou Li:
Mandarin-English Code-switching Speech Recognition. INTERSPEECH 2018: 554-555 - [c474]Longting Xu, Kong-Aik Lee, Haizhou Li, Zhen Yang:
Co-whitening of I-vectors for Short and Long Duration Speaker Verification. INTERSPEECH 2018: 1066-1070 - [c473]Chitralekha Gupta, Haizhou Li, Ye Wang:
Automatic Pronunciation Evaluation of Singing. INTERSPEECH 2018: 1507-1511 - [c472]Berrak Sisman, Mingyang Zhang, Haizhou Li:
A Voice Conversion Framework with Tandem Feature Sparse Representation and Speaker-Adapted WaveNet Vocoder. INTERSPEECH 2018: 1978-1982 - [c471]Chenglin Xu, Wei Rao, Eng Siong Chng, Haizhou Li:
A Shifted Delta Coefficient Objective for Monaural Speech Separation Using Multi-task Learning. INTERSPEECH 2018: 3479-3483 - [c470]Chitralekha Gupta, Rong Tong, Haizhou Li, Ye Wang:
Semi-supervised Lyrics and Solo-singing Alignment. ISMIR 2018: 600-607 - [c469]Xiaohai Tian, Junchao Wang, Haihua Xu, Eng Siong Chng, Haizhou Li:
Average Modeling Approach to Voice Conversion with Non-Parallel Data. Odyssey 2018: 227-232 - [c468]Berrak Sisman, Grandee Lee, Haizhou Li:
Phonetically Aware Exemplar-Based Prosody Transformation. Odyssey 2018: 267-274 - [c467]Berrak Sisman, Mingyang Zhang, Sakriani Sakti, Haizhou Li, Satoshi Nakamura:
Adaptive Wavenet Vocoder for Residual Compensation in GAN-Based Voice Conversion. SLT 2018: 282-289 - [c466]Longting Xu, Rohan Kumar Das, Emre Yilmaz, Jichen Yang, Haizhou Li:
Generative X-Vectors for Text-Independent Speaker Verification. SLT 2018: 1014-1020 - [e15]Nancy F. Chen, Rafael E. Banchs, Xiangyu Duan, Min Zhang, Haizhou Li:
Proceedings of the Seventh Named Entities Workshop, NEWS@ACL 2018, Melbourne, Australia, July 20, 2018. Association for Computational Linguistics 2018, ISBN 978-1-948087-37-7 [contents] - [i10]Chong Zhang, Kay Chen Tan, Haizhou Li, Geok Soon Hong:
A Cost-Sensitive Deep Belief Network for Imbalanced Classification. CoRR abs/1804.10801 (2018) - [i9]Chong Zhang, Geok Soon Hong, Jun-Hong Zhou, Kay Chen Tan, Haizhou Li, Huan Xu, Jihoon Hong, Hian-Leng Chan:
A Multi-State Diagnosis and Prognosis Framework with Feature Learning for Tool Condition Monitoring. CoRR abs/1805.00367 (2018) - [i8]Yougen Yuan, Cheung-Chi Leung, Lei Xie, Hongjie Chen, Bin Ma, Haizhou Li:
Learning Acoustic Word Embeddings with Temporal Context for Query-by-Example Speech Search. CoRR abs/1806.03621 (2018) - [i7]Laxmi R. Iyer, Yansong Chua, Haizhou Li:
Is Neuromorphic MNIST neuromorphic? Analyzing the discriminative power of neuromorphic datasets in the time domain. CoRR abs/1807.01013 (2018) - [i6]Longting Xu, Rohan Kumar Das, Emre Yilmaz, Jichen Yang, Haizhou Li:
Generative x-vectors for text-independent speaker verification. CoRR abs/1809.06798 (2018) - [i5]Zhiping Zeng, Yerbolat Khassanov, Van Tung Pham, Haihua Xu, Eng Siong Chng, Haizhou Li:
On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition. CoRR abs/1811.00241 (2018)
- 2017
- [j95]Kaavya Sriskandaraja, Vidhyasaharan Sethu, Eliathamby Ambikairajah, Haizhou Li:
Front-End for Antispoofing Countermeasures in Speaker Verification: Scattering Spectral Decomposition. IEEE J. Sel. Top. Signal Process. 11(4): 632-643 (2017) - [j94]Hongjie Chen, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li:
Multitask Feature Learning for Low-Resource Query-by-Example Spoken Term Detection. IEEE J. Sel. Top. Signal Process. 11(8): 1329-1339 (2017) - [j93]Hongjie Chen, Lei Xie, Cheung-Chi Leung, Xiaoming Lu, Bin Ma, Haizhou Li:
Modeling Latent Topics and Temporal Distance for Story Segmentation of Broadcast News. IEEE ACM Trans. Audio Speech Lang. Process. 25(1): 108-119 (2017) - [j92]Xiaohai Tian, Siu Wa Lee, Zhizheng Wu, Eng Siong Chng, Haizhou Li:
An Exemplar-Based Approach to Frequency Warping for Voice Conversion. IEEE ACM Trans. Audio Speech Lang. Process. 25(10): 1863-1876 (2017) - [c465]Luis Fernando D'Haro, Andreea I. Niculescu, Caixia Cai, Suraj Nair, Rafael E. Banchs, Alois C. Knoll, Haizhou Li:
An integrated framework for multimodal human-robot interaction. APSIPA 2017: 76-82 - [c464]Chitralekha Gupta, Haizhou Li, Ye Wang:
Perceptual evaluation of singing quality. APSIPA 2017: 577-586 - [c463]Nancy F. Chen, Boon Pang Lim, Van Hai Do, Van Tung Pham, Chongjia Ni, Haihua Xu, Mark Hasegawa-Johnson, Wenda Chen, Xiong Xiao, Sunil Sivadas, Eng Siong Chng, Bin Ma, Haizhou Li:
Low-resource spoken keyword search strategies in Georgian inspired by distinctive feature theory. APSIPA 2017: 1322-1327 - [c462]Berrak Sisman, Haizhou Li, Kay Chen Tan:
Transformation of prosody in voice conversion. APSIPA 2017: 1537-1546 - [c461]Karthika Vijayan, Minghui Dong, Haizhou Li:
A dual alignment scheme for improved speech-to-singing voice conversion. APSIPA 2017: 1547-1555 - [c460]Hanwu Sun, Kong-Aik Lee, Trung Hieu Nguyen, Bin Ma, Haizhou Li:
I2R-NUS submission to oriental language recognition AP16-OL7 challenge. APSIPA 2017: 1574-1578 - [c459]Zhiping Zeng, Haihua Xu, Tze Yuang Chong, Eng Siong Chng, Haizhou Li:
Improving N-gram language modeling for code-switching speech recognition. APSIPA 2017: 1596-1601 - [c458]Berrak Sisman, Haizhou Li, Kay Chen Tan:
Sparse representation of phonetic features for voice conversion with and without parallel data. ASRU 2017: 677-684 - [c457]Shan Yang, Lei Xie, Xiao Chen, Xiaoyan Lou, Xuan Zhu, Dongyan Huang, Haizhou Li:
Statistical parametric speech synthesis using generative adversarial networks under a multi-task learning framework. ASRU 2017: 685-691 - [c456]Hongjie Chen, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li:
Multilingual bottle-neck feature learning from untranscribed speech. ASRU 2017: 727-733 - [c455]Yougen Yuan, Cheung-Chi Leung, Lei Xie, Hongjie Chen, Bin Ma, Haizhou Li:
Extracting bottleneck features and word-like pairs from untranscribed speech for feature representation. ASRU 2017: 734-739 - [c454]Chong Zhang, Geok Soon Hong, Huan Xu, Kay Chen Tan, Jun-Hong Zhou, Hian-Leng Chan, Haizhou Li:
A data-driven prognostics framework for tool remaining useful life estimation in tool condition monitoring. ETFA 2017: 1-8 - [c453]Berrak Sisman, Grandee Lee, Haizhou Li, Kay Chen Tan:
On the analysis and evaluation of prosody conversion techniques. IALP 2017: 44-47 - [c452]Nana Hou, Xiaohai Tian, Eng Siong Chng, Bin Ma, Haizhou Li:
Improving air traffic control speech intelligibility by reducing speaking rate effectively. IALP 2017: 197-200 - [c451]Grandee Lee, Thi-Nga Ho, Eng Siong Chng, Haizhou Li:
A review of the Mandarin-English code-switching corpus: SEAME. IALP 2017: 210-213 - [c450]Zhongwei Li, Eng Siong Chng, Haizhou Li:
Named entity transliteration with sequence-to-sequence neural network. IALP 2017: 374-378 - [c449]Xiong Xiao, Shengkui Zhao, Douglas L. Jones, Eng Siong Chng, Haizhou Li:
On time-frequency mask estimation for MVDR beamforming with application in robust speech recognition. ICASSP 2017: 3246-3250 - [c448]Liping Chen, Kong-Aik Lee, Bin Ma, Long Ma, Haizhou Li, Li-Rong Dai:
Adaptation of PLDA for multi-source text-independent speaker verification. ICASSP 2017: 5380-5384 - [c447]Yougen Yuan, Cheung-Chi Leung, Lei Xie, Hongjie Chen, Bin Ma, Haizhou Li:
Pairwise learning using multi-lingual bottleneck features for low-resource query-by-example spoken term detection. ICASSP 2017: 5645-5649 - [c446]Haizhou Li:
ISCA Medal for Scientific Achievement. INTERSPEECH 2017: 1 - [c445]Dong-Yan Huang, Wan Ding, Mingyu Xu, Huaiping Ming, Minghui Dong, Xinguo Yu, Haizhou Li:
Multimodal Prediction of Affective Dimensions via Fusing Multiple Regression Techniques. INTERSPEECH 2017: 162-165 - [c444]Kong-Aik Lee, Ville Hautamäki, Tomi Kinnunen, Anthony Larcher, Chunlei Zhang, Andreas Nautsch, Themos Stafylakis, Gang Liu, Mickaël Rouvier, Wei Rao, Federico Alegre, J. Ma, Man-Wai Mak, Achintya Kumar Sarkar, Héctor Delgado, Rahim Saeidi, Hagai Aronowitz, Aleksandr Sizov, Hanwu Sun, Trung Hieu Nguyen, Guangsen Wang, Bin Ma, Ville Vestman, Md. Sahidullah, M. Halonen, Anssi Kanervisto, Gaël Le Lan, Fahimeh Bahmaninezhad, Sergey Isadskiy, Christian Rathgeb, Christoph Busch, Georgios Tzimiropoulos, Q. Qian, Z. Wang, Q. Zhao, T. Wang, H. Li, J. Xue, S. Zhu, R. Jin, T. Zhao, Pierre-Michel Bousquet, Moez Ajili, Waad Ben Kheder, Driss Matrouf, Zhi Hao Lim, Chenglin Xu, Haihua Xu, Xiong Xiao, Eng Siong Chng, Benoit G. B. Fauve, Kaavya Sriskandaraja, Vidhyasaharan Sethu, W. W. Lin, Dennis Alexander Lehmann Thomsen, Zheng-Hua Tan, Massimiliano Todisco, Nicholas W. D. Evans, Haizhou Li, John H. L. Hansen, Jean-François Bonastre, Eliathamby Ambikairajah:
The I4U Mega Fusion and Collaboration for NIST Speaker Recognition Evaluation 2016. INTERSPEECH 2017: 1328-1332 - [c443]Kong-Aik Lee, Haizhou Li:
Gain Compensation for Fast i-Vector Extraction Over Short Duration. INTERSPEECH 2017: 1527-1531 - [c442]Chenglin Xu, Xiong Xiao, Sining Sun, Wei Rao, Eng Siong Chng, Haizhou Li:
Weighted Spatial Covariance Matrix Estimation for MUSIC Based TDOA Estimation of Speech Source. INTERSPEECH 2017: 1894-1898 - [c441]Saad Irtza, Vidhyasaharan Sethu, Eliathamby Ambikairajah, Haizhou Li:
Investigating Scalability in Hierarchical Language Identification System. INTERSPEECH 2017: 2581-2585 - [c440]Jie Wu, Dong-Yan Huang, Lei Xie, Haizhou Li:
Denoising Recurrent Neural Network for Deep Bidirectional LSTM Based Voice Conversion. INTERSPEECH 2017: 3379-3383 - [i4]Shan Yang, Lei Xie, Xiao Chen, Xiaoyan Lou, Xuan Zhu, Dongyan Huang, Haizhou Li:
Statistical Parametric Speech Synthesis Using Generative Adversarial Networks Under A Multi-task Learning Framework. CoRR abs/1707.01670 (2017)
- 2016
- [j91]Jun Hu, Huajin Tang, Kay Chen Tan, Haizhou Li:
How the Brain Formulates Memory: A Spatio-Temporal Model Research Frontier. IEEE Comput. Intell. Mag. 11(2): 56-68 (2016) - [j90]Xiong Xiao, Shengkui Zhao, Duc Hoang Ha Nguyen, Xionghu Zhong, Douglas L. Jones, Eng Siong Chng, Haizhou Li:
Speech dereverberation for enhancement and recognition using dynamic features constrained deep neural networks and feature adaptation. EURASIP J. Adv. Signal Process. 2016: 4 (2016) - [j89]Zhizheng Wu, Haizhou Li:
On the study of replay and voice conversion attacks to text-dependent speaker verification. Multim. Tools Appl. 75(9): 5311-5327 (2016) - [j88]Nancy F. Chen, Darren Wee, Rong Tong, Bin Ma, Haizhou Li:
Large-scale characterization of non-native Mandarin Chinese spoken by speakers of European origin: Analysis on iCALL. Speech Commun. 84: 46-56 (2016) - [j87]Sven Ewan Shepstone, Kong-Aik Lee, Haizhou Li, Zheng-Hua Tan, Søren Holdt Jensen:
Total Variability Modeling Using Source-Specific Priors. IEEE ACM Trans. Audio Speech Lang. Process. 24(3): 504-517 (2016) - [j86]Duc Hoang Ha Nguyen, Xiong Xiao, Eng Siong Chng, Haizhou Li:
Feature Adaptation Using Linear Spectro-Temporal Transform for Robust Speech Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 24(6): 1006-1019 (2016) - [j85]Qiang Yu, Rui Yan, Huajin Tang, Kay Chen Tan, Haizhou Li:
A Spiking Neural Network System for Robust Sequence Recognition. IEEE Trans. Neural Networks Learn. Syst. 27(3): 621-635 (2016) - [j84]Yuma Ueda, Longbiao Wang, Atsuhiko Kai, Xiong Xiao, Engsiong Chng, Haizhou Li:
Single-channel Dereverberation for Distant-Talking Speech Recognition by Combining Denoising Autoencoder and Temporal Structure Normalization. J. Signal Process. Syst. 82(2): 151-161 (2016) - [j83]Liping Chen, Kong-Aik Lee, Bin Ma, Wu Guo, Haizhou Li, Li-Rong Dai:
Exploration of Local Variability in Text-Independent Speaker Verification. J. Signal Process. Syst. 82(2): 217-228 (2016) - [c439]Seokhwan Kim, Rafael E. Banchs, Haizhou Li:
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling for Dialogue Topic Tracking. ACL (1) 2016 - [c438]Ridong Jiang, Rafael E. Banchs, Haizhou Li:
Evaluating and Combining Name Entity Recognition Systems. NEWS@ACL 2016: 21-27 - [c437]Xiangyu Duan, Min Zhang, Haizhou Li, Rafael E. Banchs, A. Kumaran:
Whitepaper of NEWS 2016 Shared Task on Machine Transliteration. NEWS@ACL 2016: 49-57 - [c436]Xiangyu Duan, Rafael E. Banchs, Min Zhang, Haizhou Li, A. Kumaran:
Report of NEWS 2016 Machine Transliteration Shared Task. NEWS@ACL 2016: 58-72 - [c435]Nancy F. Chen, Haizhou Li:
Computer-assisted pronunciation training: From pronunciation scoring towards spoken language learning. APSIPA 2016: 1-7 - [c434]Xiaohai Tian, Xiong Xiao, Eng Siong Chng, Haizhou Li:
Spoofing speech detection using temporal convolutional neural network. APSIPA 2016: 1-6 - [c433]Xiong Xiao, Shinji Watanabe, Eng Siong Chng, Haizhou Li:
Beamforming networks using spatial covariance features for far-field speech recognition. APSIPA 2016: 1-6 - [c432]Haihua Xu, Wei Rao, Xiong Xiao, Hao Huang, Eng Siong Chng, Haizhou Li:
I-vector based deep neural network acoustic model adaptation using multilingual language resource. APSIPA 2016: 1-5 - [c431]Xiaohai Tian, Zhizheng Wu, Xiong Xiao, Eng Siong Chng, Haizhou Li:
Spoofing detection from a feature representation perspective. ICASSP 2016: 2119-2123 - [c430]Huaiping Ming, Dong-Yan Huang, Lei Xie, Shaofei Zhang, Minghui Dong, Haizhou Li:
Exemplar-based sparse representation of timbre and prosody for voice conversion. ICASSP 2016: 5175-5179 - [c429]Liping Chen, Kong-Aik Lee, Eng Siong Chng, Bin Ma, Haizhou Li, Li-Rong Dai:
Content-aware local variability vector for speaker verification with short utterance. ICASSP 2016: 5485-5489 - [c428]Saad Irtza, Vidhyasaharan Sethu, Haris Bavattichalil, Eliathamby Ambikairajah, Haizhou Li:
A hierarchical framework for language identification. ICASSP 2016: 5820-5824 - [c427]Chongjia Ni, Cheung-Chi Leung, Lei Wang, Haibo Liu, Feng Rao, Li Lu, Nancy F. Chen, Bin Ma, Haizhou Li:
Cross-lingual deep neural network based submodular unbiased data selection for low-resource keyword search. ICASSP 2016: 6015-6019 - [c426]Haihua Xu, Jingyong Hou, Xiong Xiao, Van Tung Pham, Cheung-Chi Leung, Lei Wang, Van Hai Do, Hang Lv, Lei Xie, Bin Ma, Eng Siong Chng, Haizhou Li:
Approximate search of audio queries by using DTW with phone time boundary and data augmentation. ICASSP 2016: 6030-6034 - [c425]Van Tung Pham, Haihua Xu, Xiong Xiao, Nancy F. Chen, Eng Siong Chng, Haizhou Li:
Keyword search using query expansion for graph-based rescoring of hypothesized detections. ICASSP 2016: 6035-6039 - [c424]Nancy F. Chen, Van Tung Pham, Haihua Xu, Xiong Xiao, Van Hai Do, Chongjia Ni, I-Fan Chen, Sunil Sivadas, Chin-Hui Lee, Eng Siong Chng, Bin Ma, Haizhou Li:
Exemplar-inspired strategies for low-resource spoken keyword search in Swahili. ICASSP 2016: 6040-6044 - [c423]Xiong Xiao, Shengkui Zhao, Thi Ngoc Tho Nguyen, Douglas L. Jones, Eng Siong Chng, Haizhou Li:
An expectation-maximization eigenvector clustering approach to direction of arrival estimation of multiple speech sources. ICASSP 2016: 6330-6334 - [c422]Dong-Yan Huang, Minghui Dong, Haizhou Li:
Combining multiple kernel models for automatic intelligibility detection of pathological speech. ICASSP 2016: 6485-6489 - [c421]Wan Ding, Mingyu Xu, Dong-Yan Huang, Weisi Lin, Minghui Dong, Xinguo Yu, Haizhou Li:
Audio and face video emotion recognition in the wild using deep neural networks and small datasets. ICMI 2016: 506-513 - [c420]Yougen Yuan, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li:
Learning Neural Network Representations Using Cross-Lingual Bottleneck Features with Word-Pair Information. INTERSPEECH 2016: 788-792 - [c419]Hongjie Chen, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li:
Unsupervised Bottleneck Features for Low-Resource Query-by-Example Spoken Term Detection. INTERSPEECH 2016: 923-927 - [c418]Van Tung Pham, Haihua Xu, Xiong Xiao, Nancy F. Chen, Eng Siong Chng, Haizhou Li:
Rescoring Hypothesized Detections of Out-of-Vocabulary Keywords Using Subword Samples. INTERSPEECH 2016: 933-937 - [c417]Paul Yaozhu Chan, Minghui Dong, Grace Xue Hui Ho, Haizhou Li:
SERAPHIM: A Wavetable Synthesis System with 3D Lip Animation for Real-Time Speech and Singing Applications on Mobile Platforms. INTERSPEECH 2016: 1225-1229 - [c416]Haihua Xu, Hang Su, Chongjia Ni, Xiong Xiao, Hao Huang, Eng Siong Chng, Haizhou Li:
Semi-Supervised and Cross-Lingual Knowledge Transfer Learnings for DNN Hybrid Acoustic Models Under Low-Resource Conditions. INTERSPEECH 2016: 1315-1319 - [c415]Jia Yu, Xiong Xiao, Lei Xie, Eng Siong Chng, Haizhou Li:
A DNN-HMM Approach to Story Segmentation. INTERSPEECH 2016: 1527-1531 - [c414]Nancy F. Chen, Rong Tong, Darren Wee, Pei Xuan Lee, Bin Ma, Haizhou Li:
SingaKids-Mandarin: Speech Corpus of Singaporean Children Speaking Mandarin Chinese. INTERSPEECH 2016: 1545-1549 - [c413]Xiaohai Tian, Zhizheng Wu, Xiong Xiao, Eng Siong Chng, Haizhou Li:
An Investigation of Spoofing Speech Detection Under Additive Noise and Reverberant Conditions. INTERSPEECH 2016: 1715-1719 - [c412]Paul Yaozhu Chan, Minghui Dong, Grace Xue Hui Ho, Haizhou Li:
SERAPHIM Live! - Singing Synthesis for the Performer, the Composer, and the 3D Game Developer. INTERSPEECH 2016: 1966-1967 - [c411]Huaiping Ming, Dong-Yan Huang, Lei Xie, Jie Wu, Minghui Dong, Haizhou Li:
Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion. INTERSPEECH 2016: 2453-2457 - [c410]Rong Tong, Nancy F. Chen, Bin Ma, Haizhou Li:
Context Aware Mispronunciation Detection for Mandarin Pronunciation Training. INTERSPEECH 2016: 3112-3116 - [c409]Kong-Aik Lee, Haizhou Li, Li Deng, Ville Hautamäki, Wei Rao, Xiong Xiao, Anthony Larcher, Hanwu Sun, Trung Hieu Nguyen, Guangsen Wang, Aleksandr Sizov, Jianshu Chen, Ivan Kukanov, Amir Hossein Poorjam, Trung Ngo Trong, Chenglin Xu, Haihua Xu, Bin Ma, Eng Siong Chng, Sylvain Meignier:
The 2015 NIST Language Recognition Evaluation: The Shared View of I2R, Fantastic4 and SingaMS. INTERSPEECH 2016: 3211-3215 - [c408]Saad Irtza, Vidhyasaharan Sethu, Sarith Fernando, Eliathamby Ambikairajah, Haizhou Li:
Out of Set Language Modelling in Hierarchical Language Identification. INTERSPEECH 2016: 3270-3274 - [c407]Chongjia Ni, Lei Wang, Cheung-Chi Leung, Feng Rao, Li Lu, Bin Ma, Haizhou Li:
Rapid Update of Multilingual Deep Neural Network for Low-Resource Keyword Search. INTERSPEECH 2016: 3698-3702 - [c406]Cheung-Chi Leung, Lei Wang, Haihua Xu, Jingyong Hou, Van Tung Pham, Hang Lv, Lei Xie, Xiong Xiao, Chongjia Ni, Bin Ma, Eng Siong Chng, Haizhou Li:
Toward High-Performance Language-Independent Query-by-Example Spoken Term Detection for MediaEval 2015: Post-Evaluation Analysis. INTERSPEECH 2016: 3703-3707 - [c405]Wei Rao, Xiong Xiao, Chenglin Xu, Haihua Xu, Kong-Aik Lee, Eng Siong Chng, Haizhou Li:
Neural networks based channel compensation for i-vector speaker verification. ISCSLP 2016: 1-5 - [c404]Zhaofeng Zhang, Xiong Xiao, Longbiao Wang, Jianwu Dang, Masahiro Iwahashi, Eng Siong Chng, Haizhou Li:
Multi-channel feature adaptation for robust speech recognition. ISCSLP 2016: 1-5 - [c403]Lei Wang, Chongjia Ni, Cheung-Chi Leung, Changhuai You, Lei Xie, Haihua Xu, Xiong Xiao, Tin Lay Nwe, Eng Siong Chng, Bin Ma, Haizhou Li:
The NNI Vietnamese Speech Recognition System for MediaEval 2016. MediaEval 2016 - [c402]Haizhou Li:
Voice conversion and spoofing countermeasures for speaker verification. Odyssey 2016 - [c401]Longting Xu, Kong-Aik Lee, Haizhou Li, Zhen Yang:
Rapid Computation of I-vector. Odyssey 2016: 47-52 - [c400]Hanwu Sun, Trung Hieu Nguyen, Guangsen Wang, Kong-Aik Lee, Bin Ma, Haizhou Li:
I2R Submission to the 2015 NIST Language Recognition I-vector Challenge. Odyssey 2016: 311-318 - [c399]Dong-Yan Huang, Lei Xie, Yvonne Siu Wa Lee, Jie Wu, Huaiping Ming, Xiaohai Tian, Shaofei Zhang, Chuang Ding, Mei Li, Nguyen Quy Hy, Minghui Dong, Haizhou Li:
An Automatic Voice Conversion Evaluation Strategy Based on Perceptual Background Noise Distortion and Speaker Similarity. SSW 2016: 44-51 - [e14]Xiangyu Duan, Rafael E. Banchs, Min Zhang, Haizhou Li, A. Kumaran:
Proceedings of the Sixth Named Entity Workshop, NEWS@ACL 2016, Berlin, Germany, August 12, 2016. Association for Computational Linguistics 2016, ISBN 978-1-945626-16-6 [contents] - [e13]Minghui Dong, Yuen-Hsien Tseng, Yanfeng Lu, Liang-Chih Yu, Lung-Hao Lee, Chung-Hsien Wu, Haizhou Li:
2016 International Conference on Asian Language Processing, IALP 2016, Tainan, Taiwan, November 21-23, 2016. IEEE 2016, ISBN 978-1-5090-0922-0 [contents] - [i3]Kong-Aik Lee, Ville Hautamäki, Anthony Larcher, Wei Rao, Hanwu Sun, Trung Hieu Nguyen, Guangsen Wang, Aleksandr Sizov, Ivan Kukanov, Amir Hossein Poorjam, Trung Ngo Trong, Xiong Xiao, Chenglin Xu, Haihua Xu, Bin Ma, Haizhou Li, Sylvain Meignier:
Fantastic 4 system for NIST 2015 Language Recognition Evaluation. CoRR abs/1602.01929 (2016) - [i2]Xiaohai Tian, Zhizheng Wu, Xiong Xiao, Eng Siong Chng, Haizhou Li:
Spoofing detection under noisy conditions: a preliminary investigation and an initial database. CoRR abs/1602.02950 (2016) - [i1]Zhaofeng Zhang, Xiong Xiao, Longbiao Wang, Eng Siong Chng, Haizhou Li:
Noise Robust Speech Recognition Using Multi-Channel Based Channel Selection And Channel Weighting. CoRR abs/1604.03276 (2016)
- 2015
- [j82]Chang Huai You, Haizhou Li, Kong-Aik Lee:
Relevance factor of maximum a posteriori adaptation for GMM-NAP-SVM in speaker and language recognition. Comput. Speech Lang. 30(1): 116-134 (2015) - [j81]Liyuan Li, Qianli Xu, Gang S. Wang, Xinguo Yu, Yeow Kee Tan, Haizhou Li:
Visual Perception Based Engagement Awareness for Multiparty Human-Robot Interaction. Int. J. Humanoid Robotics 12(4): 1550019:1-1550019:28 (2015) - [j80]Van Hai Do, Xiong Xiao, Engsiong Chng, Haizhou Li:
Context-dependent Phone Mapping for Acoustic Modeling of Under-resourced Languages. Int. J. Asian Lang. Process. 23(1): 21-33 (2015) - [j79]Dau-Cheng Lyu, Tien Ping Tan, Engsiong Chng, Haizhou Li:
Mandarin-English code-switching speech corpus in South-East Asia: SEAME. Lang. Resour. Evaluation 49(3): 581-600 (2015) - [j78]Zhizheng Wu, Engsiong Chng, Haizhou Li:
Exemplar-based voice conversion using joint nonnegative matrix factorization. Multim. Tools Appl. 74(22): 9943-9958 (2015) - [j77]Zhizheng Wu, Nicholas W. D. Evans, Tomi Kinnunen, Junichi Yamagishi, Federico Alegre, Haizhou Li:
Spoofing and countermeasures for speaker verification: A survey. Speech Commun. 66: 130-153 (2015) - [j76]Liping Chen, Kong-Aik Lee, Li-Rong Dai, Haizhou Li:
Quasi-Factorial Prior for i-vector Extraction. IEEE Signal Process. Lett. 22(12): 2484-2488 (2015) - [j75]Haipeng Wang, Tan Lee, Cheung-Chi Leung, Bin Ma, Haizhou Li:
Acoustic Segment Modeling with Spectral Clustering Methods. IEEE ACM Trans. Audio Speech Lang. Process. 23(2): 264-277 (2015) - [j74]Haizhou Li, Marcello Federico, Xiaodong He, Helen M. Meng, Isabel Trancoso:
Introduction to the Special Section on Continuous Space and Related Methods in Natural Language Processing. IEEE ACM Trans. Audio Speech Lang. Process. 23(3): 427-430 (2015) - [j73]Rafael E. Banchs, Luis F. D'Haro, Haizhou Li:
Adequacy-Fluency Metrics: Evaluating MT in the Continuous Space Model Framework. IEEE ACM Trans. Audio Speech Lang. Process. 23(3): 472-482 (2015) - [j72]Tze Yuang Chong, Rafael E. Banchs, Engsiong Chng, Haizhou Li:
Decoupling Word-Pair Distance and Co-occurrence Information for Effective Long History Context Language Modeling. IEEE ACM Trans. Audio Speech Lang. Process. 23(7): 1221-1232 (2015) - [j71]Jonathan William Dennis, Tran Huy Dat, Haizhou Li:
Generalized Hough Transform for Speech Pattern Classification. IEEE ACM Trans. Audio Speech Lang. Process. 23(11): 1963-1972 (2015) - [c398]Luis Fernando D'Haro, Seokhwan Kim, Kheng Hui Yeo, Ridong Jiang, Andreea I. Niculescu, Rafael E. Banchs, Haizhou Li:
CLARA: A Multifunctional Virtual Agent for Conference Support and Touristic Information. IWSDS 2015: 233-239 - [c397]Miaolong Yuan, Bo Tian, Vui Ann Shim, Huajin Tang, Haizhou Li:
An Entorhinal-Hippocampal Model for Simultaneous Cognitive Map Building. AAAI 2015: 586-592 - [c396]Huaiping Ming, Dong-Yan Huang, Minghui Dong, Haizhou Li, Lei Xie, Shaofei Zhang:
Fundamental frequency modeling using wavelets for emotional voice conversion. ACII 2015: 804-809 - [c395]Min Zhang, Haizhou Li, Rafael E. Banchs, A. Kumaran:
Whitepaper of NEWS 2015 Shared Task on Machine Transliteration. NEWS@ACL 2015: 1-9 - [c394]Rafael E. Banchs, Min Zhang, Xiangyu Duan, Haizhou Li, A. Kumaran:
Report of NEWS 2015 Machine Transliteration Shared Task. NEWS@ACL 2015: 10-23 - [c393]Van Hai Do, Xiong Xiao, Eng Siong Chng, Haizhou Li:
Distance metric learning for kernel density-based acoustic model under limited training data conditions. APSIPA 2015: 54-58 - [c392]Jia Yu, Lei Xie, Xiong Xiao, Eng Siong Chng, Haizhou Li:
A density peak clustering approach to unsupervised acoustic subword units discovery. APSIPA 2015: 178-183 - [c391]Shaofei Zhang, Dong-Yan Huang, Lei Xie, Eng Siong Chng, Haizhou Li, Minghui Dong:
Non-negative matrix factorization using stable alternating direction method of multipliers for source separation. APSIPA 2015: 222-228 - [c390]Van Tung Pham, Haihua Xu, Van Hai Do, Tze Yuang Chong, Xiong Xiao, Eng Siong Chng, Haizhou Li:
On the study of very low-resource language keyword search. APSIPA 2015: 358-364 - [c389]Minghui Dong, Chenyu Yang, Yanfeng Lu, Jochen Walter Ehnes, Dong-Yan Huang, Huaiping Ming, Rong Tong, Siu Wa Lee, Haizhou Li:
Mapping frames with DNN-HMM recognizer for non-parallel voice conversion. APSIPA 2015: 488-494 - [c388]Van Hai Do, Xiong Xiao, Haihua Xu, Eng Siong Chng, Haizhou Li:
Multilingual exemplar-based acoustic model for the NIST Open KWS 2015 evaluation. APSIPA 2015: 594-598 - [c387]Shengkui Zhao, Xiong Xiao, Zhaofeng Zhang, Thi Ngoc Tho Nguyen, Xionghu Zhong, Bo Ren, Longbiao Wang, Douglas L. Jones, Engsiong Chng, Haizhou Li:
Robust speech recognition using beamforming with adaptive microphone gains and multichannel noise reduction. ASRU 2015: 460-467 - [c386]Haihua Xu, Xiong Xiao, Engsiong Chng, Haizhou Li:
On statistical machine translation method for lexicon refinement in speech recognition. ChinaSIP 2015: 25-29 - [c385]Xiaohai Tian, Steven Du, Xiong Xiao, Haihua Xu, Engsiong Chng, Haizhou Li:
Detecting synthetic speech using long term magnitude and phase information. ChinaSIP 2015: 611-615 - [c384]Seokhwan Kim, Rafael E. Banchs, Haizhou Li:
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constraints from Wikipedia. EMNLP 2015: 2225-2229 - [c383]Kui Wu, Xuancong Wang, Nina Zhou, AiTi Aw, Haizhou Li:
Joint Chinese word segmentation and punctuation prediction using deep recurrent neural network for social media data. IALP 2015: 41-44 - [c382]Gillian Chua, Qian Ci Chang, Ye Won Park, Paul Yaozhu Chan, Minghui Dong, Haizhou Li:
The expression of singing emotion - contradicting the constraints of song. IALP 2015: 98-102 - [c381]Yang Yu, Weisi Lin, Dong-Yan Huang, Minghui Dong, Haizhou Li:
Performance scoring of singing voice. IALP 2015: 119-122 - [c380]Ridong Jiang, Seokhwan Kim, Rafael E. Banchs, Haizhou Li:
Towards improving the performance of Vector Space Model for Chinese Frequently Asked Question Answering. IALP 2015: 136-139 - [c379]Jonathan William Dennis, Tran Huy Dat, Haizhou Li:
Combining robust spike coding with spiking neural networks for sound event classification. ICASSP 2015: 176-180 - [c378]Xiong Xiao, Shengkui Zhao, Xionghu Zhong, Douglas L. Jones, Engsiong Chng, Haizhou Li:
A learning-based approach to direction of arrival estimation in noisy and reverberant environments. ICASSP 2015: 2814-2818 - [c377]Sven Ewan Shepstone, Kong-Aik Lee, Haizhou Li, Zheng-Hua Tan, Søren Holdt Jensen:
Source-specific informative prior for i-vector extraction. ICASSP 2015: 4185-4189 - [c376]Haihua Xu, Peng Yang, Xiong Xiao, Lei Xie, Cheung-Chi Leung, Hongjie Chen, Jia Yu, Hang Lv, Lei Wang, Su Jun Leow, Bin Ma, Engsiong Chng, Haizhou Li:
Language independent query-by-example spoken term detection using N-best phone sequences and partial matching. ICASSP 2015: 5191-5195 - [c375]Liping Chen, Kong-Aik Lee, Bin Ma, Wu Guo, Haizhou Li, Li-Rong Dai:
Channel adaptation of PLDA for text-independent speaker verification. ICASSP 2015: 5251-5255 - [c374]Rong Tong, Nancy F. Chen, Boon Pang Lim, Bin Ma, Haizhou Li:
Tokenizing fundamental frequency variation for Mandarin tone error detection. ICASSP 2015: 5361-5365 - [c373]Nancy F. Chen, Chongjia Ni, I-Fan Chen, Sunil Sivadas, Van Tung Pham, Haihua Xu, Xiong Xiao, Tze Siong Lau, Su Jun Leow, Boon Pang Lim, Cheung-Chi Leung, Lei Wang, Chin-Hui Lee, Alvina Goh, Engsiong Chng, Bin Ma, Haizhou Li:
Low-resource keyword search strategies for Tamil. ICASSP 2015: 5366-5370 - [c372]Paul Yaozhu Chan, Minghui Dong, Yi Qian Lim, Ashleigh Toh, Elliot Chong, Mantita Yeo, Megan Chua, Haizhou Li:
Formant excursion in singing synthesis. DSP 2015: 168-172 - [c371]Liping Chen, Kong-Aik Lee, Bin Ma, Wu Guo, Haizhou Li, Li-Rong Dai:
Phone-centric local variability vector for text-constrained speaker verification. INTERSPEECH 2015: 229-233 - [c370]Nancy F. Chen, Rong Tong, Darren Wee, Pei Xuan Lee, Bin Ma, Haizhou Li:
iCALL corpus: Mandarin Chinese spoken by non-native speakers of European descent. INTERSPEECH 2015: 324-328 - [c369]Rong Tong, Nancy F. Chen, Bin Ma, Haizhou Li:
Goodness of tone (GOT) for non-native Mandarin tone recognition. INTERSPEECH 2015: 801-805 - [c368]Saad Irtza, Vidhyasaharan Sethu, Phu Ngoc Le, Eliathamby Ambikairajah, Haizhou Li:
Phonemes frequency based PLLR dimensionality reduction for language recognition. INTERSPEECH 2015: 997-1001 - [c367]Longting Xu, Kong-Aik Lee, Haizhou Li, Zhen Yang:
Sparse coding of total variability matrix. INTERSPEECH 2015: 1022-1026 - [c366]Tze Yuang Chong, Rafael E. Banchs, Engsiong Chng, Haizhou Li:
TDTO language modeling with feedforward neural networks. INTERSPEECH 2015: 1458-1462 - [c365]Shaofei Zhang, Dong-Yan Huang, Lei Xie, Engsiong Chng, Haizhou Li, Minghui Dong:
Regularized non-negative matrix factorization using alternating direction method of multipliers and its application to source separation. INTERSPEECH 2015: 1498-1502 - [c364]Jonathan William Dennis, Tran Huy Dat, Haizhou Li:
Spiking neural networks and the generalised Hough transform for speech pattern detection. INTERSPEECH 2015: 1997-2001 - [c363]Xiong Xiao, Xiaohai Tian, Steven Du, Haihua Xu, Engsiong Chng, Haizhou Li:
Spoofing speech detection using high dimensional magnitude and phase features: the NTU approach for ASVspoof 2015 challenge. INTERSPEECH 2015: 2052-2056 - [c362]Kong-Aik Lee, Guangsen Wang, Kam Pheng Ng, Hanwu Sun, Trung Hieu Nguyen, Ngoc Thuy Huong Thai, Bin Ma, Haizhou Li:
The reddots platform for mobile crowd-sourcing of speech data. INTERSPEECH 2015: 2603-2604 - [c361]Dong-Yan Huang, Minghui Dong, Haizhou Li:
A real-time variable-q non-stationary Gabor transform for pitch shifting. INTERSPEECH 2015: 2744-2748 - [c360]Kong-Aik Lee, Anthony Larcher, Guangsen Wang, Patrick Kenny, Niko Brümmer, David A. van Leeuwen, Hagai Aronowitz, Marcel Kockmann, Carlos Vaquero, Bin Ma, Haizhou Li, Themos Stafylakis, Md. Jahangir Alam, Albert Swart, Javier Perez:
The reddots data collection for speaker recognition. INTERSPEECH 2015: 2996-3000 - [c359]Hongjie Chen, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li:
Parallel inference of Dirichlet process Gaussian mixture models for unsupervised acoustic modeling: a feasibility study. INTERSPEECH 2015: 3189-3193 - [c358]Huaiping Ming, Dong-Yan Huang, Lei Xie, Haizhou Li, Minghui Dong:
An alternating optimization approach for phase retrieval. INTERSPEECH 2015: 3426-3430 - [c357]Xiong Xiao, Shengkui Zhao, Xionghu Zhong, Douglas L. Jones, Engsiong Chng, Haizhou Li:
Learning to estimate reverberation time in noisy and reverberant rooms. INTERSPEECH 2015: 3431-3435 - [c356]Hoang Gia Ngo, Nancy F. Chen, Binh Minh Nguyen, Bin Ma, Haizhou Li:
Phonology-augmented statistical transliteration for low-resource languages. INTERSPEECH 2015: 3670-3674 - [c355]Jingyong Hou, Van Tung Pham, Cheung-Chi Leung, Lei Wang, Haihua Xu, Hang Lv, Lei Xie, Zhonghua Fu, Chongjia Ni, Xiong Xiao, Hongjie Chen, Shaofei Zhang, Sining Sun, Yougen Yuan, Pengcheng Li, Tin Lay Nwe, Sunil Sivadas, Bin Ma, Engsiong Chng, Haizhou Li:
The NNI Query-by-Example System for MediaEval 2015. MediaEval 2015 - [c354]Sheng Gao, Haizhou Li:
Octave-dependent Probabilistic Latent Semantic Analysis to Chorus Detection of Popular Song. ACM Multimedia 2015: 979-982 - [c353]Sheng Gao, Haizhou Li:
Popular song summarization using chorus section detection from audio signal. MMSP 2015: 1-6 - [c352]Seokhwan Kim, Rafael E. Banchs, Haizhou Li:
Towards Improving Dialogue Topic Tracking Performances with Wikification of Concept Mentions. SIGDIAL Conference 2015: 124-128 - [p1]Linhong Zhu, Sheng Gao, Sinno Jialin Pan, Haizhou Li, Dingxiong Deng, Cyrus Shahabi:
The Pareto Principle Is Everywhere: Finding Informative Sentences for Opinion Summarization Through Leader Detection. Recommendation and Search in Social Networks 2015: 165-187 - [e12]Xiangyu Duan, Rafael E. Banchs, Min Zhang, Haizhou Li, A. Kumaran:
Proceedings of the Fifth Named Entity Workshop, NEWS@ACL 2015, Beijing, China, July 31, 2015. Association for Computational Linguistics 2015, ISBN 978-1-941643-65-5 [contents] - 2014
- [j70]Van Hai Do, Xiong Xiao, Engsiong Chng, Haizhou Li:
Cross-Lingual Phone Mapping for Large Vocabulary Speech Recognition of Under-Resourced Languages. IEICE Trans. Inf. Syst. 97-D(2): 285-295 (2014) - [j69]Anthony Larcher, Kong-Aik Lee, Bin Ma, Haizhou Li:
Text-dependent speaker verification: Classifiers, databases and RSR2015. Speech Commun. 60: 56-77 (2014) - [j68]Zhizheng Wu, Tuomas Virtanen, Engsiong Chng, Haizhou Li:
Exemplar-Based Sparse Representation With Residual Compensation for Voice Conversion. IEEE ACM Trans. Audio Speech Lang. Process. 22(10): 1506-1521 (2014) - [j67]Miaolong Yuan, Huajin Tang, Haizhou Li:
Real-Time Keypoint Recognition Using Restricted Boltzmann Machine. IEEE Trans. Neural Networks Learn. Syst. 25(11): 2119-2126 (2014) - [c351]Seokhwan Kim, Rafael E. Banchs, Haizhou Li:
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain Knowledge from Wikipedia. ACL (2) 2014: 19-23 - [c350]Dong-Yan Huang, Haizhou Li, Minghui Dong:
Ensemble Nyström method for predicting conflict level from speech. APSIPA 2014: 1-5 - [c349]Guangpu Huang, Chenglin Xu, Xiong Xiao, Lei Xie, Chng Eng Siong, Haizhou Li:
Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news. APSIPA 2014: 1-9 - [c348]Shuojun Liu, Dong-Yan Huang, Weisi Lin, Minghui Dong, Haizhou Li, Ee Ping Ong:
Emotional facial expression transfer based on temporal restricted Boltzmann machines. APSIPA 2014: 1-7 - [c347]Zhizheng Wu, Sheng Gao, Engsiong Chng, Haizhou Li:
A study on replay attack and anti-spoofing for text-dependent speaker verification. APSIPA 2014: 1-5 - [c346]Haihua Xu, Van Tung Pham, Engsiong Chng, Haizhou Li:
Towards better keyword search performance on Malay broadcast news data. APSIPA 2014: 1-5 - [c345]Huaiping Ming, Dong-Yan Huang, Lei Xie, Haizhou Li:
Learning optimal features for music transcription. ChinaSIP 2014: 105-109 - [c344]Seokhwan Kim, Rafael E. Banchs, Haizhou Li:
Wikipedia-based Kernels for dialogue topic tracking. ICASSP 2014: 131-135 - [c343]Anthony Larcher, Kong-Aik Lee, Bin Ma, Haizhou Li:
Modelling the alternative hypothesis for text-dependent speaker verification. ICASSP 2014: 734-738 - [c342]Anthony Larcher, Kong-Aik Lee, Bin Ma, Haizhou Li:
Imposture classification for text-dependent speaker verification. ICASSP 2014: 739-743 - [c341]Xiong Xiao, Jinyu Li, Engsiong Chng, Haizhou Li:
Feature compensation using linear combination of speaker and environment dependent correction vectors. ICASSP 2014: 1720-1724 - [c340]Duc Hoang Ha Nguyen, Xiong Xiao, Engsiong Chng, Haizhou Li:
Generalization of temporal filter and linear transformation for robust speech recognition. ICASSP 2014: 1730-1734 - [c339]Jonathan William Dennis, Tran Huy Dat, Haizhou Li, Engsiong Chng:
A discriminatively trained Hough Transform for frame-level phoneme recognition. ICASSP 2014: 2514-2518 - [c338]Dong-Yan Huang, Minghui Dong, Haizhou Li:
Intelligibility detection of pathological speech using asymmetric sparse kernel partial least squares classifier. ICASSP 2014: 3744-3748 - [c337]Liping Chen, Kong-Aik Lee, Bin Ma, Wu Guo, Haizhou Li, Li-Rong Dai:
Minimum divergence estimation of speaker prior in multi-session PLDA scoring. ICASSP 2014: 4007-4011 - [c336]Nancy F. Chen, Sunil Sivadas, Boon Pang Lim, Hoang Gia Ngo, Haihua Xu, Van Tung Pham, Bin Ma, Haizhou Li:
Strategies for Vietnamese keyword search. ICASSP 2014: 4121-4125 - [c335]Tze Yuang Chong, Rafael E. Banchs, Engsiong Chng, Haizhou Li:
Improving language modeling by using distance and co-occurrence information of word-pairs and its application to LVCSR. ICASSP 2014: 4883-4887 - [c334]Rong Tong, Boon Pang Lim, Nancy F. Chen, Bin Ma, Haizhou Li:
Subspace Gaussian mixture model for computer-assisted language learning. ICASSP 2014: 5347-5351 - [c333]Van Tung Pham, Haihua Xu, Nancy F. Chen, Sunil Sivadas, Boon Pang Lim, Engsiong Chng, Haizhou Li:
Discriminative score normalization for keyword search decision. ICASSP 2014: 7078-7082 - [c332]Van Hai Do, Xiong Xiao, Chng Eng Siong, Haizhou Li:
Kernel density-based acoustic model with cross-lingual bottleneck features for resource limited LVCSR. INTERSPEECH 2014: 6-10 - [c331]Haipeng Wang, Tan Lee, Cheung-Chi Leung, Bin Ma, Haizhou Li:
A graph-based Gaussian component clustering approach to unsupervised acoustic modeling. INTERSPEECH 2014: 875-879 - [c330]Anthony Larcher, Kong-Aik Lee, Pablo Luis Sordo Martinez, Trung Hieu Nguyen, Bin Ma, Haizhou Li:
Extended RSR2015 for text-dependent speaker verification over VHF channel. INTERSPEECH 2014: 1322-1326 - [c329]Hoang Gia Ngo, Nancy F. Chen, Sunil Sivadas, Bin Ma, Haizhou Li:
A minimal-resource transliteration framework for Vietnamese. INTERSPEECH 2014: 1410-1414 - [c328]Peng Yang, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li:
Intrinsic spectral analysis based on temporal context features for query-by-example spoken term detection. INTERSPEECH 2014: 1722-1726 - [c327]Haihua Xu, Hang Su, Chng Eng Siong, Haizhou Li:
Semi-supervised training for bottle-neck feature based DNN-HMM hybrid systems. INTERSPEECH 2014: 2078-2082 - [c326]Minghui Dong, Siu Wa Lee, Haizhou Li, Paul Y. Chan, Xuejian Peng, Jochen Walter Ehnes, Dong-Yan Huang:
I2r speech2singing perfects everyone's singing. INTERSPEECH 2014: 2148-2149 - [c325]Siu Wa Lee, Zhizheng Wu, Minghui Dong, Xiaohai Tian, Haizhou Li:
A comparative study of spectral transformation techniques for singing voice synthesis. INTERSPEECH 2014: 2499-2503 - [c324]Zhizheng Wu, Chng Eng Siong, Haizhou Li:
Joint nonnegative matrix factorization for exemplar-based voice conversion. INTERSPEECH 2014: 2509-2513 - [c323]Chenglin Xu, Lei Xie, Guangpu Huang, Xiong Xiao, Engsiong Chng, Haizhou Li:
A deep neural network approach for sentence boundary detection in broadcast news. INTERSPEECH 2014: 2887-2891 - [c322]Rong Tong, Bin Ma, Haizhou Li:
Virtual example for phonotactic language recognition. INTERSPEECH 2014: 3017-3021 - [c321]Vui Ann Shim, Bo Tian, Miaolong Yuan, Huajin Tang, Haizhou Li:
Direction-driven navigation using cognitive map for mobile robots. IROS 2014: 2639-2646 - [c320]Liping Chen, Kong-Aik Lee, Bin Ma, Wu Guo, Haizhou Li, Li-Rong Dai:
Local variability vector for text-independent speaker verification. ISCSLP 2014: 54-58 - [c319]Yuma Ueda, Longbiao Wang, Atsuhiko Kai, Xiong Xiao, Engsiong Chng, Haizhou Li:
Single-channel dereverberation for distant-talking speech recognition by combining denoising autoencoder and temporal structure normalization. ISCSLP 2014: 379-383 - [c318]Kelvin Poon-Feng, Dong-Yan Huang, Minghui Dong, Haizhou Li:
Acoustic emotion recognition based on fusion of multiple feature-dependent deep Boltzmann machines. ISCSLP 2014: 584-588 - [c317]Peng Yang, Haihua Xu, Xiong Xiao, Lei Xie, Cheung-Chi Leung, Hongjie Chen, Jia Yu, Hang Lv, Lei Wang, Su Jun Leow, Bin Ma, Chng Eng Siong, Haizhou Li:
The NNI Query-by-Example System for MediaEval 2014. MediaEval 2014 - [c316]Kong Aik Lee, Bin Ma, Haizhou Li, Liping Chen, Wu Guo, Li-Rong Dai:
Local Variability Modeling for Text-Independent Speaker Verification. Odyssey 2014: 54-59 - [c315]Changhuai You, Kong Aik Lee, Bin Ma, Haizhou Li:
Text-Dependent Speaker Verification System in VHF Communication Channel. Odyssey 2014: 216-223 - [c314]Nicole Mirnig, Yeow Kee Tan, Tai Wen Chang, Yuanwei Chua, Tran Anh Dung, Haizhou Li, Manfred Tscheligi:
Screen feedback in human-robot interaction: How to enhance robot expressiveness. RO-MAN 2014: 224-230 - [c313]Van Tung Pham, Nancy F. Chen, Sunil Sivadas, Haihua Xu, I-Fan Chen, Chongjia Ni, Engsiong Chng, Haizhou Li:
System and keyword dependent fusion for spoken term detection. SLT 2014: 430-435 - [c312]Andreea I. Niculescu, Rafael E. Banchs, Haizhou Li:
Why Industrial Robots Should Become More Social - On the Design of a Natural Language Interface for an Interactive Robot Welder. ICSR 2014: 276-278 - [c311]Ville Hautamäki, Antti Pöllänen, Tomi Kinnunen, Kong-Aik Lee, Haizhou Li, Pasi Fränti:
A Comparison of Categorical Attribute Data Clustering Methods. S+SSPR 2014: 53-62 - [e11]Haizhou Li, Helen M. Meng, Bin Ma, Engsiong Chng, Lei Xie:
15th Annual Conference of the International Speech Communication Association, INTERSPEECH 2014, Singapore, September 14-18, 2014. ISCA 2014 [contents] - [e10]Minghui Dong, Jianhua Tao, Haizhou Li, Thomas Fang Zheng, Yanfeng Lu:
The 9th International Symposium on Chinese Spoken Language Processing, Singapore, September 12-14, 2014. IEEE 2014, ISBN 978-1-4799-4220-6 [contents] - 2013
- [j66]Sakriani Sakti, Michael Paul, Andrew M. Finch, Shinsuke Sakai, Thang Tat Vu, Noriyuki Kimura, Chiori Hori, Eiichiro Sumita, Satoshi Nakamura, Jun Park, Chai Wutiwiwatchai, Bo Xu, Hammam Riza, Karunesh Arora, Chi Mai Luong, Haizhou Li:
A-STAR: Toward translating Asian spoken languages. Comput. Speech Lang. 27(2): 509-527 (2013) - [j65]Jiali Yu, Huajin Tang, Haizhou Li, Luping Shi:
Dynamical properties of continuous attractor neural network with background tuning. Neurocomputing 99: 439-447 (2013) - [j64]Andreea I. Niculescu, Betsy van Dijk, Anton Nijholt, Haizhou Li, See Swee Lan:
Making Social Robots More Attractive: The Effects of Voice Pitch, Humor and Empathy. Int. J. Soc. Robotics 5(2): 171-191 (2013) - [j63]Jiali Yu, Huajin Tang, Haizhou Li:
Continuous attractors of discrete-time recurrent neural networks. Neural Comput. Appl. 23(1): 89-96 (2013) - [j62]Jun Hu, Huajin Tang, Kay Chen Tan, Haizhou Li, Luping Shi:
A Spike-Timing-Based Integrated Model for Pattern Recognition. Neural Comput. 25(2): 450-472 (2013) - [j61]Douglas D. O'Shaughnessy, Li Deng, Haizhou Li:
Speech Information Processing: Theory and Applications [Scanning the Issue]. Proc. IEEE 101(5): 1034-1037 (2013) - [j60]Haizhou Li, Bin Ma, Kong-Aik Lee:
Spoken Language Recognition: From Fundamentals to Practice. Proc. IEEE 101(5): 1136-1159 (2013) - [j59]Haipeng Wang, Cheung-Chi Leung, Tan Lee, Bin Ma, Haizhou Li:
Shifted-Delta MLP Features for Spoken Language Recognition. IEEE Signal Process. Lett. 20(1): 15-18 (2013) - [j58]Ville Hautamäki, Tomi Kinnunen, Filip Sedlak, Kong-Aik Lee, Bin Ma, Haizhou Li:
Sparse Classifier Fusion for Speaker Verification. IEEE Trans. Speech Audio Process. 21(8): 1622-1631 (2013) - [j57]Raymond W. M. Ng, Tan Lee, Cheung-Chi Leung, Bin Ma, Haizhou Li:
Spoken Language Recognition With Prosodic Features. IEEE Trans. Speech Audio Process. 21(9): 1841-1853 (2013) - [j56]Stephen J. Wright, Dimitri Kanevsky, Li Deng, Xiaodong He, Georg Heigold, Haizhou Li:
Optimization Algorithms and Applications for Speech and Language Processing. IEEE Trans. Speech Audio Process. 21(11): 2231-2243 (2013) - [j55]Jiali Yu, Huajin Tang, Haizhou Li:
Dynamics Analysis of a Population Decoding Model. IEEE Trans. Neural Networks Learn. Syst. 24(3): 498-503 (2013) - [j54]Qiang Yu, Huajin Tang, Kay Chen Tan, Haizhou Li:
Rapid Feedforward Computation by Temporal Encoding and Learning With Spiking Neurons. IEEE Trans. Neural Networks Learn. Syst. 24(10): 1539-1552 (2013) - [c310]Xiaoming Lu, Lei Xie, Cheung-Chi Leung, Bin Ma, Haizhou Li:
Broadcast News Story Segmentation Using Manifold Learning on Latent Topic Distributions. ACL (2) 2013: 190-195 - [c309]Tze Yuang Chong, Rafael E. Banchs, Engsiong Chng, Haizhou Li:
Modeling of term-distance and term-occurrence information for improving n-gram language model performance. ACL (2) 2013: 233-237 - [c308]Duc Hoang Ha Nguyen, Aleem Mushtaq, Xiong Xiao, Engsiong Chng, Haizhou Li, Chin-Hui Lee:
A particle filter compensation approach to robust LVCSR. APSIPA 2013: 1-7 - [c307]Zhizheng Wu, Haizhou Li:
Voice conversion and spoofing attack on speaker verification systems. APSIPA 2013: 1-9 - [c306]Linhong Zhu, Sheng Gao, Sinno Jialin Pan, Haizhou Li, Dingxiong Deng, Cyrus Shahabi:
Graph-based informative-sentence selection for opinion summarization. ASONAM 2013: 408-412 - [c305]Zhizheng Wu, Engsiong Chng, Haizhou Li:
Conditional restricted Boltzmann machine for voice conversion. ChinaSIP 2013: 104-108 - [c304]Dau-Cheng Lyu, Engsiong Chng, Haizhou Li:
Language diarization for conversational code-switch speech with pronunciation dictionary adaptation. ChinaSIP 2013: 147-150 - [c303]Xiong Xiao, Engsiong Chng, Haizhou Li:
Constrained adaptation of histogram equalization for robust speech recognition. ChinaSIP 2013: 360-364 - [c302]Jennifer Williams, Rafael Enrique Banchs, Haizhou Li:
Meaning Unit Segmentation in English and Chinese: a New Approach to Discourse Phenomena. DiscoMT@ACL 2013: 1-9 - [c301]Jonathan William Dennis, Qiang Yu, Huajin Tang, Tran Huy Dat, Haizhou Li:
Temporal coding of local spectrogram features for robust sound recognition. ICASSP 2013: 803-807 - [c300]Zhizheng Wu, Xiong Xiao, Engsiong Chng, Haizhou Li:
Synthetic speech detection using temporal modulation feature. ICASSP 2013: 7234-7238 - [c299]Dau-Cheng Lyu, Engsiong Chng, Haizhou Li:
Language diarization for code-switch conversational speech. ICASSP 2013: 7314-7318 - [c298]Anthony Larcher, Kong-Aik Lee, Bin Ma, Haizhou Li:
Phonetically-constrained PLDA modeling for text-dependent speaker verification with multiple short utterances. ICASSP 2013: 7673-7677 - [c297]Chang Huai You, Haizhou Li, Bin Ma, Kong-Aik Lee:
A study on GMM-SVM with adaptive relevance factor and its comparison with i-vector and JFA for speaker recognition. ICASSP 2013: 7683-7687 - [c296]Xiong Xiao, Engsiong Chng, Haizhou Li:
Temporal filter design by minimum KL divergence criterion for robust speech recognition. ICASSP 2013: 7908-7912 - [c295]Nancy F. Chen, Bin Ma, Haizhou Li:
Minimal-resource phonetic language models to summarize untranscribed speech. ICASSP 2013: 8357-8361 - [c294]Heike Adel, Ngoc Thang Vu, Franziska Kraus, Tim Schlippe, Haizhou Li, Tanja Schultz:
Recurrent neural network language modeling for code switching conversational speech. ICASSP 2013: 8411-8415 - [c293]Xiaoming Lu, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li:
Broadcast news story segmentation using latent topics on data manifold. ICASSP 2013: 8465-8469 - [c292]Haipeng Wang, Tan Lee, Cheung-Chi Leung, Bin Ma, Haizhou Li:
Using parallel tokenizers with DTW matrix combination for low-resource spoken term detection. ICASSP 2013: 8545-8549 - [c291]Dong-Yan Huang, Minghui Dong, Haizhou Li:
A dynamic Gaussian process for voice conversion. ICME Workshops 2013: 1-4 - [c290]Vidhyasaharan Sethu, Julien Epps, Eliathamby Ambikairajah, Haizhou Li:
GMM based speaker variability compensated system for INTERSPEECH 2013 ComParE emotion challenge. INTERSPEECH 2013: 205-209 - [c289]Van Hai Do, Xiong Xiao, Engsiong Chng, Haizhou Li:
Context-dependent phone mapping for LVCSR of under-resourced languages. INTERSPEECH 2013: 500-504 - [c288]Xiong Xiao, Engsiong Chng, Haizhou Li:
Attribute-based histogram equalization (HEQ) and its adaptation for robust speech recognition. INTERSPEECH 2013: 876-880 - [c287]Zhizheng Wu, Anthony Larcher, Kong-Aik Lee, Engsiong Chng, Tomi Kinnunen, Haizhou Li:
Vulnerability evaluation of speaker verification under voice conversion spoofing: the effect of text constraints. INTERSPEECH 2013: 950-954 - [c286]Rahim Saeidi, Kong-Aik Lee, Tomi Kinnunen, Tawfik Hasan, Benoit G. B. Fauve, Pierre-Michel Bousquet, Elie Khoury, Pablo Luis Sordo Martinez, Jia Min Karen Kua, Changhuai You, Hanwu Sun, Anthony Larcher, Padmanabhan Rajan, Ville Hautamäki, Cemal Hanilçi, Billy Braithwaite, Rosa González Hautamäki, Seyed Omid Sadjadi, Gang Liu, Hynek Boril, Navid Shokouhi, Driss Matrouf, Laurent El Shafey, Pejman Mowlaee, Julien Epps, Tharmarajah Thiruvaran, David A. van Leeuwen, Bin Ma, Haizhou Li, John H. L. Hansen, Jean-François Bonastre, Sébastien Marcel, John S. D. Mason, Eliathamby Ambikairajah:
I4u submission to NIST SRE 2012: a large-scale collaborative effort for noise-robust speaker verification. INTERSPEECH 2013: 1986-1990 - [c285]Haipeng Wang, Tan Lee, Cheung-Chi Leung, Bin Ma, Haizhou Li:
Unsupervised mining of acoustic subword units with segment-level Gaussian posteriorgrams. INTERSPEECH 2013: 2297-2301 - [c284]Nancy F. Chen, Vivaek Shivakumar, Mahesh Harikumar, Bin Ma, Haizhou Li:
Large-scale characterization of Mandarin pronunciation errors made by native speakers of European languages. INTERSPEECH 2013: 2370-2374 - [c283]Anthony Larcher, Jean-François Bonastre, Benoit G. B. Fauve, Kong-Aik Lee, Christophe Lévy, Haizhou Li, John S. D. Mason, Jean-Yves Parfait:
ALIZE 3.0 - open source toolkit for state-of-the-art speaker recognition. INTERSPEECH 2013: 2768-2772 - [c282]Zhizheng Wu, Tuomas Virtanen, Tomi Kinnunen, Engsiong Chng, Haizhou Li:
Exemplar-based unit selection for voice conversion utilizing temporal information. INTERSPEECH 2013: 3057-3061 - [c281]Kong-Aik Lee, Anthony Larcher, Chang Huai You, Bin Ma, Haizhou Li:
Multi-session PLDA scoring of i-vector for partially open-set speaker detection. INTERSPEECH 2013: 3651-3655 - [c280]Bo Tian, Vui Ann Shim, Miaolong Yuan, Chithra Srinivasan, Huajin Tang, Haizhou Li:
RGB-D based cognitive map building and navigation. IROS 2013: 1562-1567 - [c279]Tze Yuang Chong, Xiong Xiao, Haihua Xu, Tien Ping Tan, Chau Khoa Pham, Dau-Cheng Lyu, Chng Eng Siong, Haizhou Li:
The development and analysis of a Malay broadcast news corpus. O-COCOSDA/CASLRE 2013: 1-5 - [c278]Nicole Mirnig, Yeow Kee Tan, Boon Siew Han, Haizhou Li, Manfred Tscheligi:
Screen feedback: How to overcome the expressive limitations of a social robot. RO-MAN 2013: 348-349 - [c277]Yanan Li, Keng Peng Tee, Shuzhi Sam Ge, Haizhou Li:
Building Companionship through Human-Robot Collaboration. ICSR 2013: 1-7 - [c276]Zhizheng Wu, Tuomas Virtanen, Tomi Kinnunen, Eng Siong Chng, Haizhou Li:
Exemplar-based voice conversion using non-negative spectrogram deconvolution. SSW 2013: 201-206 - 2012
- [j53]Rui Yan, Keng Peng Tee, Yuanwei Chua, Haizhou Li, Huajin Tang:
Gesture Recognition Based on Localist Attractor Networks with Application to Robot Control [Application Notes]. IEEE Comput. Intell. Mag. 7(1): 64-74 (2012) - [j52]Haizhou Li:
Foreword. IEICE Trans. Inf. Syst. 95-D(5): 1181 (2012) - [j51]Xiaoxuan Wang, Lei Xie, Mimi Lu, Bin Ma, Engsiong Chng, Haizhou Li:
Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features. IEICE Trans. Inf. Syst. 95-D(5): 1206-1215 (2012) - [j50]Yi Ren Leng, Tran Huy Dat, Norihide Kitaoka, Haizhou Li:
Selective Gammatone Envelope Feature for Robust Sound Event Recognition. IEICE Trans. Inf. Syst. 95-D(5): 1229-1237 (2012) - [j49]Keng Peng Tee, Rui Yan, Yuanwei Chua, Zhiyong Huang, Haizhou Li:
Modular IK: a Robust Inverse Kinematic Algorithm for Gesture Imitation in an Upper-Body Humanoid Robot. Int. J. Humanoid Robotics 9(2) (2012) - [j48]Jin-Shea Kuo, Haizhou Li:
Learning regional transliteration variants. Inf. Process. Manag. 48(1): 154-169 (2012) - [j47]Omid Dehzangi, Bin Ma, Engsiong Chng, Haizhou Li:
Discriminative feature extraction for speech recognition using continuous output codes. Pattern Recognit. Lett. 33(13): 1703-1709 (2012) - [j46]Zhizheng Wu, Tomi Kinnunen, Engsiong Chng, Haizhou Li:
Mixture of Factor Analyzers Using Priors From Non-Parallel Speech for Voice Conversion. IEEE Signal Process. Lett. 19(12): 914-917 (2012) - [j45]Tin Lay Nwe, Hanwu Sun, Bin Ma, Haizhou Li:
Speaker Clustering and Cluster Purification Methods for RT07 and RT09 Evaluation Meeting Data. IEEE Trans. Speech Audio Process. 20(2): 461-473 (2012) - [j44]Wenliang Chen, Jun'ichi Kazama, Min Zhang, Yoshimasa Tsuruoka, Yujie Zhang, Yiou Wang, Kentaro Torisawa, Haizhou Li:
Bitext Dependency Parsing With Auto-Generated Bilingual Treebank. IEEE Trans. Speech Audio Process. 20(5): 1461-1472 (2012) - [j43]Tomi Kinnunen, Rahim Saeidi, Filip Sedlak, Kong-Aik Lee, Johan Sandberg, Maria Hansson-Sandsten, Haizhou Li:
Low-Variance Multitaper MFCC Features: A Case Study in Robust Speaker Verification. IEEE Trans. Speech Audio Process. 20(7): 1990-2001 (2012) - [j42]Liyuan Li, Shuicheng Yan, Xinguo Yu, Yeow Kee Tan, Haizhou Li:
Robust Multiperson Detection and Tracking for Mobile Service and Social Robots. IEEE Trans. Syst. Man Cybern. Part B 42(5): 1398-1412 (2012) - [c275]Rafael E. Banchs, Haizhou Li:
IRIS: a Chat-oriented Dialogue System based on the Vector Space Model. ACL (System Demonstrations) 2012: 37-42 - [c274]Wenliang Chen, Min Zhang, Haizhou Li:
Utilizing Dependency Language Models for Graph-based Dependency Parsing Models. ACL (1) 2012: 213-222 - [c273]Deyi Xiong, Min Zhang, Haizhou Li:
Modeling the Translation of Predicate-Argument Structure for SMT. ACL (1) 2012: 902-911 - [c272]Min Zhang, Haizhou Li, A. Kumaran, Ming Liu:
Whitepaper of NEWS 2012 Shared Task on Machine Transliteration. NEWS@ACL 2012: 1-9 - [c271]Min Zhang, Haizhou Li, A. Kumaran, Ming Liu:
Report of NEWS 2012 Machine Transliteration Shared Task. NEWS@ACL 2012: 10-20 - [c270]Eliathamby Ambikairajah, Jia Min Karen Kua, Vidhyasaharan Sethu, Haizhou Li:
PNCC-ivector-SRC based speaker verification. APSIPA 2012: 1-7 - [c269]Zhizheng Wu, Tomi Kinnunen, Engsiong Chng, Haizhou Li, Eliathamby Ambikairajah:
A study on spoofing attack in state-of-the-art speaker verification: the telephone speech case. APSIPA 2012: 1-5 - [c268]Liyuan Li, Xinguo Yu, Jun Li, Gang S. Wang, Ji Yu Shi, Yeow Kee Tan, Haizhou Li:
Vision-based attention estimation and selection for social robot to perform natural interaction in the open world. HRI 2012: 183-184 - [c267]Van Hai Do, Xiong Xiao, Engsiong Chng, Haizhou Li:
A Phone Mapping Technique for Acoustic Modeling of Under-Resourced Languages. IALP 2012: 233-236 - [c266]Siu Wa Lee, Shen Ting Ang, Minghui Dong, Haizhou Li:
Generalized F0 modelling with absolute and relative pitch features for singing voice synthesis. ICASSP 2012: 429-432 - [c265]Xiong Xiao, Jinyu Li, Engsiong Chng, Haizhou Li:
Lasso environment model combination for robust speech recognition. ICASSP 2012: 4305-4308 - [c264]Xiong Xiao, Engsiong Chng, Haizhou Li:
Joint spectral and temporal normalization of features for robust recognition of noisy and reverberated speech. ICASSP 2012: 4325-4328 - [c263]Tomi Kinnunen, Zhizheng Wu, Kong-Aik Lee, Filip Sedlak, Engsiong Chng, Haizhou Li:
Vulnerability of speaker verification systems against voice conversion spoofing attacks: The case of telephone speech. ICASSP 2012: 4401-4404 - [c262]Anthony Larcher, Pierre-Michel Bousquet, Kong-Aik Lee, Driss Matrouf, Haizhou Li, Jean-François Bonastre:
I-vectors in the context of phonetically-constrained short utterances for speaker verification. ICASSP 2012: 4773-4776 - [c261]Ngoc Thang Vu, Dau-Cheng Lyu, Jochen Weiner, Dominic Telaar, Tim Schlippe, Fabian Blaicher, Engsiong Chng, Tanja Schultz, Haizhou Li:
A first speech recognition system for Mandarin-English code-switch conversational speech. ICASSP 2012: 4889-4892 - [c260]Teruhisa Misu, Etsuo Mizukami, Hideki Kashioka, Satoshi Nakamura, Haizhou Li:
A bootstrapping approach for SLU portability to a new language by inducting unannotated user queries. ICASSP 2012: 4961-4964 - [c259]Lilei Zheng, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li:
Acoustic TextTiling for story segmentation of spoken documents. ICASSP 2012: 5121-5124 - [c258]Haipeng Wang, Cheung-Chi Leung, Tan Lee, Bin Ma, Haizhou Li:
An acoustic segment modeling approach to query-by-example spoken term detection. ICASSP 2012: 5157-5160 - [c257]Anthony Larcher, Kong-Aik Lee, Bin Ma, Haizhou Li:
RSR2015: Database for Text-Dependent Speaker Verification using Multiple Pass-Phrases. INTERSPEECH 2012: 1580-1583 - [c256]Ye Jiang, Kong-Aik Lee, Zhenmin Tang, Bin Ma, Anthony Larcher, Haizhou Li:
PLDA Modeling in I-Vector and Supervector Space for Speaker Verification. INTERSPEECH 2012: 1680-1683 - [c255]Zhizheng Wu, Chng Eng Siong, Haizhou Li:
Detecting Converted Speech and Natural Speech for anti-Spoofing Attack in Speaker Recognition. INTERSPEECH 2012: 1700-1703 - [c254]Changhuai You, Haizhou Li, Bin Ma, Kong-Aik Lee:
Effect of Relevance Factor of Maximum a posteriori Adaptation for GMM-SVM in Speaker and Language Recognition. INTERSPEECH 2012: 2065-2068 - [c253]Keng Peng Tee, Shuzhi Sam Ge, Rui Yan, Haizhou Li:
Adaptive control for robot manipulators under ellipsoidal task space constraints. IROS 2012: 1167-1172 - [c252]Van Hai Do, Xiong Xiao, Engsiong Chng, Haizhou Li:
Context dependant phone mapping for cross-lingual acoustic modeling. ISCSLP 2012: 16-20 - [c251]Cheung-Chi Leung, Bin Ma, Haizhou Li:
Phonotactic spoken language recognition: Using diversely adapted acoustic models in parallel phone recognizers. ISCSLP 2012: 108-111 - [c250]Duc Hoang Ha Nguyen, Xiong Xiao, Chng Eng Siong, Haizhou Li:
An analysis of vector Taylor series model compensation for non-stationary noise in speech recognition. ISCSLP 2012: 131-135 - [c249]Siu Wa Lee, Minghui Dong, Haizhou Li:
A study of F0 modelling and generation with lyrics and shape characterization for singing voice synthesis. ISCSLP 2012: 150-154 - [c248]Teruhisa Misu, Shigeki Matsuda, Etsuo Mizukami, Hideki Kashioka, Haizhou Li:
Efficient Language Model Construction for Spoken Dialog Systems by Inducting Language Resources of Different Languages. IWSDS 2012: 101-110 - [c247]Ridong Jiang, Yeow Kee Tan, Dilip Kumar Limbu, Tran Anh Dung, Haizhou Li:
Component Pluggable Dialogue Framework and Its Application to Social Robots. IWSDS 2012: 225-237 - [c246]Ville Hautamäki, Kong-Aik Lee, Anthony Larcher, Tomi Kinnunen, Bin Ma, Haizhou Li:
Variational Bayes logistic regression as regularized fusion for NIST SRE 2010. Odyssey 2012: 268-274 - [c245]Chang Huai You, Haizhou Li, Eliathamby Ambikairajah, Kong-Aik Lee, Bin Ma:
Bhattacharyya-based GMM-SVM system with adaptive relevance factor for pair language recognition. Odyssey 2012: 338-345 - [c244]Jochen Weiner, Ngoc Thang Vu, Dominic Telaar, Florian Metze, Tanja Schultz, Dau-Cheng Lyu, Engsiong Chng, Haizhou Li:
Integration of language identification into a recognition system for spoken conversations containing code-Switches. SLTU 2012: 76-79 - [e9]Min Zhang, Haizhou Li, A. Kumaran:
Proceedings of the 4th Named Entity Workshop, NEWS@ACL 2012, Jeju, Korea, July 12, 2012. Association for Computational Linguistics 2012, ISBN 978-1-937284-40-4 - [e8]Haizhou Li, Bin Ma, Kong-Aik Lee:
Odyssey 2012: The Speaker and Language Recognition Workshop, Singapore, June 25-28, 2012. ISCA 2012 - [e7]Qunsheng Peng, Haizhou Li:
SIGGRAPH Asia 2012 Poster Proceedings, Singapore, Singapore, November 28 - December 01, 2012. ACM 2012, ISBN 978-1-4503-1911-9
- 2011
- [j41]Huajin Tang, Haizhou Li:
Information Theoretic Learning: Renyi's Entropy and Kernel Perspectives (Principe, J.; 2010) [Book Review]. IEEE Comput. Intell. Mag. 6(3): 60-62 (2011) - [j40]Omid Dehzangi, Bin Ma, Engsiong Chng, Haizhou Li:
Error Corrective Fusion of Classifier Scores for Spoken Language Recognition. IEICE Trans. Inf. Syst. 94-D(12): 2503-2512 (2011) - [j39]Haizhou Li, John-John Cabibihan, Yeow Kee Tan:
Towards an Effective Design of Social Robots. Int. J. Soc. Robotics 3(4): 333-335 (2011) - [j38]Jonathan William Dennis, Tran Huy Dat, Haizhou Li:
Spectrogram Image Feature for Sound Event Classification in Mismatched Conditions. IEEE Signal Process. Lett. 18(2): 130-133 (2011) - [j37]Donglai Zhu, Bin Ma, Haizhou Li:
Speaker Verification With Feature-Space MAPLR Parameters. IEEE Trans. Speech Audio Process. 19(3): 505-515 (2011) - [j36]Kong Aik Lee, Chang Huai You, Haizhou Li, Tomi Kinnunen, Khe Chai Sim:
Using Discrete Probabilities With Bhattacharyya Measure for SVM-Based Speaker Verification. IEEE ACM Trans. Audio Speech Lang. Process. 19(4): 861-870 (2011) - [j35]H. D. Tran, Haizhou Li:
Sound Event Recognition With Probabilistic Distance SVMs. IEEE Trans. Speech Audio Process. 19(6): 1556-1568 (2011) - [j34]Deyi Xiong, Min Zhang, Haizhou Li:
A Maximum-Entropy Segmentation Model for Statistical Machine Translation. IEEE ACM Trans. Audio Speech Lang. Process. 19(8): 2494-2505 (2011) - [j33]Namunu Chinthaka Maddage, Haizhou Li:
Beat space segmentation and octave scale cepstral feature for sung language recognition in pop music. ACM Trans. Multim. Comput. Commun. Appl. 7(4): 37:1-37:19 (2011) - [c243]Rafael E. Banchs, Haizhou Li:
AM-FM: A Semantic Framework for Translation Quality Assessment. ACL (2) 2011: 153-158 - [c242]Deyi Xiong, Min Zhang, Haizhou Li:
Enhancing Language Models in Statistical Machine Translation with Backward N-grams and Mutual Information Triggers. ACL 2011: 1288-1297 - [c241]Min Zhang, Haizhou Li, A. Kumaran, Ming Liu:
Report of NEWS 2011 Machine Transliteration Shared Task. NEWS@IJCNLP 2011: 1-13 - [c240]Min Zhang, A. Kumaran, Haizhou Li:
Whitepaper of NEWS 2011 Shared Task on Machine Transliteration. NEWS@IJCNLP 2011: 14-22 - [c239]Sheng Gao, Haizhou Li:
A cross-domain adaptation method for sentiment classification using probabilistic latent analysis. CIKM 2011: 1047-1052 - [c238]Wenliang Chen, Jun'ichi Kazama, Min Zhang, Yoshimasa Tsuruoka, Yujie Zhang, Yiou Wang, Kentaro Torisawa, Haizhou Li:
SMT Helps Bitext Dependency Parsing. EMNLP 2011: 73-83 - [c237]Zhenghua Li, Min Zhang, Wanxiang Che, Ting Liu, Wenliang Chen, Haizhou Li:
Joint Models for Chinese POS Tagging and Dependency Parsing. EMNLP 2011: 1180-1191 - [c236]Tran Huy Dat, Haizhou Li:
Probabilistic distance SVM with Hellinger-Exponential Kernel for sound event classification. ICASSP 2011: 2272-2275 - [c235]Tran Huy Dat, Haizhou Li:
Jump Function Kolmogorov for overlapping audio event classification. ICASSP 2011: 3696-3699 - [c234]Raymond W. M. Ng, Cheung-Chi Leung, Tan Lee, Bin Ma, Haizhou Li:
Score fusion and calibration in multiple language detectors with large performance variation. ICASSP 2011: 4404-4407 - [c233]Filip Sedlak, Tomi Kinnunen, Ville Hautamäki, Kong-Aik Lee, Haizhou Li:
Classifier subset selection and fusion for speaker verification. ICASSP 2011: 4544-4547 - [c232]Eryu Wang, Kong-Aik Lee, Bin Ma, Haizhou Li, Wu Guo, Li-Rong Dai:
Factored covariance modeling for text-independent speaker verification. ICASSP 2011: 4856-4859 - [c231]Xiong Xiao, Jinyu Li, Engsiong Chng, Haizhou Li:
Maximum likelihood adaptation of histogram equalization with constraint for robust speech recognition. ICASSP 2011: 5480-5483 - [c230]Guoyu Tang, Yunqing Xia, Min Zhang, Haizhou Li, Fang Zheng:
CLGVSM: Adapting Generalized Vector Space Model to Cross-lingual Document Clustering. IJCNLP 2011: 580-588 - [c229]Min Zhang, Xiangyu Duan, Ming Liu, Yunqing Xia, Haizhou Li:
Joint Alignment and Artificial Data Generation: An Empirical Study of Pivot-based Machine Transliteration. IJCNLP 2011: 1207-1215 - [c228]Yi Ren Leng, Tran Huy Dat, Norihide Kitaoka, Haizhou Li:
Alternative Frequency Scale Cepstral Coefficient for Robust Sound Event Recognition. INTERSPEECH 2011: 297-300 - [c227]Xiong Xiao, Jinyu Li, Chng Eng Siong, Haizhou Li:
Feature Normalization Using Structured Full Transforms for Robust Speech Recognition. INTERSPEECH 2011: 693-696 - [c226]Chien-Lin Huang, Bin Ma, Haizhou Li, Chung-Hsien Wu:
Speech Indexing Using Semantic Context Inference. INTERSPEECH 2011: 717-720 - [c225]Rong Tong, Bin Ma, Haizhou Li, Chng Eng Siong:
Target-Aware Lattice Rescoring for Dialect Recognition. INTERSPEECH 2011: 733-736 - [c224]Mimi Lu, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li:
Probabilistic Latent Semantic Analysis for Broadcast News Story Segmentation. INTERSPEECH 2011: 1109-1112 - [c223]Sethserey Sam, Xiong Xiao, Laurent Besacier, Eric Castelli, Haizhou Li, Chng Eng Siong:
Speech Modulation Features for Robust Nonnative Speech Accent Detection. INTERSPEECH 2011: 2417-2420 - [c222]Jonathan William Dennis, Tran Huy Dat, Haizhou Li:
Image Representation of the Subband Power Distribution for Robust Sound Classification. INTERSPEECH 2011: 2437-2440 - [c221]Ville Hautamäki, Kong-Aik Lee, Tomi Kinnunen, Bin Ma, Haizhou Li:
Regularized Logistic Regression Fusion for Speaker Verification. INTERSPEECH 2011: 2745-2748 - [c220]Chang Huai You, Haizhou Li, Kong-Aik Lee:
Study on the Relevance Factor of Maximum a Posteriori with GMM for Language Recognition. INTERSPEECH 2011: 2893-2896 - [c219]Kong-Aik Lee, Chang Huai You, Ville Hautamäki, Anthony Larcher, Haizhou Li:
Spoken Language Recognition in the Latent Topic Simplex. INTERSPEECH 2011: 2933-2936 - [c218]Kong-Aik Lee, Anthony Larcher, Helen Thai, Bin Ma, Haizhou Li:
Joint Application of Speech and Speaker Recognition for Automation and Security in Smart Home. INTERSPEECH 2011: 3317-3318 - [c217]Sheng Gao, Haizhou Li:
Effective Large Scale Text Retrieval via Learning Risk-Minimization and Dependency-Embedded Model. MMM (2) 2011: 99-110 - [e6]Min Zhang, Haizhou Li, A. Kumaran:
Proceedings of the 3rd Named Entities Workshop, NEWS@IJCNLP 2011, Chiang Mai, Thailand, November 12, 2011. Asian Federation of Natural Language Processing 2011
- 2010
- [j32]Deyi Xiong, Min Zhang, AiTi Aw, Haizhou Li:
Linguistically Annotated Reordering: Evaluation and Analysis. Comput. Linguistics 36(3): 535-568 (2010) - [j31]Ling Cen, Minghui Dong, Paul Y. Chan, Haizhou Li:
Feature Integration and Dimension Reduction in Unit Selection TTS. Int. J. Asian Lang. Process. 20(1): 35-42 (2010) - [j30]Huajin Tang, Haizhou Li, Rui Yan:
Memory Dynamics in Attractor Networks with Saliency Weights. Neural Comput. 22(7): 1899-1926 (2010) - [j29]Lei Wang, Engsiong Chng, Haizhou Li:
A tree-construction search approach for multivariate time series motifs discovery. Pattern Recognit. Lett. 31(9): 869-875 (2010) - [j28]Tomi Kinnunen, Haizhou Li:
An overview of text-independent speaker recognition: From features to supervectors. Speech Commun. 52(1): 12-40 (2010) - [j27]Haizhou Li, Bin Ma:
TechWare: Speaker and Spoken Language Recognition Resources [Best of the Web]. IEEE Signal Process. Mag. 27(6): 139-142 (2010) - [j26]Xiong Xiao, Jinyu Li, Engsiong Chng, Haizhou Li, Chin-Hui Lee:
A Study on the Generalization Capability of Acoustic Models for Robust Speech Recognition. IEEE Trans. Speech Audio Process. 18(6): 1158-1169 (2010) - [j25]Chang Huai You, Kong-Aik Lee, Haizhou Li:
GMM-SVM Kernel With a Bhattacharyya-Based Distance for Speaker Recognition. IEEE Trans. Speech Audio Process. 18(6): 1300-1312 (2010) - [j24]Huajin Tang, Haizhou Li, Zhang Yi:
A discrete-time neural network for optimization problems with hybrid constraints. IEEE Trans. Neural Networks 21(7): 1184-1189 (2010) - [j23]Tee Kiah Chia, Khe Chai Sim, Haizhou Li, Hwee Tou Ng:
Statistical lattice-based spoken document retrieval. ACM Trans. Inf. Syst. 28(1): 2:1-2:30 (2010) - [j22]Namunu Chinthaka Maddage, Khe Chai Sim, Haizhou Li:
Word level automatic alignment of music and lyrics using vocal synthesis. ACM Trans. Multim. Comput. Commun. Appl. 6(3): 19:1-19:16 (2010) - [c216]Xiangyu Duan, Min Zhang, Haizhou Li:
Pseudo-Word for Phrase-Based Machine Translation. ACL 2010: 148-156 - [c215]Deyi Xiong, Min Zhang, Haizhou Li:
Error Detection for Statistical Machine Translation Using Linguistic Features. ACL 2010: 604-611 - [c214]Min Zhang, Hui Zhang, Haizhou Li:
Convolution Kernel over Packed Parse Forest. ACL 2010: 875-885 - [c213]Haizhou Li, A. Kumaran, Min Zhang, Vladimir Pervouchine:
Report of NEWS 2010 Transliteration Generation Shared Task. NEWS@ACL 2010: 1-11 - [c212]Haizhou Li, A. Kumaran, Min Zhang, Vladimir Pervouchine:
Whitepaper of NEWS 2010 Shared Task on Transliteration Generation. NEWS@ACL 2010: 12-20 - [c211]A. Kumaran, Mitesh M. Khapra, Haizhou Li:
Report of NEWS 2010 Transliteration Mining Shared Task. NEWS@ACL 2010: 21-28 - [c210]A. Kumaran, Mitesh M. Khapra, Haizhou Li:
Whitepaper of NEWS 2010 Shared Task on Transliteration Mining. NEWS@ACL 2010: 29-38 - [c209]Minghui Dong, Paul Y. Chan, Ling Cen, Bin Ma, Haizhou Li:
I2R Text-to-Speech System for Blizzard Challenge 2010. Blizzard Challenge 2010 - [c208]Lianhau Lee, AiTi Aw, Min Zhang, Haizhou Li:
EM-based Hybrid Model for Bilingual Terminology Extraction from Comparable Corpora. COLING (Posters) 2010: 639-646 - [c207]Vladimir Pervouchine, Min Zhang, Ming Liu, Haizhou Li:
Improving Name Origin Recognition with Context Features and Unlabelled Data. COLING (Posters) 2010: 972-978 - [c206]Min Zhang, Xiangyu Duan, Vladimir Pervouchine, Haizhou Li:
Machine Transliteration: Leveraging on Third Languages. COLING (Posters) 2010: 1444-1452 - [c205]Andreea I. Niculescu, Betsy van Dijk, Anton Nijholt, See Swee Lan, Haizhou Li:
How humans behave and evaluate a social robot in real-environment settings. ECCE 2010: 351-352 - [c204]Hui Zhang, Min Zhang, Haizhou Li, Engsiong Chng:
Non-Isomorphic Forest Pair Translation. EMNLP 2010: 440-450 - [c203]Kong-Aik Lee, Haizhou Li, Chang Huai You, Tomi Kinnunen, Khe Chai Sim:
Discrete expected likelihood kernel for SVM-based speaker verification. EUSIPCO 2010: 591-595 - [c202]Chang Huai You, Haizhou Li, Kong-Aik Lee:
A GMM-supervector approach to language recognition with adaptive relevance factor. EUSIPCO 2010: 1993-1997 - [c201]Tran Huy Dat, Yi Ren Leng, Haizhou Li:
Feature integration for heart sound biometrics. ICASSP 2010: 1714-1717 - [c200]Omid Dehzangi, Bin Ma, Engsiong Chng, Haizhou Li:
Error corrective classifier fusion for spoken Language Recognition. ICASSP 2010: 1994-1997 - [c199]Yu Tsao, Hanwu Sun, Haizhou Li, Chin-Hui Lee:
An acoustic segment model approach to incorporating temporal information into speaker modeling for text-independent speaker recognition. ICASSP 2010: 4422-4425 - [c198]Hanwu Sun, Bin Ma, Swe Zin Kalayar Khine, Haizhou Li:
Speaker diarization system for RT07 and RT09 meeting room audio. ICASSP 2010: 4982-4985 - [c197]Donglai Zhu, Bin Ma, Haizhou Li:
Soft margin estimation of Gaussian mixture model parameters for spoken language recognition. ICASSP 2010: 4990-4993 - [c196]C. Santhosh Kumar, Haizhou Li, Rong Tong, Pavel Matejka, Lukás Burget, Jan Cernocký:
Tuning phone decoders for language identification. ICASSP 2010: 5010-5013 - [c195]Raymond W. M. Ng, Cheung-Chi Leung, Tan Lee, Bin Ma, Haizhou Li:
Prosodic attribute model for spoken language identification. ICASSP 2010: 5022-5025 - [c194]Shuanhu Bai, Chien-Lin Huang, Bin Ma, Haizhou Li:
Semi-supervised learning of language model using unsupervised topic model. ICASSP 2010: 5386-5389 - [c193]Tin Lay Nwe, Minghui Dong, Paul Y. Chan, Xi Wang, Bin Ma, Haizhou Li:
Voice conversion: From spoken vowels to singing vowels. ICME 2010: 1421-1426 - [c192]Omid Dehzangi, Bin Ma, Engsiong Chng, Haizhou Li:
Framewise Phone Classification Using Weighted Fuzzy Classification Rules. ICPR 2010: 4186-4189 - [c191]Keng Peng Tee, Rui Yan, Haizhou Li:
Adaptive admittance control of a robot manipulator under task space constraint. ICRA 2010: 5181-5186 - [c190]Hanwu Sun, Bin Ma, Chien-Lin Huang, Trung Hieu Nguyen, Haizhou Li:
The IIR NIST SRE 2008 and 2010 summed channel speaker recognition systems. INTERSPEECH 2010: 366-369 - [c189]Chien-Lin Huang, Hanwu Sun, Bin Ma, Haizhou Li:
Speaker characterization using long-term and temporal information. INTERSPEECH 2010: 370-373 - [c188]Rong Tong, Bin Ma, Haizhou Li, Engsiong Chng:
Selecting phonotactic features for language recognition. INTERSPEECH 2010: 737-740 - [c187]Eryu Wang, Kong-Aik Lee, Bin Ma, Haizhou Li, Wu Guo, Li-Rong Dai:
The estimation and kernel metric of spectral correlation for text-independent speaker verification. INTERSPEECH 2010: 1065-1068 - [c186]Xiaoxuan Wang, Lei Xie, Bin Ma, Engsiong Chng, Haizhou Li:
Phoneme lattice based texttiling towards multilingual story segmentation. INTERSPEECH 2010: 1305-1308 - [c185]Donglai Zhu, Bin Ma, Kong-Aik Lee, Cheung-Chi Leung, Haizhou Li:
MAP estimation of subspace transform for speaker recognition. INTERSPEECH 2010: 1465-1468 - [c184]Ville Hautamäki, Tomi Kinnunen, Mohaddeseh Nosratighods, Kong-Aik Lee, Bin Ma, Haizhou Li:
Approaching human listener accuracy with modern speaker verification. INTERSPEECH 2010: 1473-1476 - [c183]Tin Lay Nwe, Hanwu Sun, Bin Ma, Haizhou Li:
Speaker diarization in meeting audio for single distant microphone. INTERSPEECH 2010: 1505-1508 - [c182]Zhizheng Wu, Tomi Kinnunen, Engsiong Chng, Haizhou Li:
Text-independent F0 transformation with non-parallel data for voice conversion. INTERSPEECH 2010: 1732-1735 - [c181]Raymond W. M. Ng, Cheung-Chi Leung, Ville Hautamäki, Tan Lee, Bin Ma, Haizhou Li:
Towards long-range prosodic attribute modeling for language recognition. INTERSPEECH 2010: 1792-1795 - [c180]Dau-Cheng Lyu, Tien Ping Tan, Engsiong Chng, Haizhou Li:
SEAME: a Mandarin-English code-switching speech corpus in south-east asia. INTERSPEECH 2010: 1986-1989 - [c179]Omid Dehzangi, Bin Ma, Engsiong Chng, Haizhou Li:
A discriminative performance metric for GMM-UBM speaker identification. INTERSPEECH 2010: 2114-2117 - [c178]Yi Ren Leng, Tran Huy Dat, Norihide Kitaoka, Haizhou Li:
Selective gammatone filterbank feature for robust sound event recognition. INTERSPEECH 2010: 2246-2249 - [c177]Cheung-Chi Leung, Donglai Zhu, Kong-Aik Lee, Bin Ma, Haizhou Li:
Incorporating MAP estimation and covariance transform for SVM based speaker recognition. INTERSPEECH 2010: 2318-2321 - [c176]Chang Huai You, Haizhou Li, Kong-Aik Lee:
A hybrid modeling strategy for GMM-SVM speaker recognition with adaptive relevance factor. INTERSPEECH 2010: 2746-2749 - [c175]Minghui Dong, Paul Y. Chan, Ling Cen, Haizhou Li, Jason Teo, Ping Jen Kua:
Phonetic segmentation of singing voice using MIDI and parallel speech. INTERSPEECH 2010: 2890-2893 - [c174]Minghui Dong, Paul Y. Chan, Ling Cen, Haizhou Li:
Aligning singing voice with MIDI melody using synthesized audio signal. ISCSLP 2010: 95-98 - [c173]Chien-Lin Huang, Haizhou Li:
UBM data selection for effective speaker modeling. ISCSLP 2010: 162-165 - [c172]Eryu Wang, Wu Guo, Li-Rong Dai, Kong-Aik Lee, Bin Ma, Haizhou Li:
Factor analysis based spatial correlation modeling for speaker verification. ISCSLP 2010: 166-170 - [c171]Shuanhu Bai, Cheung-Chi Leung, Chien-Lin Huang, Bin Ma, Haizhou Li:
Building topic mixture language models using the document soft classification notion of topic models. ISCSLP 2010: 229-232 - [c170]Paul Yaozhu Chan, Minghui Dong, Ling Cen, Haizhou Li:
The psychoacoustic approach towards enhancing speech intelligibility in noise. ISCSLP 2010: 238-241 - [c169]Hanwu Sun, Bin Ma, Haizhou Li:
Frame selection of interview channel for NIST speaker recognition evaluation. ISCSLP 2010: 305-308 - [c168]Ling Cen, Paul Y. Chan, Minghui Dong, Haizhou Li:
Generating emotional speech from neutral speech. ISCSLP 2010: 383-386 - [c167]Xiangyu Duan, Rafael E. Banchs, Jun Lang, Deyi Xiong, AiTi Aw, Min Zhang, Haizhou Li:
I2r's machine translation system for IWSLT 2010. IWSLT 2010: 67-72 - [c166]Deyi Xiong, Min Zhang, Haizhou Li:
Learning Translation Boundaries for Phrase-Based Decoding. HLT-NAACL 2010: 136-144 - [c165]Raymond W. M. Ng, Cheung-Chi Leung, Tan Lee, Bin Ma, Haizhou Li:
Detection target dependent score calibration for language recognition. Odyssey 2010: 18 - [c164]Cheung-Chi Leung, Bin Ma, Haizhou Li:
Parallel Acoustic Model Adaptation for Improving Phonotactic Language Recognition. Odyssey 2010: 41 - [c163]Boon Siew Han, Alvin Hong Yee Wong, Yeow Kee Tan, Haizhou Li:
Using design methodology to enhance interaction for a robotic receptionist. RO-MAN 2010: 797-802 - [c162]Haizhou Li:
BISTRA: Malay-English bidirectional speech translation. SLTU 2010: 1 - [c161]Sethserey Sam, Laurent Besacier, Eric Castelli, Bin Ma, Cheung-Chi Leung, Haizhou Li:
Autonomous acoustic model adaptation for multilingual meeting transcription involving high- and low-resourced languages. SLTU 2010: 116-121 - [c160]Rui Yan, Keng Peng Tee, Haizhou Li:
Nonlinear Control of a Robot Manipulator with Time-Varying Uncertainties. ICSR 2010: 202-211 - [c159]Minghui Dong, Ling Cen, Paul Y. Chan, Haizhou Li:
Considering readability in text-to-speech recording script design. SSW 2010: 312-316 - [e5]A. Kumaran, Haizhou Li:
Proceedings of the 2010 Named Entities Workshop, NEWS@ACL 2010, Uppsala, Sweden, July 16, 2010. Association for Computational Linguistics 2010, ISBN 978-1-932432-78-7 - [e4]Shuzhi Sam Ge, Haizhou Li, John-John Cabibihan, Yeow Kee Tan:
Social Robotics - Second International Conference on Social Robotics, ICSR 2010, Singapore, November 23-24, 2010. Proceedings. Lecture Notes in Computer Science 6414, Springer 2010, ISBN 978-3-642-17247-2
2000 – 2009
- 2009
- [j21]Minghui Dong, Ling Cen, Paul Y. Chan, Haizhou Li:
Readability Consideration in Speech Synthesis Recording Script Selection. Int. J. Asian Lang. Process. 19(2): 45-54 (2009) - [j20]Chien-Lin Huang, Haizhou Li, Bin Ma:
Speaker Characterization using Average Filtering and Two Space Fusions. Int. J. Asian Lang. Process. 19(3): 85-94 (2009) - [j19]Raymond W. M. Ng, Tan Lee, Cheung-Chi Leung, Bin Ma, Haizhou Li:
Analysis and Selection of Prosodic Features for Asian Language Recognition. Int. J. Asian Lang. Process. 19(4): 139-152 (2009) - [j18]Chang Huai You, Kong-Aik Lee, Haizhou Li:
An SVM Kernel With GMM-Supervector Based on the Bhattacharyya Distance for Speaker Recognition. IEEE Signal Process. Lett. 16(1): 49-52 (2009) - [j17]Chung-Hsien Wu, Haizhou Li:
Introduction to the Special Issue on Recent Advances in Asian Language Spoken Document Retrieval. ACM Trans. Asian Lang. Inf. Process. 8(1): 1:1-1:3 (2009) - [j16]Rong Tong, Bin Ma, Haizhou Li, Chng Eng Siong:
A Target-Oriented Phonotactic Front-End for Spoken Language Recognition. IEEE Trans. Speech Audio Process. 17(7): 1335-1347 (2009) - [j15]Tran Huy Dat, Haizhou Li:
Jump function Kolmogorov for audio classification in noise-mismatch conditions. IEEE Trans. Signal Process. 57(8): 2908-2918 (2009) - [c158]Lianhau Lee, AiTi Aw, Thuy Vu, Sharifah Aljunied Mahani, Min Zhang, Haizhou Li:
MARS: Multilingual Access and Retrieval System with Enhanced Query Translation and Document Retrieval. ACL/IJCNLP (Software Demonstrations) 2009: 21-24 - [c157]Vladimir Pervouchine, Haizhou Li, Bo Lin:
Transliteration Alignment. ACL/IJCNLP 2009: 136-144 - [c156]Hui Zhang, Min Zhang, Haizhou Li, AiTi Aw, Chew Lim Tan:
Forest-based Tree Sequence to String Translation Model. ACL/IJCNLP 2009: 172-180 - [c155]Deyi Xiong, Min Zhang, AiTi Aw, Haizhou Li:
A Syntax-Driven Bracketing Model for Phrase-Based Translation. ACL/IJCNLP 2009: 315-323 - [c154]Hendra Setiawan, Min-Yen Kan, Haizhou Li, Philip Resnik:
Topological Ordering of Function Words in Hierarchical Phrase-based Translation. ACL/IJCNLP 2009: 324-332 - [c153]Boxing Chen, Min Zhang, Haizhou Li, AiTi Aw:
A Comparative Study of Hypothesis Alignment and its Improvement for Machine Translation System Combination. ACL/IJCNLP 2009: 941-948 - [c152]Haizhou Li, A. Kumaran, Vladimir Pervouchine, Min Zhang:
Report of NEWS 2009 Machine Transliteration Shared Task. NEWS@IJCNLP 2009: 1-18 - [c151]Haizhou Li, A. Kumaran, Min Zhang, Vladimir Pervouchine:
Whitepaper of NEWS 2009 Machine Transliteration Shared Task. NEWS@IJCNLP 2009: 19-26 - [c150]Xiong Xiao, Jinyu Li, Engsiong Chng, Haizhou Li, Chin-Hui Lee:
A study on hidden Markov model's generalization capability for speech recognition. ASRU 2009: 255-260 - [c149]Sakriani Sakti, Noriyuki Kimura, Michael Paul, Chiori Hori, Eiichiro Sumita, Satoshi Nakamura, Jun Park, Chai Wutiwiwatchai, Bo Xu, Hammam Riza, Karunesh Arora, Chi Mai Luong, Haizhou Li:
The Asian network-based speech-to-speech translation system. ASRU 2009: 507-512 - [c148]Minghui Dong, Ling Cen, Paul Y. Chan, Dongyan Huang, Donglai Zhu, Bin Ma, Haizhou Li:
I2R Text-to-Speech System for Blizzard Challenge 2009. Blizzard Challenge 2009 - [c147]Rui Yan, Haizhou Li, Zhao Yang Dong, Huajin Tang:
Nonlinear control approaches for SI engine model with uncertainties. CDC 2009: 5440-5445 - [c146]Min Zhang, Haizhou Li:
Tree Kernel-based SVM with Structured Syntactic Knowledge for BTG-based Phrase Reordering. EMNLP 2009: 698-707 - [c145]Hui Zhang, Min Zhang, Haizhou Li, Chew Lim Tan:
Fast Translation Rule Matching for Syntax-based Statistical Machine Translation. EMNLP 2009: 1037-1045 - [c144]Hui Zhang, Min Zhang, Chew Lim Tan, Haizhou Li:
K-Best Combination of Syntactic Parsers. EMNLP 2009: 1552-1560 - [c143]Dilip Kumar Limbu, Yeow Kee Tan, Chern Yuen Wong, Ridong Jiang, Hengxin Wu, Liyuan Li, Kah Eng Hoe, Xinguo Yu, Li Dong, Haizhou Li:
Experiences with a Barista Robot, FusionBot. FIRA 2009: 140-151 - [c142]Yeow Kee Tan, Dilip Kumar Limbu, Ridong Jiang, Liyuan Li, Kah Eng Hoe, Xinguo Yu, Li Dong, Chern Yuen Wong, Haizhou Li:
An Interactive Robot Butler. HCI (2) 2009: 385-394 - [c141]Raymond W. M. Ng, Tan Lee, Cheung-Chi Leung, Bin Ma, Haizhou Li:
Analysis and Selection of Prosodic Features for Language Identification. IALP 2009: 123-128 - [c140]Minghui Dong, Ling Cen, Paul Y. Chan, Haizhou Li:
Refining Unit Boundaries for Mandarin Text-to-Speech Database. IALP 2009: 245-248 - [c139]Shuanhu Bai, Min Zhang, Haizhou Li:
Semi-supervised Learning of Domain-Specific Language Models from General Domain Data. IALP 2009: 273-279 - [c138]Cheung-Chi Leung, Rong Tong, Bin Ma, Haizhou Li:
A Lattice-Based Phonotactic Language Recognition System with CMLLR Adaptation and Its Implementation Issues. IALP 2009: 285-288 - [c137]Tran Huy Dat, Haizhou Li:
Sound event classification based on Feature Integration, Recursive Feature Elimination and Structured Classification. ICASSP 2009: 177-180 - [c136]Donglai Zhu, Bin Ma, Haizhou Li:
Joint map adaptation of feature transformation and Gaussian Mixture Model for speaker recognition. ICASSP 2009: 4045-4048 - [c135]Tin Lay Nwe, Hanwu Sun, Haizhou Li, Susanto Rahardja:
Speaker diarization in meeting audio. ICASSP 2009: 4073-4076 - [c134]Trung Hieu Nguyen, Haizhou Li, Chng Eng Siong:
Cluster criterion functions in spectral subspace and their application in speaker clustering. ICASSP 2009: 4085-4088 - [c133]Haizhou Li, Bin Ma, Kong-Aik Lee, Hanwu Sun, Donglai Zhu, Khe Chai Sim, Changhuai You, Rong Tong, Ismo Kärkkäinen, Chien-Lin Huang, Vladimir Pervouchine, Wu Guo, Yijie Li, Li-Rong Dai, Mohaddeseh Nosratighods, Tharmarajah Thiruvaran, Julien Epps, Eliathamby Ambikairajah, Chng Eng Siong, Tanja Schultz, Qin Jin:
The I4U system in NIST 2008 speaker recognition evaluation. ICASSP 2009: 4201-4204 - [c132]Chang Huai You, Kong-Aik Lee, Haizhou Li:
A GMM supervector Kernel with the Bhattacharyya distance for SVM based speaker recognition. ICASSP 2009: 4221-4224 - [c131]Yanhua Long, Bin Ma, Haizhou Li, Wu Guo, Chng Eng Siong, Li-Rong Dai:
Exploiting prosodic information for Speaker Recognition. ICASSP 2009: 4225-4228 - [c130]Mohaddeseh Nosratighods, Tharmarajah Thiruvaran, Julien Epps, Eliathamby Ambikairajah, Bin Ma, Haizhou Li:
Evaluation of a fused FM and cepstral-based speaker recognition system on the NIST 2008 SRE. ICASSP 2009: 4233-4236 - [c129]Hanwu Sun, Bin Ma, Haizhou Li:
Cross-validation of multiple language recognition systems using pseudo keys. ICASSP 2009: 4353-4356 - [c128]Jin-Shea Kuo, Haizhou Li, Chih-Lung Lin:
Harvesting Regional Transliteration Variants with Guided Search. ICCPOL 2009: 133-144 - [c127]Lei Wang, Chng Eng Siong, Haizhou Li:
Efficient sparse self-similarity matrix construction for repeating sequence detection. ICME 2009: 458-461 - [c126]Bin Ma, Donglai Zhu, Haizhou Li:
Acoustic segment modeling for speaker recognition. ICME 2009: 1668-1671 - [c125]Rong Tong, Bin Ma, Haizhou Li, Engsiong Chng, Kong-Aik Lee:
Target-aware language models for spoken language recognition. INTERSPEECH 2009: 200-203 - [c124]Hanwu Sun, Tin Lay Nwe, Bin Ma, Haizhou Li:
Speaker diarization for meeting room audio. INTERSPEECH 2009: 900-903 - [c123]Ling Cen, Minghui Dong, Paul Y. Chan, Haizhou Li:
Unit selection based speech synthesis for poor channel condition. INTERSPEECH 2009: 2075-2078 - [c122]Donglai Zhu, Bin Ma, Haizhou Li:
Large margin estimation of Gaussian mixture model parameters with extended baum-welch for spoken language recognition. INTERSPEECH 2009: 2179-2182 - [c121]Omid Dehzangi, Bin Ma, Engsiong Chng, Haizhou Li:
Discriminative feature transformation using output coding for speech recognition. INTERSPEECH 2009: 2979-2982 - [c120]Khe Chai Sim, Haizhou Li:
Stream-based context-sensitive phone mapping for cross-lingual speech recognition. INTERSPEECH 2009: 3019-3022 - [c119]Xiangyu Duan, Deyi Xiong, Hui Zhang, Min Zhang, Haizhou Li:
I2r's machine translation system for IWSLT 2009. IWSLT 2009: 50-54 - [c118]Xiangyu Duan, Deyi Xiong, Hui Zhang, Min Zhang, Haizhou Li:
I2R's machine translation system for IWSLT 2009. IWSLT (Evaluation Campaign) 2009 - [c117]D. W. K. Wong, J. Liu, J. H. Lim, Haizhou Li, Tien Yin Wong:
Automated detection of kinks from blood vessels for optic cup segmentation in retinal images. Medical Imaging: Computer-Aided Diagnosis 2009: 72601J - [c116]J. Liu, Damon Wing Kee Wong, J. H. Lim, Haizhou Li, Ngan Meng Tan, Tien Yin Wong:
ARGALI: an automatic cup-to-disc ratio measurement system for glaucoma detection and AnaLysIs framework. Medical Imaging: Computer-Aided Diagnosis 2009: 72603K - [c115]Deyi Xiong, Min Zhang, AiTi Aw, Haizhou Li:
Efficient Beam Thresholding for Statistical Machine Translation. MTSummit 2009 - [c114]Deyi Xiong, Min Zhang, AiTi Aw, Haizhou Li:
A Source Dependency Model for Statistical Machine Translation. MTSummit 2009 - [c113]Boon Siew Han, Wee Kiat Ho, Adrian Hwang Jian Tay, Tzer Liang Ng, Ai Ping Yow, I-Ming Chen, Song Huat Yeo, Haizhou Li:
A life-size robotic lion dance system with integrated motion control. RO-MAN 2009: 687-692 - [e3]Haizhou Li, A. Kumaran:
Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration, NEWS@IJCNLP 2009, Singapore, August 7, 2009. Association for Computational Linguistics 2009, ISBN 978-1-932432-57-2 - [e2]Min Zhang, Haizhou Li, Kim-Teng Lua, Minghui Dong:
2009 International Conference on Asian Language Processing, IALP 2009, Singapore, December 7-9, 2009. IEEE Computer Society 2009, ISBN 978-0-7695-3904-1
- 2008
- [j14]Oi Yee Kwong, Haizhou Li:
Guest Editors' Introduction. Int. J. Comput. Process. Orient. Lang. 21(2): 97-99 (2008) - [j13]Haizhou Li, Jin-Shea Kuo, Jian Su, Chih-Lung Lin:
Mining Live Transliterations Using Incremental Learning Algorithms. Int. J. Comput. Process. Orient. Lang. 21(2): 183-203 (2008) - [j12]Jin-Shea Kuo, Haizhou Li, Ying-Kuei Yang:
Active learning for constructing transliteration lexicons from the Web. J. Assoc. Inf. Sci. Technol. 59(1): 126-135 (2008) - [j11]Khe Chai Sim, Haizhou Li:
On Acoustic Diversification Front-End for Spoken Language Identification. IEEE Trans. Speech Audio Process. 16(5): 1029-1037 (2008) - [j10]Donglai Zhu, Haizhou Li, Bin Ma, Chin-Hui Lee:
Optimizing the Performance of Spoken Language Recognition With Discriminative Training. IEEE Trans. Speech Audio Process. 16(8): 1642-1653 (2008) - [j9]Xiong Xiao, Chng Eng Siong, Haizhou Li:
Normalization of the Speech Modulation Spectra for Robust Speech Recognition. IEEE Trans. Speech Audio Process. 16(8): 1662-1674 (2008) - [c112]Deyi Xiong, Min Zhang, AiTi Aw, Haizhou Li:
A Linguistically Annotated Reordering Model for BTG-based Statistical Machine Translation. ACL (2) 2008: 149-152 - [c111]Boxing Chen, Min Zhang, AiTi Aw, Haizhou Li:
Exploiting N-best Hypotheses for SMT Self-Enhancement. ACL (2) 2008: 157-160 - [c110]Min Zhang, Hongfei Jiang, AiTi Aw, Haizhou Li, Chew Lim Tan, Sheng Li:
A Tree Sequence Alignment-based Tree-to-Tree Translation Model. ACL 2008: 559-567 - [c109]Vladimir Pervouchine, Graham Leedham, Haishan Zhong, David Cho, Haizhou Li:
Comparative Study of Several Novel Acoustic Features for Speaker Recognition. BIOSIGNALS (1) 2008: 220-223 - [c108]Minghui Dong, Donglai Zhu, Bin Ma, Haizhou Li:
I2R's Submission to Blizzard Challenge 2008. Blizzard Challenge 2008 - [c107]Boxing Chen, Min Zhang, AiTi Aw, Haizhou Li:
Regenerating Hypotheses for Statistical Machine Translation. COLING 2008: 105-112 - [c106]Deyi Xiong, Min Zhang, AiTi Aw, Haizhou Li:
Linguistically Annotated BTG for Statistical Machine Translation. COLING 2008: 1009-1016 - [c105]Min Zhang, Hongfei Jiang, Haizhou Li, AiTi Aw, Sheng Li:
Grammar Comparison Study for Translational Equivalence Modeling and Statistical Machine Translation. COLING 2008: 1097-1104 - [c104]Swe Zin Kalayar Khine, Tin Lay Nwe, Haizhou Li:
Singing voice detection in pop songs using co-training algorithm. ICASSP 2008: 1629-1632 - [c103]Tin Lay Nwe, Haizhou Li:
On fusion of timbre-motivated features for singing voice detection and singer identification. ICASSP 2008: 2225-2228 - [c102]Tran Huy Dat, Haizhou Li:
Jump function Kolmogorov and its application for audio stream segmentation and classification. ICASSP 2008: 3353-3356 - [c101]Kong-Aik Lee, Changhuai You, Haizhou Li:
Spoken Language recognition using support vector machines with generative front-end. ICASSP 2008: 4153-4156 - [c100]Donglai Zhu, Haizhou Li, Bin Ma, Chin-Hui Lee:
Discriminative learning for optimizing detection performance in spoken language recognition. ICASSP 2008: 4161-4164 - [c99]Rong Tong, Bin Ma, Haizhou Li, Engsiong Chng:
Target-oriented phone tokenizers for spoken language recognition. ICASSP 2008: 4221-4224 - [c98]Khe Chai Sim, Haizhou Li:
Robust phone set mapping using decision tree clustering for cross-lingual phone recognition. ICASSP 2008: 4309-4312 - [c97]Chang Huai You, Susanto Rahardja, Haizhou Li:
Speech enhancement for telephony name speech recognition. ICME 2008: 973-976 - [c96]Chien-Lin Huang, Chung-Hsien Wu, Haizhou Li, Chia-Hsin Hsieh, Bin Ma:
Unsupervised pronunciation grammar growing using knowledge-based and data-driven approaches. ICME 2008: 1097-1100 - [c95]Omid Dehzangi, Bin Ma, Chng Eng Siong, Haizhou Li:
Fuzzy rule selection using Iterative Rule Learning for speech data classification. ICPR 2008: 1-4 - [c94]Jin-Shea Kuo, Haizhou Li, Chih-Lung Lin:
Mining Transliterations from Web Query Results: An Incremental Approach. IJCNLP 2008: 16-23 - [c93]Min Zhang, Chengjie Sun, Haizhou Li, AiTi Aw, Chew Lim Tan, Xiaolong Wang:
Name Origin Recognition Using Maximum Entropy Model and Diverse Features. IJCNLP 2008: 56-63 - [c92]Jin-Shea Kuo, Haizhou Li:
Multi-View Co-Training of Transliteration Model. IJCNLP 2008: 373-380 - [c91]Trung Hieu Nguyen, Engsiong Chng, Haizhou Li:
T-test distance and clustering criterion for speaker diarization. INTERSPEECH 2008: 36-39 - [c90]Rong Tong, Bin Ma, Haizhou Li, Engsiong Chng:
Target-oriented phone selection from universal phone set for spoken language recognition. INTERSPEECH 2008: 715-718 - [c89]Swe Zin Kalayar Khine, Tin Lay Nwe, Haizhou Li:
Speech/laughter classification in meeting audio. INTERSPEECH 2008: 793-796 - [c88]Donglai Zhu, Bin Ma, Haizhou Li:
Using MAP estimation of feature transformation for speaker recognition. INTERSPEECH 2008: 849-852 - [c87]Kong-Aik Lee, Changhuai You, Haizhou Li, Tomi Kinnunen, Donglai Zhu:
Characterizing speech utterances for speaker verification with sequence kernel SVM. INTERSPEECH 2008: 1397-1400 - [c86]Tran Huy Dat, Haizhou Li:
Speaker identification in noise mismatch conditions based on jump function Kolmogorov analysis in wavelet domain. INTERSPEECH 2008: 1469-1472 - [c85]Chien-Lin Huang, Bin Ma, Chung-Hsien Wu, Brian Mak, Haizhou Li:
Robust speaker verification using short-time frequency with long-time window and fusion of multi-resolutions. INTERSPEECH 2008: 1897-1900 - [c84]Tin Lay Nwe, Minghui Dong, Swe Zin Kalayar Khine, Haizhou Li:
Multi-speaker meeting audio segmentation. INTERSPEECH 2008: 2522-2525 - [c83]Namunu Chinthaka Maddage, Haizhou Li:
Rhythm based music segmentation and octave scale cepstral features for sung language recognition. INTERSPEECH 2008: 2526-2529 - [c82]Khe Chai Sim, Haizhou Li:
Context-sensitive probabilistic phone mapping model for cross-lingual speech recognition. INTERSPEECH 2008: 2715-2718 - [c81]Xiong Xiao, Chng Eng Siong, Haizhou Li:
Effect of Feature Smoothing for Robust Speech Recognition. ISCSLP 2008: 73-76 - [c80]Omid Dehzangi, Bin Ma, Chng Eng Siong, Haizhou Li:
Discriminative Output Coding Features for Speech Recognition. ISCSLP 2008: 89-92 - [c79]Minghui Dong, Haizhou Li:
Predicting Spectral and Prosodic Parameters for Unit Selection in Speech Synthesis. ISCSLP 2008: 133-136 - [c78]Hanwu Sun, Bin Ma, Haizhou Li:
Using Pseudo-Key for Language Recognition System Design. ISCSLP 2008: 173-176 - [c77]Chang Huai You, Kong-Aik Lee, Bin Ma, Haizhou Li:
Self-Organized Clustering for Feature Mapping in Language Recognition. ISCSLP 2008: 177-180 - [c76]Hanwu Sun, Bin Ma, Haizhou Li:
An Efficient Feature Selection Method for Speaker Recognition. ISCSLP 2008: 181-184 - [c75]Shuanhu Bai, Haizhou Li:
PLSA Based Topic Mixture Language Modeling Approach. ISCSLP 2008: 185-188 - [c74]Boxing Chen, Deyi Xiong, Min Zhang, AiTi Aw, Haizhou Li:
I2r multi-pass machine translation system for IWSLT 2008. IWSLT 2008: 46-51 - [c73]Maxim Khalilov, Marta R. Costa-jussà, Carlos A. Henríquez Q., José A. R. Fonollosa, Adolfo Hernandez, José B. Mariño, Rafael E. Banchs, Boxing Chen, Min Zhang, AiTi Aw, Haizhou Li:
The TALP&I2r SMT systems for IWSLT 2008. IWSLT 2008: 116-123 - [c72]Namunu Chinthaka Maddage, Mohan S. Kankanhalli, Haizhou Li:
Effectiveness of Signal Segmentation for Music Content Representation. MMM 2008: 477-486 - [c71]Tomi Kinnunen, Kong-Aik Lee, Haizhou Li:
Dimension reduction of the modulation spectrogram for speaker verification. Odyssey 2008: 30 - [c70]Haizhou Li, Bin Ma, Kong-Aik Lee, Khe Chai Sim, Hanwu Sun, Rong Tong, Donglai Zhu, Changhuai You:
NIST 2007 Language Recognition Evaluation: From the Perspective of IIR. PACLIC 2008: 46-57 - [c69]Tee Kiah Chia, Khe Chai Sim, Haizhou Li, Hwee Tou Ng:
A lattice-based approach to query-by-example spoken document retrieval. SIGIR 2008: 363-370 - 2007
- [j8]Minghui Dong, Haizhou Li, Tin Lay Nwe:
Evaluating Prosody of Mandarin Speech for Language Learning. J. Chin. Lang. Comput. 17(4): 219-226 (2007) - [j7]Xiong Xiao, Chng Eng Siong, Haizhou Li:
Temporal Structure Normalization of Speech Feature for Robust Speech Recognition. IEEE Signal Process. Lett. 14(7): 500-503 (2007) - [j6]Jin-Shea Kuo, Haizhou Li, Ying-Kuei Yang:
A phonetic similarity model for automatic extraction of transliteration pairs. ACM Trans. Asian Lang. Inf. Process. 6(2): 6 (2007) - [j5]Haizhou Li, Bin Ma, Chin-Hui Lee:
A Vector Space Modeling Approach to Spoken Language Identification. IEEE Trans. Speech Audio Process. 15(1): 271-284 (2007) - [j4]Tin Lay Nwe, Haizhou Li:
Exploring Vibrato-Motivated Acoustic Features for Singer Identification. IEEE Trans. Speech Audio Process. 15(2): 519-530 (2007) - [j3]Bin Ma, Haizhou Li, Rong Tong:
Spoken Language Recognition Using Ensemble Classifiers. IEEE Trans. Speech Audio Process. 15(7): 2053-2062 (2007) - [c68]Haizhou Li, Khe Chai Sim, Jin-Shea Kuo, Minghui Dong:
Semantic Transliteration of Personal Names. ACL 2007 - [c67]Hendra Setiawan, Min-Yen Kan, Haizhou Li:
Ordering Phrases with Function Words. ACL 2007 - [c66]Chin-Wei Eugene Koh, Hanwu Sun, Tin Lay Nwe, Trung Hieu Nguyen, Bin Ma, Chng Eng Siong, Haizhou Li, Susanto Rahardja:
Speaker Diarization Using Direction of Arrival Estimate and Acoustic Feature Information: The I2R-NTU Submission for the NIST RT 2007 Evaluation. CLEAR 2007: 484-496 - [c65]Swe Zin Kalayar Khine, Tin Lay Nwe, Haizhou Li:
Exploring Perceptual Based Timbre Feature for Singer Identification. CMMR 2007: 159-171 - [c64]Tee Kiah Chia, Haizhou Li, Hwee Tou Ng:
A Statistical Language Modeling Approach to Lattice-Based Spoken Document Retrieval. EMNLP-CoNLL 2007: 810-818 - [c63]Donglai Zhu, Bin Ma, Haizhou Li, Qiang Huo:
A Generalized Feature Transformation Approach for Channel Robust Speaker Verification. ICASSP (4) 2007: 61-64 - [c62]Rong Tong, Haizhou Li, Bin Ma, Engsiong Chng, Siu-Yeung Cho:
Spoken Language Recognition with Relevance Feedback. ICASSP (4) 2007: 861-864 - [c61]Bin Ma, Rong Tong, Haizhou Li:
Discriminative Vector for Spoken Language Recognition. ICASSP (4) 2007: 1001-1004 - [c60]Xiong Xiao, Engsiong Chng, Haizhou Li:
Normalizing the Speech Modulation Spectrum for Robust Speech Recognition. ICASSP (4) 2007: 1021-1024 - [c59]Swe Zin Kalayar Khine, Tin Lay Nwe, Haizhou Li:
On Timbre Based perceptual Feature for Singer identification. ICMC 2007 - [c58]Lei Wang, Haizhou Li, Engsiong Chng:
A Vector-Based Approach to Broadcast Audio Database Indexing and Retrieval. ICME 2007: 512-515 - [c57]Khe Chai Sim, Haizhou Li:
Fusion of contrastive acoustic models for parallel phonotactic spoken language identification. INTERSPEECH 2007: 170-173 - [c56]Kong-Aik Lee, Changhuai You, Haizhou Li, Tomi Kinnunen:
A GMM-based probabilistic sequence kernel for speaker verification. INTERSPEECH 2007: 294-297 - [c55]Xiong Xiao, Engsiong Chng, Haizhou Li:
Evaluating the temporal structure normalisation technique on the Aurora-4 task. INTERSPEECH 2007: 1070-1073 - [c54]Chin-Wei Eugene Koh, Hanwu Sun, Tin Lay Nwe, Trung Hieu Nguyen, Bin Ma, Engsiong Chng, Haizhou Li, Susanto Rahardja:
Using direction of arrival estimate and acoustic feature information in speaker diarization. INTERSPEECH 2007: 2149-2152 - [c53]Tin Lay Nwe, Haizhou Li:
Singing voice detection using perceptually-motivated features. ACM Multimedia 2007: 309-312 - 2006
- [j2]Bin Ma, Haizhou Li:
A Comparative Study of Four Language Identification Systems. Int. J. Comput. Linguistics Chin. Lang. Process. 11(2) (2006) - [j1]Minghui Dong, Kim-Teng Lua, Haizhou Li:
A Unit Selection-based Speech Synthesis Approach for Mandarin Chinese. J. Chin. Lang. Comput. 16(3): 135-144 (2006) - [c52]Jin-Shea Kuo, Haizhou Li, Ying-Kuei Yang:
Learning Transliteration Lexicons from the Web. ACL 2006 - [c51]Rong Tong, Bin Ma, Donglai Zhu, Haizhou Li, Engsiong Chng:
Integrating Acoustic, Prosodic and Phonotactic Features for Spoken Language Identification. ICASSP (1) 2006: 205-208 - [c50]Haizhou Li, Tin Lay Nwe:
Vibrato-Motivated Acoustic Features for Singer Identification. ICASSP (5) 2006: 533-536 - [c49]Shuanhu Bai, Haizhou Li:
Bayesian Learning of N-Gram Statistical Language Modeling. ICASSP (1) 2006: 1045-1048 - [c48]Namunu Chinthaka Maddage, Mohan S. Kankanhalli, Haizhou Li:
A Hierarchical Approach for Music Chord Modeling Based on the Analysis of Tonal Characteristics. ICME 2006: 945-948 - [c47]Minghui Dong, Haizhou Li, Tin Lay Nwe:
Evaluating prosody of Mandarin speech for language learning. INTERSPEECH 2006 - [c46]Haizhou Li, Bin Ma, Rong Tong:
Vector-based spoken language recognition using output coding. INTERSPEECH 2006 - [c45]Bin Ma, Donglai Zhu, Rong Tong, Haizhou Li:
Speaker cluster based GMM tokenization for speaker recognition. INTERSPEECH 2006 - [c44]Tin Lay Nwe, Haizhou Li, Minghui Dong:
Analysis and detection of speech under sleep deprivation. INTERSPEECH 2006 - [c43]Xiong Xiao, Haizhou Li, Engsiong Chng:
Vector Autoregressive Model for Missing Feature Reconstruction. ISCSLP (Selected Papers) 2006: 315-324 - [c42]Manuel Giuliani, Tin Lay Nwe, Haizhou Li:
Meeting Segmentation Using Two-Layer Cascaded Subband Filters. ISCSLP (Selected Papers) 2006: 672-682 - [c41]Tomi Kinnunen, Chin-Wei Eugene Koh, Lei Wang, Haizhou Li, Eng Siong Chng:
Temporal Discrete Cosine Transform: Towards Longer Term Temporal Features for Speaker Verification. ISCSLP 2006 - [c40]Kong-Aik Lee, Hanwu Sun, Rong Tong, Bin Ma, Minghui Dong, Changhuai You, Donglai Zhu, Chin-Wei Eugene Koh, Lei Wang, Tomi Kinnunen, Chng Eng Siong, Haizhou Li:
The IIR Submission to CSLP 2006 Speaker Recognition Evaluation. ISCSLP (Selected Papers) 2006: 494-505 - [c39]Rong Tong, Bin Ma, Kong-Aik Lee, Changhuai You, Donglai Zhu, Tomi Kinnunen, Hanwu Sun, Minghui Dong, Chng Eng Siong, Haizhou Li:
Fusion of Acoustic and Tokenization Features for Speaker Recognition. ISCSLP (Selected Papers) 2006: 566-577 - [c38]Donglai Zhu, Rong Tong, Bin Ma, Haizhou Li:
Minimum Classification Error Based Optimal Linear Combination for Spoken Language Identification. ISCSLP 2006 - [c37]Denny Iskandar, Ye Wang, Min-Yen Kan, Haizhou Li:
Syllabic level automatic synchronization of music signals and text lyrics. ACM Multimedia 2006: 659-662 - [c36]Jinyu Li, Sibel Yaman, Chin-Hui Lee, Bin Ma, Rong Tong, Donglai Zhu, Haizhou Li:
Language Recognition Based on Score Distribution Feature Vectors and Discriminative Classifier Fusion. Odyssey 2006: 1-5 - [c35]Namunu Chinthaka Maddage, Haizhou Li, Mohan S. Kankanhalli:
Music structure based vector space retrieval. SIGIR 2006: 67-74 - [e1]Qiang Huo, Bin Ma, Chng Eng Siong, Haizhou Li:
Chinese Spoken Language Processing, 5th International Symposium, ISCSLP 2006, Singapore, December 13-16, 2006, Selected Papers. Lecture Notes in Computer Science 4274, Springer 2006, ISBN 3-540-49665-3 [contents] - 2005
- [c34]Haizhou Li, Bin Ma:
A Phonotactic Language Model for Spoken Language Identification. ACL 2005: 515-522 - [c33]Boon Pang Lim, Haizhou Li, Bin Ma:
Using Local & Global Phonotactic Features in Chinese Dialect Identification. ICASSP (1) 2005: 577-580 - [c32]Tin Lay Nwe, Haizhou Li:
Broadcast news segmentation by audio type analysis. ICASSP (2) 2005: 1065-1068 - [c31]Hendra Setiawan, Haizhou Li, Min Zhang, Beng Chin Ooi:
Phrase-Based Statistical Machine Translation: A Level of Detail Approach. IJCNLP 2005: 576-587 - [c30]Min Zhang, Haizhou Li, Jian Su, Hendra Setiawan:
A Phrase-Based Context-Dependent Joint Probability Model for Named Entity Translation. IJCNLP 2005: 600-611 - [c29]Tin Lay Nwe, Haizhou Li:
Identifying singers of popular songs. INTERSPEECH 2005: 129-132 - [c28]Bin Ma, Haizhou Li, Chin-Hui Lee:
An acoustic segment modeling approach to automatic language identification. INTERSPEECH 2005: 2829-2832 - [c27]Sheng Gao, Bin Ma, Haizhou Li, Chin-Hui Lee:
A text categorization approach to automatic language identification. INTERSPEECH 2005: 2837-2840 - [c26]Minghui Dong, Kim-Teng Lua, Haizhou Li:
A probabilistic approach to prosodic word prediction for Mandarin Chinese TTS. INTERSPEECH 2005: 3245-3248 - [c25]C. Santhosh Kumar, V. P. Mohandas, Haizhou Li:
Multilingual speech recognition: a unified approach. INTERSPEECH 2005: 3357-3360 - [c24]Kathiresan Manickam, Haizhou Li:
Complexity analysis of normal and deaf infant cry acoustic waves. MAVEBA 2005: 105-108 - [c23]Hendra Setiawan, Haizhou Li, Min Zhang:
Learning Phrase Translation using Level of Detail Approach. MTSummit 2005: 243-250 - [c22]Bin Ma, Haizhou Li:
A phonotactic-semantic paradigm for automatic spoken document classification. SIGIR 2005: 369-376 - 2004
- [c21]Haizhou Li, Min Zhang, Jian Su:
A Joint Source-Channel Model for Machine Transliteration. ACL 2004: 159-166 - [c20]Min Zhang, Haizhou Li, Jian Su:
Direct Orthographical Mapping for Machine Transliteration. COLING 2004 - [c19]Jun Xu, Guohong Fu, Haizhou Li:
Grapheme-to-phoneme conversion for Chinese text-to-speech. INTERSPEECH 2004: 1885-1888 - [c18]Boon Pang Lim, Haizhou Li, Yu Chen:
Language identification through large vocabulary continuous speech recognition. ISCSLP 2004: 49-52 - 2003
- [c17]Jun Xu, Thomas Choy, Minghui Dong, Cuntai Guan, Haizhou Li:
On unit analysis for Cantonese corpus-based TTS. INTERSPEECH 2003: 269-272 - 2002
- [c16]Bin Ma, Cuntai Guan, Haizhou Li, Chin-Hui Lee:
Multilingual speech recognition with language identification. INTERSPEECH 2002: 505-508 - [c15]Haizhou Li:
Concatenative Chinese speech synthesis and quality evaluation. ISCSLP 2002 - [c14]Bin Ma, Cuntai Guan, Haizhou Li:
Likelihood probability mismatch analysis and normalization in multilingual speech applications. ISCSLP 2002 - [c13]Min Zhang, Cuntai Guan, Haizhou Li:
Equivalent node-based speech grammar optimization. ISCSLP 2002 - 2000
- [c12]Min Zhang, Engsiong Chng, Haizhou Li:
Semi-class-based N-gram Language Modeling for Chinese Dictation. ISCSLP 2000
1990 – 1999
- 1998
- [c11]Shuanhu Bai, Haizhou Li, Zhiwei Lin, Baosheng Yuan:
Building class-based language models with contextual statistics. ICASSP 1998: 173-176 - [c10]Haizhou Li, Zhiwei Lin, Shuanhu Bai:
Chinese Sentence Tokenization Using Viterbi Decoder. ISCSLP 1998 - [c9]Cuntai Guan, Haizhou Li, Baosheng Yuan, Zhiwei Lin:
Data-driven Acoustic Modeling Approach for Chinese LVCSR. ISCSLP 1998 - [c8]Baosheng Yuan, Cuntai Guan, Gareth Loudon, Haizhou Li:
Optimization of Parameter Tying for Chinese Acoustic Modeling. ISCSLP 1998 - [c7]Haizhou Li, Baosheng Yuan:
Chinese Word Segmentation. PACLIC 1998: 212-217 - 1996
- [c6]Jian Su, Haizhou Li, Jean-Paul Haton, Kai-Tat Ng:
Speaker time-drifting adaptation using trajectory mixture hidden Markov models. ICASSP 1996: 709-712 - [c5]Haizhou Li, Yifan Gong, Jean-Paul Haton:
Probabilistic mapping networks for speaker recognition. ICASSP 1996: 3374-3377 - 1995
- [c4]Kai Tat Ng, Haizhou Li, Jean Paul Haton:
Some nonparametric distance measures in speaker verification. EUROSPEECH 1995: 317-320 - [c3]Haizhou Li, Jean Paul Haton, Yifan Gong:
On MMI learning of Gaussian mixture for speaker models. EUROSPEECH 1995: 363-366 - [c2]Haizhou Li, Jean Paul Haton, Jian Su, Yifan Gong:
Speaker recognition with temporal transition models. EUROSPEECH 1995: 617-620 - 1993
- [c1]Ping Hung Karl R. Leung, Haizhou Li:
Structured Specifications, Semantics, and System Semantics. SEKE 1993: 324-326
Coauthor Index
aka: Rafael Enrique Banchs
aka: Paul Yaozhu Chan
aka: Luis F. D'Haro
aka: Dongyan Huang
aka: Tomi H. Kinnunen
aka: Kong Aik Lee
aka: Changhuai You
last updated on 2024-11-08 21:29 CET by the dblp team
all metadata released as open data under CC0 1.0 license