Odyssey 2022: Beijing, China
- Thomas Fang Zheng:
Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June - 1 July 2022, Beijing, China. ISCA 2022
Speaker Recognition 1
- Nikita Kuzmin, Igor Fedorov, Alexey Sholokhov:
Magnitude-Aware Probabilistic Speaker Embeddings. 1-8
- Anna Silnova, Themos Stafylakis, Ladislav Mosner, Oldrich Plchot, Johan Rohdin, Pavel Matejka, Lukás Burget, Ondrej Glembek, Niko Brummer:
Analyzing Speaker Verification Embedding Extractors and Back-Ends Under Language and Channel Mismatch. 9-16
- Junyi Peng, Chunlei Zhang, Jan Honza Cernocký, Dong Yu:
Progressive Contrastive Learning for Self-Supervised Text-Independent Speaker Verification. 17-24
- Sandro Cumani, Salvatore Sarni:
Impostor Score Statistics as Quality Measures for the Calibration of Speaker Verification Systems. 25-32
- Jahangir Alam, Woo Hyun Kang, Abderrahim Fathan:
Hybrid Neural Network-Based Deep Embedding Extractors for Text-Independent Speaker Verification. 33-40
- Mohammad MohammadAmini, Driss Matrouf, Jean-François Bonastre, Sandipana Dowerah, Romain Serizel, Denis Jouvet:
Learning Noise Robust ResNet-Based Speaker Embedding for Speaker Recognition. 41-46
Spoofing and Countermeasure 1
- Anand Therattil, Priyanka Gupta, Piyushkumar K. Chodingala, Hemant A. Patil:
Teager Energy Based-Detection of One-point and Two-point Replay Attacks: Towards Cross-Database Generalization. 47-54
- Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan:
Investigation on Mixup Strategies for End-to-End Voice Spoof Detection System. 55-61
- Diego Castán, Md. Hafizur Rahman, Sarah Bakst, Chris Cobo-Kroenke, Mitchell McLaren, Martin Graciarena, Aaron Lawson:
Speaker-Targeted Synthetic Speech Detection. 62-69
- Wanying Ge, Massimiliano Todisco, Nicholas W. D. Evans:
Explainable Deepfake and Spoofing Detection: An Attack Analysis Using SHapley Additive exPlanations. 70-76
- You Zhang, Ge Zhu, Zhiyao Duan:
A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification. 77-84
- Xuechen Liu, Md. Sahidullah, Tomi Kinnunen:
Spoofing-Aware Speaker Verification with Unsupervised Domain Adaptation. 85-91
Spoofing and Countermeasure 2
- Haibin Wu, Jiawen Kang, Lingwei Meng, Yang Zhang, Xixin Wu, Zhiyong Wu, Hung-yi Lee, Helen Meng:
Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion. 92-99
- Xin Wang, Junichi Yamagishi:
Investigating Self-Supervised Front Ends for Speech Spoofing Countermeasures. 100-106
- Longting Xu, Mianxin Tian, Xing Guo, Zhiyong Shan, Jie Jia, Yiyuan Peng, Jichen Yang, Rohan Kumar Das:
A Novel Feature Based on Graph Signal Processing for Detection of Physical Access Attacks. 107-111
- Hemlata Tak, Massimiliano Todisco, Xin Wang, Jee-weon Jung, Junichi Yamagishi, Nicholas W. D. Evans:
Automatic Speaker Verification Spoofing and Deepfake Detection Using Wav2vec 2.0 and Data Augmentation. 112-119
- Wei Liu, Meng Sun, Xiongwei Zhang, Hugo Van hamme, Thomas Fang Zheng:
A Multi-Resolution Front-End for End-to-End Speech Anti-Spoofing. 120-125
- Jingze Lu, Yuxiang Zhang, Wenchao Wang, Pengyuan Zhang:
Robust Cross-SubBand Countermeasure Against Replay Attacks. 126-132
Speaker Diarization
- Natsuo Yamashita, Shota Horiguchi, Takeshi Homma:
Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization. 133-140
- Joonas Kalda, Tanel Alumäe:
Collar-Aware Training for Streaming Speaker Change Detection in Broadcast Speech. 141-147
- Chenguang Hu, Qingran Zhan, Miao Liu, Xiang Xie:
BIT Submission for the Conversational Speaker Diarization Challenge. 148-155
- Yijun Gong, Xiao-Lei Zhang:
DP-Means: An Efficient Bayesian Nonparametric Model for Speaker Diarization. 156-161
- Yucong Zhang, Qingjian Lin, Weiqing Wang, Lin Yang, Xuyang Wang, Junjie Wang, Ming Li:
Low-Latency Online Speaker Diarization with Graph-Based Label Generation. 162-169
- Zuoer Chen, Liang He:
A Quick and Effective Speaker Diarization System. 170-177
Speaker Recognition 2
- Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan:
Domain Generalized Speaker Embedding Learning via Mutual Information Minimization. 178-184
- Alexey Sholokhov, Xuechen Liu, Md. Sahidullah, Tomi Kinnunen:
Baselines and Protocols for Household Speaker Recognition. 185-192
- Yosef Solewicz, Noa Cohen, Johan Rohdin, Srikanth R. Madikeri, Jan Honza Cernocký:
Speaker Recognition on Mono-Channel Telephony Recordings. 193-199
- Jason Pelecanos, Quan Wang, Yiling Huang, Ignacio López-Moreno:
Parameter-Free Attentive Scoring for Speaker Verification. 200-206
- Sarah Bakst, Chris Cobo-Kroenke, Aaron Lawson, Mitchell McLaren, Allen R. Stauffer:
Time-Varying Score Reliability Prediction in Speaker Identification. 207-212
- Jesús Villalba, Bengt J. Borgstrom, Saurabh Kataria, Magdalena Rybicka, Carlos D. Castillo, Jaejin Cho, L. Paola García-Perera, Pedro A. Torres-Carrasquillo, Najim Dehak:
Advances in Cross-Lingual and Cross-Source Audio-Visual Speaker Recognition: The JHU-MIT System for NIST SRE21. 213-220
Speaker and Language Recognition
- Yanxiong Li, Wucheng Wang, Hao Chen, Wenchang Cao, Wei Li, Qianhua He:
Few-Shot Speaker Identification Using Depthwise Separable Convolutional Network with Channel Attention. 221-227
- Fuchuan Tong, Siqi Zheng, Haodong Zhou, Xingjia Xie, Qingyang Hong, Lin Li:
Deep Representation Decomposition for Rate-Invariant Speaker Verification. 228-232
- Madina Abdrakhmanova, Saniya Abushakimova, Yerbolat Khassanov, Huseyin Atakan Varol:
A Study of Multimodal Person Verification Using Audio-Visual-Thermal Data. 233-239
- Tanel Alumäe, Kunnar Kukk:
Pretraining Approaches for Spoken Language Recognition: TalTech Submission to the OLR 2021 Challenge. 240-247
- Hexin Liu, Leibny Paola García-Perera, Andy W. H. Khong, Justin Dauwels, Suzy J. Styles, Sanjeev Khudanpur:
Enhancing Language Identification Using Dual-Mode Model with Knowledge Distillation. 248-254
- Quan Wang, Yang Yu, Jason Pelecanos, Yiling Huang, Ignacio López-Moreno:
Attentive Temporal Pooling for Conformer-Based Streaming Language Identification in Long-Form Speech. 255-262
Voice Synthesis, Anonymization and Separation
- David Guennec, Hassan Hajipoor, Gwénolé Lecorvé, Pascal Lintanf, Damien Lolive, Antoine Perquin, Gaëlle Vidal:
BreizhCorpus: A Large Breton Language Speech Corpus and Its Use for Text-to-Speech Synthesis. 263-270
- Haoran Sun, Chen Chen, Lantian Li, Dong Wang:
Cycleflow: Purify Information Factors by Cycle Loss. 271-278
- Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, Natalia A. Tomashenko:
Language-Independent Speaker Anonymization Approach Using Self-Supervised Pre-Trained Models. 279-286
- Hiroto Kai, Shinnosuke Takamichi, Sayaka Shiota, Hitoshi Kiya:
Robustness of Signal Processing-Based Pseudonymization Method Against Decryption Attack. 287-293
- Rajeev Rikhye, Quan Wang, Qiao Liang, Yanzhang He, Ian McGraw:
Closing the Gap Between Single-User and Multi-User VoiceFilter-Lite. 294-300
- Jincheng He, Yuanyuan Bao, Na Xu, Hongfeng Li, Shicong Li, Linzhang Wang, Fei Xiang, Ming Li:
Single-Channel Target Speaker Separation Using Joint Training with Target Speaker's Pitch Information. 301-305
Evaluation and Benchmarking
- Lantian Li, Di Wang, Wenqiang Du, Dong Wang:
C-P Map: A Novel Evaluation Toolkit for Speaker Verification. 306-313
- Seyed Omid Sadjadi, Craig S. Greenberg, Elliot Singer, Lisa P. Mason, Douglas A. Reynolds:
The NIST CTS Speaker Recognition Challenge. 314-321
- Seyed Omid Sadjadi, Craig S. Greenberg, Elliot Singer, Lisa P. Mason, Douglas A. Reynolds:
The 2021 NIST Speaker Recognition Evaluation. 322-329
- Hye-jin Shim, Hemlata Tak, Xuechen Liu, Hee-Soo Heo, Jee-weon Jung, Joon Son Chung, Soo-Whan Chung, Ha-Jin Yu, Bong-Jin Lee, Massimiliano Todisco, Héctor Delgado, Kong Aik Lee, Md. Sahidullah, Tomi Kinnunen, Nicholas W. D. Evans:
Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion. 330-337
- Jesús Villalba, Bengt J. Borgstrom, Saurabh Kataria, Jaejin Cho, Pedro A. Torres-Carrasquillo, Najim Dehak:
Advances in Speaker Recognition for Multilingual Conversational Telephone Speech: The JHU-MIT System for NIST SRE20 CTS Challenge. 338-345
- Jahangir Alam, Radek Benes, Marian Beszédes, Lukás Burget, Mohamed Dahmane, Abderrahim Fathan, Hamed Ghodrati, Ondrej Glembek, Woo Hyun Kang, Pavel Matejka, Ladislav Mosner, Oldrich Plchot, Johan Rohdin, Anna Silnova, Themos Stafylakis:
Development of ABC Systems for the 2021 Edition of NIST Speaker Recognition Evaluation. 346-353
- Galina Lavrentyeva, Sergey Novoselov, Vladimir Volokhov, Anastasia Avdeeva, Aleksei Gusev, Alisa Vinogradova, Igor Korsunov, Alexander Kozlov, Timur Pekhovsky, Andrey Shulipa, Evgeny Smirnov, Vasiliy Galyuk:
STC Speaker Recognition System for the NIST SRE 2021. 354-361
Special Session: CNSRC 2022
- YingWei Tan, XueFeng Ding:
The Volkswagen-Mobvoi System for CN-Celeb Speaker Recognition Challenge 2022. 362-367
- Jialin Zhang, Qinghua Ren, You-cai Qin, Zi-Kai Wan, Qirong Mao:
Cross-Scene Speaker Verification Based on Dynamic Convolution for the CNSRC 2022 Challenge. 368-375
- Woo Hyun Kang, Jahangir Alam:
Investigation on Deep Speaker Embedding Extraction Methods for Multi-Genre Speaker Verification. 376-383
- Xinmei Su, Qingran Zhan, Chenguang Hu, Xiang Xie:
Combination of Multiple Embeddings for Speaker Retrieval. 384-389
Speech Application
- Heinrich Dinkel, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Yujun Wang:
An Empirical Study of Weakly Supervised Audio Tagging Embeddings for General Audio Representations. 390-395
- Jintao Kang, Aijun Li, Jingyang Li:
Formant Dynamics of Chinese Compound Vowels with Implications for Forensic Speaker Identification. 396-401
- Haoxu Wang, Yan Jia, Zeqing Zhao, Xuyang Wang, Junjie Wang, Ming Li:
Generating TTS Based Adversarial Samples for Training Wake-Up Word Detection Systems Against Confusing Words. 402-406
- Sarala Padi, Seyed Omid Sadjadi, Dinesh Manocha, Ram D. Sriram:
Multimodal Emotion Recognition Using Transfer Learning from Speaker Recognition and BERT-Based Models. 407-414
- Zhuo Gong, Daisuke Saito, Longfei Yang, Takahiro Shinozaki, Sheng Li, Hisashi Kawai, Nobuaki Minematsu:
Self-Adaptive Multilingual ASR Rescoring with Language Identification and Unified Language Model. 415-420
- Sandip Ghimire, Tomi Kinnunen, Rosa González Hautamäki:
Gamified Speaker Comparison by Listening. 421-427