IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 27
Volume 27, Number 1, January 2019
- Dilek Hakkani-Tür:
Inaugural Editorial Innovations in an Era of Ubiquitous Audio, Speech, and Language Processing. 5-6
- Feng Bao, Waleed H. Abdulla:
A New Ratio Mask Representation for CASA-Based Speech Enhancement. 7-19
- Paul Magron, Tuomas Virtanen:
Complex ISNMF: A Phase-Aware Model for Monaural Audio Source Separation. 20-31
- Thanh Thi Hien Duong, Ngoc Q. K. Duong, Cong-Phuong Nguyen, Quoc-Cuong Nguyen:
Gaussian Modeling-Based Multichannel Audio Source Separation Exploiting Generic Source Spectral Model. 32-43
- Guoqiang Zhang, Jiancheng Tao, Xiaojun Qiu, Ian S. Burnett:
Decentralized Two-Channel Active Noise Control for Single Frequency by Shaping Matrix Eigenvalues. 44-52
- Yan Zhao, Zhong-Qiu Wang, DeLiang Wang:
Two-Stage Deep Learning for Noisy-Reverberant Speech Enhancement. 53-62
- Naijun Zheng, Xiao-Lei Zhang:
Phase-Aware Speech Enhancement Based on Deep Neural Networks. 63-76
- Takafumi Moriya, Tomohiro Tanaka, Takahiro Shinozaki, Shinji Watanabe, Kevin Duh:
Evolution-Strategy-Based Automation of System Development for High-Performance Speech Recognition. 77-88
- Herman Kamper, Gregory Shakhnarovich, Karen Livescu:
Semantic Speech Retrieval With a Visually Grounded Model of Untranscribed Speech. 89-98
- Mathew Shaji Kavalekalam, Jesper Kjær Nielsen, Jesper Bünsow Boldt, Mads Græsbøll Christensen:
Model-Based Speech Enhancement for Intelligibility Improvement in Binaural Hearing Aids. 99-113
- M. V. Achuth Rao, Prasanta Kumar Ghosh:
Glottal Inverse Filtering Using Probabilistic Weighted Linear Prediction. 114-124
- Yang Sun, Wenwu Wang, Jonathon A. Chambers, Syed Mohsen Naqvi:
Two-Stage Monaural Source Separation in Reverberant Room Environments Using Deep Neural Networks. 125-139
- Luciana Ferrer, Mahesh Kumar Nandwana, Mitchell McLaren, Diego Castán, Aaron Lawson:
Toward Fail-Safe Speaker Recognition: Trial-Based Calibration With a Reject Option. 140-153
- Jamal Amini, Richard C. Hendriks, Richard Heusdens, Meng Guo, Jesper Jensen:
Asymmetric Coding for Rate-Constrained Noise Reduction in Binaural Hearing Aids. 154-167
- Jianfei Yu, Jing Jiang, Rui Xia:
Global Inference for Aspect and Opinion Terms Co-Extraction Based on Multi-Task Neural Networks. 168-177
- Zhong-Qiu Wang, Xueliang Zhang, DeLiang Wang:
Robust Speaker Localization Guided by Deep Learning-Based Time-Frequency Masking. 178-188
- Ke Tan, Jitong Chen, DeLiang Wang:
Gated Residual Networks With Dilated Convolutions for Monaural Speech Enhancement. 189-198
- Hoang Gia Ngo, Minh Nguyen, Nancy F. Chen:
Phonology-Augmented Statistical Framework for Machine Transliteration Using Limited Linguistic Resources. 199-211
- Yuma Koizumi, Shoichiro Saito, Hisashi Uematsu, Yuta Kawachi, Noboru Harada:
Unsupervised Detection of Anomalous Sound Based on Deep Learning and the Neyman-Pearson Lemma. 212-224
- Yaron Laufer, Sharon Gannot:
A Bayesian Hierarchical Model for Speech Enhancement With Time-Varying Audio Channel. 225-239
Volume 27, Number 2, February 2019
- Toru Nakashika, Shinji Takaki, Junichi Yamagishi:
Complex-Valued Restricted Boltzmann Machine for Speaker-Dependent Speech Parameterization From Complex Spectra. 244-254
- Feifei Xiong, Stefan Goetze, Birger Kollmeier, Bernd T. Meyer:
Joint Estimation of Reverberation Time and Early-To-Late Reverberation Ratio From Single-Channel Speech Signals. 255-267
- Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuël A. P. Habets:
CountNet: Estimating the Number of Concurrent Speakers Using Supervised Learning. 268-282
- Morten Kolbaek, Zheng-Hua Tan, Jesper Jensen:
On the Relationship Between Short-Time Objective Intelligibility and Short-Time Spectral-Amplitude Mean-Square Error for Speech Enhancement. 283-295
- Martin Weiss Hansen, Jesper Rindom Jensen, Mads Græsbøll Christensen:
Estimation of Fundamental Frequencies in Stereophonic Music Mixtures. 296-310
- Junwei Bao, Duyu Tang, Nan Duan, Zhao Yan, Ming Zhou, Tiejun Zhao:
Text Generation From Tables. 311-320
- Andreas I. Koutrouvelis, Richard C. Hendriks, Richard Heusdens, Jesper Jensen:
A Convex Approximation of the Relaxed Binaural Beamforming Optimization Problem. 321-331
- Tetsuya Hashimoto, Daisuke Saito, Nobuaki Minematsu:
Many-to-Many and Completely Parallel-Data-Free Voice Conversion Based on Eigenspace DNN. 332-341
- Fatemeh Pishdadian, Bryan Pardo:
Multi-Resolution Common Fate Transform. 342-354
- Yiming Wu, Wei Li:
Automatic Audio Chord Recognition With MIDI-Trained Deep Feature and BLSTM-CRF Sequence Decoding Model. 355-366
- Keisuke Imoto, Nobutaka Ono:
Acoustic Topic Model for Scene Analysis With Intermittently Missing Observations. 367-382
- Ke Xiao, Supin Wang, Mingxi Wan, Liang Wu:
Reconstruction of Mandarin Electrolaryngeal Fricatives With Hybrid Noise Source. 383-391
- Lakshmi Krishnan, Terence Betlehem, Paul D. Teal:
Fast Algorithms for Acoustic Impulse Response Shaping. 392-403
- Vahid Zakeri, Antony J. Hodgson:
Automatic Identification of Hard and Soft Bone Tissues by Analyzing Drilling Sounds. 404-414
- Stefan Bilbao, Brian Hamilton:
Directional Sources in Wave-Based Acoustic Simulation. 415-428
- Yichi Zhang, Bryan Pardo, Zhiyao Duan:
Siamese Style Convolutional Neural Networks for Sound Search by Vocal Imitation. 429-441
- Fangchen Feng, Matthieu Kowalski:
Underdetermined Reverberant Blind Source Separation: Sparse Approaches for Multiplicative and Convolutive Narrowband Approximation. 442-456
- Zhong-Qiu Wang, DeLiang Wang:
Combining Spectral and Spatial Features for Deep Learning Based Blind Speaker Separation. 457-468
Volume 27, Number 3, March 2019
- Mohsen Zareian Jahromi, Adel Zahedi, Jesper Jensen, Jan Østergaard:
Information Loss in the Human Auditory System. 472-481
- Yaakov Buchris, Alon Amar, Jacob Benesty, Israel Cohen:
Incoherent Synthesis of Sparse Arrays for Frequency-Invariant Beamforming. 482-495
- Yogachandran Rahulamathavan, Kunaraj R. Sutharsini, Indranil Ghosh Ray, Rongxing Lu, Muttukrishnan Rajarajan:
Privacy-Preserving iVector-Based Speaker Verification. 496-506
- Jiajun Zhang, Yang Zhao, Haoran Li, Chengqing Zong:
Attention With Sparsity Regularization for Neural Machine Translation and Summarization. 507-518
- Alastair H. Moore, Wei Xue, Patrick A. Naylor, Mike Brookes:
Noise Covariance Matrix Estimation for Rotating Microphone Arrays. 519-530
- Guang Yang, Haibo He, Qian Chen:
Emotion-Semantic-Enhanced Neural Network. 531-543
- Thomas Dietzen, Ann Spriet, Wouter Tirry, Simon Doclo, Marc Moonen, Toon van Waterschoot:
Comparative Analysis of Generalized Sidelobe Cancellation and Multi-Channel Linear Prediction for Speech Dereverberation and Noise Reduction. 544-558
- Jianqing Gao, Jun Du, Enhong Chen:
Mixed-Bandwidth Cross-Channel Speech Recognition via Joint Optimization of DNN-Based Bandwidth Expansion and Acoustic Modeling. 559-571
- Salil Deena, Madina Hasan, Mortaza Doulaty, Oscar Saz, Thomas Hain:
Recurrent Neural Network Language Model Adaptation for Multi-Genre Broadcast Speech Recognition and Alignment. 572-582
- Femke B. Gelderblom, Tron V. Tronstad, Erlend Magnus Viggen:
Subjective Evaluation of a Noise-Reduced Training Target for Deep Neural Network-Based Speech Enhancement. 583-594
- Maria Luis Valero, Emanuël A. P. Habets:
Low-Complexity Multi-Microphone Acoustic Echo Control in the Short-Time Fourier Transform Domain. 595-609
- Qiaoxi Zhu, Philip Coleman, Xiaojun Qiu, Ming Wu, Jun Yang, Ian S. Burnett:
Robust Personal Audio Geometry Optimization in the SVD-Based Modal Domain. 610-620
- Jiangyan Yi, Jianhua Tao, Zhengqi Wen, Ye Bai:
Language-Adversarial Transfer Learning for Low-Resource Speech Recognition. 621-630
- Jing-Xuan Zhang, Zhen-Hua Ling, Li-Juan Liu, Yuan Jiang, Li-Rong Dai:
Sequence-to-Sequence Acoustic Modeling for Voice Conversion. 631-644
- Xiaofei Li, Laurent Girin, Sharon Gannot, Radu Horaud:
Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function. 645-659
Volume 27, Number 4, April 2019
- Ziyue Zhao, Huijun Liu, Tim Fingscheidt:
Convolutional Neural Networks to Enhance Coded Speech. 663-678
- Henning F. Schepker, Sven Erik Nordholm, Linh Thi Thuc Tran, Simon Doclo:
Null-Steering Beamformer-Based Feedback Cancellation for Multi-Microphone Hearing Aids With Incoming Signal Preservation. 679-691
- Zengxi Li, Yan Song, Li-Rong Dai, Ian McLoughlin:
Listening and Grouping: An Online Autoregressive Approach for Monaural Speech Separation. 692-703
- Dong Deng, Liping Jing, Jian Yu, Shaolong Sun, Michael K. Ng:
Sentiment Lexicon Construction With Hierarchical Supervision Topic Model. 704-718
- Mantong Zhou, Minlie Huang, Xiaoyan Zhu:
Story Ending Selection by Finding Hints From Pairwise Candidate Endings. 719-729
- Jan-Gerrit Richter, Janina Fels:
On the Influence of Continuous Subject Rotation During High-Resolution Head-Related Transfer Function Measurements. 730-741
- Jianguo Yu, Konstantin Markov, Tomoko Matsui:
Articulatory and Spectrum Information Fusion Based on Deep Recurrent Neural Networks. 742-752
- Fábio P. Itturriet, Márcio Holsbach Costa:
Perceptually Relevant Preservation of Interaural Time Differences in Binaural Hearing Aids. 753-764
- Johannes Abel, Tim Fingscheidt:
Sinusoidal-Based Lowband Synthesis for Artificial Speech Bandwidth Extension. 765-776
- Qiuqiang Kong, Yong Xu, Iwona Sobieraj, Wenwu Wang, Mark D. Plumbley:
Sound Event Detection and Time-Frequency Segmentation from Weakly Labelled Data. 777-787
- Yi-Lin Tuan, Hung-yi Lee:
Improving Conditional Sequence Generative Adversarial Networks by Stepwise Evaluation. 788-798
- Nikolaos Dionelis, Mike Brookes:
Modulation-Domain Kalman Filtering for Monaural Blind Speech Denoising and Dereverberation. 799-814
- Reza Lotfian, Carlos Busso:
Curriculum Learning for Speech Emotion Recognition From Crowdsourced Labels. 815-826
- Shoufeng Lin:
Robust Pitch Estimation and Tracking For Speakers Based on Subband Encoding and The Generalized Labeled Multi-Bernoulli Filter. 827-841
- Xianghui Wang, Israel Cohen, Jingdong Chen, Jacob Benesty:
On Robust and High Directive Beamforming With Small-Spacing Microphone Arrays for Scattered Sources. 842-852
- Zhe Quan, Zhi-Jie Wang, Yuquan Le, Bin Yao, Kenli Li, Jian Yin:
An Efficient Framework for Sentence Similarity Modeling. 853-865
- Nurul Lubis, Sakriani Sakti, Koichiro Yoshino, Satoshi Nakamura:
Positive Emotion Elicitation in Chat-Based Dialogue Systems. 866-877
Volume 27, Number 5, May 2019
- Francisco Javier Ibarrola, Ruben Daniel Spies, Leandro Ezequiel Di Persia:
Switching Divergences for Spectral Learning in Blind Speech Dereverberation. 881-891
- Israel Cohen, Jacob Benesty, Jingdong Chen:
Differential Kronecker Product Beamforming. 892-902
- Camelia Elisei-Iliescu, Constantin Paleologu, Jacob Benesty, Cristian Lucian Stanciu, Cristian Anghel, Silviu Ciochina:
Recursive Least-Squares Algorithms for the Identification of Low-Rank Systems. 903-918
- Anurendra Kumar, Tanaya Guha, Prasanta Kumar Ghosh:
Dirichlet Latent Variable Model: A Dynamic Model Based on Dirichlet Prior for Audio Processing. 919-931
- Peter Jancovic, Münevver Köküer:
Bird Species Recognition Using Unsupervised Modeling of Individual Vocalization Elements. 932-947
- Tomoki Koriyama, Takao Kobayashi:
Statistical Parametric Speech Synthesis Using Deep Gaussian Processes. 948-959
- Kazuki Shimada, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara:
Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition. 960-971
- Simon Widmark:
Causal MSE-Optimal Filters for Personal Audio Subject to Constrained Contrast. 972-987
Volume 27, Number 6, June 2019
- Annamaria Mesaros, Aleksandr Diment, Benjamin Elizalde, Toni Heittola, Emmanuel Vincent, Bhiksha Raj, Tuomas Virtanen:
Sound Event Detection in the DCASE 2017 Challenge. 992-1006
- Srikanth Raj Chetupalli, Thippur V. Sreenivas:
Late Reverberation Cancellation Using Bayesian Estimation of Multi-Channel Linear Predictors and Student's t-Source Prior. 1007-1018
- Lauri Juvela, Bajibabu Bollepalli, Vassilis Tsiaras, Paavo Alku:
GlotNet - A Raw Waveform Model for the Glottal Excitation in Statistical Parametric Speech Synthesis. 1019-1030
- Fiete Winter, Frank Schultz, Gergely Firtha, Sascha Spors:
A Geometric Model for Prediction of Spatial Aliasing in 2.5D Sound Field Synthesis. 1031-1046
- Yuanyuan Liu, Tan Lee, Thomas K. T. Law, Kathy Yuet-Sheung Lee:
Acoustical Assessment of Voice Disorder With Continuous Speech Using ASR Posterior Features. 1047-1059
- Christoph Pörschmann, Johannes M. Arend, Fabian Brinkmann:
Directional Equalization of Sparse Head-Related Transfer Function Sets for Spatial Upsampling. 1060-1071
- Shreyas Srikanth Payal, V. John Mathews, Douglas J. Button, Ajay Iyer, Russell H. Lambert, Jeffrey Hutchings, Luis Antonio Azpicueta-Ruiz:
Equalization of Nonlinear Propagation Distortion in Cylindrical Waveguides. 1072-1084
- Berrak Sisman, Mingyang Zhang, Haizhou Li:
Group Sparse Representation With WaveNet Vocoder Adaptation for Spectrum and Prosody Conversion. 1085-1097
- Jinkyu Lee, Hong-Goo Kang:
A Joint Learning Algorithm for Complex-Valued T-F Masks in Deep Learning-Based Single-Channel Speech Enhancement Systems. 1098-1109
Volume 27, Number 7, July 2019
- Jan-Hendrik Flesner, Thomas Biberger, Stephan Dieter Ewert:
Subjective and Objective Assessment of Monaural and Binaural Aspects of Audio Quality. 1112-1125
- Bolaji Yusuf, Batuhan Gündogdu, Murat Saraclar:
Low Resource Keyword Search With Synthesized Crosslingual Exemplars. 1126-1135
- Andreas I. Koutrouvelis, Richard C. Hendriks, Richard Heusdens, Jesper Jensen:
Robust Joint Estimation of Multimicrophone Signal Model Parameters. 1136-1150
- Benjamin Cauchi, Kai Siedenburg, João Felipe Santos, Tiago H. Falk, Simon Doclo, Stefan Goetze:
Non-Intrusive Speech Quality Prediction Using Modulation Energies and LSTM-Network. 1151-1163
- Yike Zhang, Pengyuan Zhang, Yonghong Yan:
Tailoring an Interpretable Neural Language Model. 1164-1178
- Ashutosh Pandey, DeLiang Wang:
A New Framework for CNN-Based Speech Enhancement in the Time Domain. 1179-1188
- Vikram C. M., Nagaraj Adiga, S. R. Mahadeva Prasanna:
Detection of Nasalized Voiced Stops in Cleft Palate Speech Using Epoch-Synchronous Features. 1189-1200
- Huaishao Luo, Tianrui Li, Bing Liu, Bin Wang, Herwig Unger:
Improving Aspect Term Extraction With Bidirectional Dependency Tree Representation. 1201-1212
Volume 27, Number 8, August 2019
- Teng Zhang, Ji Wu:
Constrained Learned Feature Extraction for Acoustic Scene Classification. 1216-1228
- Leonardo Gabrielli, Stefano Tomassetti, Stefano Squartini, Carlo Zinato, Stefano Guaiana:
A Multi-Stage Algorithm for Acoustic Physical Model Parameters Estimation. 1229-1240
- Bing Yang, Hong Liu, Cheng Pang, Xiaofei Li:
Multiple Sound Source Counting and Localization Based on TF-Wise Spatial Spectrum Clustering. 1241-1255
- Yi Luo, Nima Mesgarani:
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation. 1256-1266
- Achintya Kumar Sarkar, Zheng-Hua Tan, Hao Tang, Suwon Shon, James R. Glass:
Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification. 1267-1279
- Jiawen Chua, W. Bastiaan Kleijn:
A Low Latency Approach for Blind Source Separation. 1280-1294
- Chao Pan, Jingdong Chen, Jacob Benesty, Guangming Shi:
On the Design of Target Beampatterns for Differential Microphone Arrays. 1295-1307
- Aqil M. Azmi, Manal N. Almutery, Hatim A. Aboalsamh:
Real-Word Errors in Arabic Texts: A Better Algorithm for Detection and Correction. 1308-1320
- Mandy Korpusik, James R. Glass:
Deep Learning for Database Mapping and Asking Clarification Questions in Dialogue Systems. 1321-1334
- Junhyeong Pak, Jong Won Shin:
Sound Localization Based on Phase Difference Enhancement Using Deep Neural Networks. 1335-1345
Volume 27, Number 9, September 2019
- Randall Ali, Giuliano Bernardi, Toon van Waterschoot, Marc Moonen:
Methods of Extending a Generalized Sidelobe Canceller With External Microphones. 1349-1364
- Xiaofei Li, Laurent Girin, Sharon Gannot, Radu Horaud:
Multichannel Online Dereverberation Based on Spectral Magnitude Inverse Filtering. 1365-1377
- Lu Chen, Zhi Chen, Bowen Tan, Sishan Long, Milica Gasic, Kai Yu:
AgentGraph: Toward Universal Dialogue Management With Structured Deep Reinforcement Learning. 1378-1391
- Luoqin Li, Jiabing Wang, Jichang Li, Qianli Ma, Jia Wei:
Relation Classification via Keyword-Attentive Sentence Mechanism and Synthetic Stimulation Loss. 1392-1404
- Martin Bo Møller, Jesper Kjær Nielsen, Efren Fernandez-Grande, Søren Krarup Olesen:
On the Influence of Transfer Function Noise on Sound Zone Control in a Room. 1405-1418
- Zhen Xu, Chengjie Sun, Yinong Long, Bingquan Liu, Baoxun Wang, Mingjiang Wang, Min Zhang, Xiaolong Wang:
Dynamic Working Memory for Context-Aware Response Generation. 1419-1431
- Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo:
ACVAE-VC: Non-Parallel Voice Conversion With Auxiliary Classifier Variational Autoencoder. 1432-1443
- Xie Chen, Xunying Liu, Yu Wang, Anton Ragni, Jeremy Heng Meng Wong, Mark J. F. Gales:
Exploiting Future Word Contexts in Neural Network Language Models for Speech Recognition. 1444-1454
- Rui Wang, Zhe Chen, Fuliang Yin:
DOA-Based Three-Dimensional Node Geometry Calibration in Acoustic Sensor Networks and Its Cramér-Rao Bound and Sensitivity Analysis. 1455-1468
- Chia-Hsuan Lee, Hung-yi Lee, Szu-Lin Wu, Chi-Liang Liu, Wei Fang, Juei-Yang Hsu, Bo-Hsiang Tseng:
Machine Comprehension of Spoken Content: TOEFL Listening Test and Spoken SQuAD. 1469-1480
- Yi-Chen Chen, Sung-Feng Huang, Hung-yi Lee, Yu-Hsuan Wang, Chia-Hao Shen:
Audio Word2vec: Sequence-to-Sequence Autoencoding for Unsupervised Learning of Audio Segmentation and Representation. 1481-1493
Volume 27, Number 10, October 2019
- Pairui Li, Chuan Chen, Wujie Zheng, Yuetang Deng, Fanghua Ye, Zibin Zheng:
STD: An Automatic Evaluation Metric for Machine Translation Based on Word Embeddings. 1497-1506
- Jie Zhang, Richard Heusdens, Richard Christian Hendriks:
Relative Acoustic Transfer Function Estimation in Wireless Acoustic Sensor Networks. 1507-1519
- Jihwan Park, Joon-Hyuk Chang:
State-Space Microphone Array Nonlinear Acoustic Echo Cancellation Using Multi-Microphone Near-End Speech Covariance. 1520-1534
- Zhaojie Luo, Jinhui Chen, Tetsuya Takiguchi, Yasuo Ariki:
Emotional Voice Conversion Using Dual Supervised Adversarial Networks With Continuous Wavelet Transform F0 Features. 1535-1548
- Hala As'ad, Martin Bouchard, Homayoun Kamkar-Parsi:
A Robust Target Linearly Constrained Minimum Variance Beamformer With Spatial Cues Preservation for Binaural Hearing Aids. 1549-1563
- Yijun Wang, Yingce Xia, Li Zhao, Jiang Bian, Tao Qin, Enhong Chen, Tie-Yan Liu:
Semi-Supervised Neural Machine Translation via Marginal Distribution Estimation. 1564-1576
- Arindam Jati, Panayiotis G. Georgiou:
Neural Predictive Coding Using Convolutional Neural Networks Toward Unsupervised Learning of Speaker Characteristics. 1577-1589
- Federico Fontana, Enrico Bozzo:
Newton-Raphson Solution of Nonlinear Delay-Free Loop Filter Networks. 1590-1600
- Naoki Makishima, Shinichi Mogami, Norihiro Takamune, Daichi Kitamura, Hayato Sumino, Shinnosuke Takamichi, Hiroshi Saruwatari, Nobutaka Ono:
Independent Deeply Learned Matrix Analysis for Determined Audio Source Separation. 1601-1615
- Jeena J. Prakash, Hema A. Murthy:
Analysis of Inter-Pausal Units in Indian Languages and Its Application to Text-to-Speech Synthesis. 1616-1628
- Yunshi Lan, Shuohang Wang, Jing Jiang:
Knowledge Base Question Answering With a Matching-Aggregation Model and Question-Specific Contextual Relations. 1629-1638
- Xuefeng Bai, Hailong Cao, Kehai Chen, Tiejun Zhao:
A Bilingual Adversarial Autoencoder for Unsupervised Bilingual Lexicon Induction. 1639-1648
- Guanlong Zhao, Ricardo Gutierrez-Osuna:
Using Phonetic Posteriorgram Based Frame Pairing for Segmental Accent Conversion. 1649-1660
Volume 27, Number 11, November 2019
- Zhuosheng Zhang, Hai Zhao, Kangwei Ling, Jiangtong Li, Zuchao Li, Shexia He, Guohong Fu:
Effective Subword Segmentation for Text Comprehension. 1664-1674
- Yue Xie, Ruiyu Liang, Zhenlin Liang, Chengwei Huang, Cairong Zou, Björn W. Schuller:
Speech Emotion Classification Using Attention-Based LSTM. 1675-1685
- Shuai Wang, Zili Huang, Yanmin Qian, Kai Yu:
Discriminative Neural Embedding Learning for Short-Duration Text-Independent Speaker Verification. 1686-1696
- Rui Lu, Zhiyao Duan, Changshui Zhang:
Audio-Visual Deep Clustering for Speech Separation. 1697-1712
- Tetiana Parshakova, François Rameau, Andriy Serdega, In So Kweon, Dae-Shik Kim:
Latent Question Interpretation Through Variational Adaptation. 1713-1724
- Jeremy Heng Meng Wong, Mark John Francis Gales, Yu Wang:
General Sequence Teacher-Student Learning. 1725-1736
- Liming Shi, Jesper Kjær Nielsen, Jesper Rindom Jensen, Max A. Little, Mads Græsbøll Christensen:
Robust Bayesian Pitch Tracking Based on the Harmonic Model. 1737-1751
- Yan Yang, Changchun Bao:
RS-CAE-Based AR-Wiener Filtering and Harmonic Recovery for Speech Enhancement. 1752-1762
- Alberto Bernardini, Paolo Maffezzoni, Augusto Sarti:
Linear Multistep Discretization Methods With Variable Step-Size in Nonlinear Wave Digital Structures for Virtual Analog Modeling. 1763-1776
- Dong Deng, Liping Jing, Jian Yu, Shaolong Sun:
Sparse Self-Attention LSTM for Sentiment Lexicon Construction. 1777-1790
- Qiuqiang Kong, Changsong Yu, Yong Xu, Turab Iqbal, Wenwu Wang, Mark D. Plumbley:
Weakly Labelled AudioSet Tagging With Attention Neural Networks. 1791-1802
- Samy Elshamy, Tim Fingscheidt:
DNN-Based Cepstral Excitation Manipulation for Speech Enhancement. 1803-1814
- Nooshin Maghsoodi, Hossein Sameti, Hossein Zeinali, Themos Stafylakis:
Speaker Recognition With Random Digit Strings Using Uncertainty Normalized HMM-Based i-Vectors. 1815-1825
- Sining Sun, Pengcheng Guo, Lei Xie, Mei-Yuh Hwang:
Adversarial Regularization for Attention Based End-to-End Robust Speech Recognition. 1826-1838
- Masood Delfarah, DeLiang Wang:
Deep Learning for Talker-Dependent Reverberant Speaker Separation: An Empirical Study. 1839-1848
Volume 27, Number 12, December 2019
- Natsuki Ueno, Shoichi Koyama, Hiroshi Saruwatari:
Three-Dimensional Sound Field Reproduction Based on Weighted Mode-Matching Method. 1852-1867
- Lijun Wu, Xu Tan, Tao Qin, Jianhuang Lai, Tie-Yan Liu:
Beyond Error Propagation: Language Branching Also Affects the Accuracy of Sequence Generation. 1868-1879
- Amit Das, Jinyu Li, Guoli Ye, Rui Zhao, Yifan Gong:
Advancing Acoustic-to-Word CTC Model With Attention and Mixed-Units. 1880-1892
- Niccolò Antonello, Enzo De Sena, Marc Moonen, Patrick A. Naylor, Toon van Waterschoot:
Joint Acoustic Localization and Dereverberation Through Plane Wave Decomposition and Sparse Regularization. 1893-1905
- Federico Borra, Alberto Bernardini, Fabio Antonacci, Augusto Sarti:
Uniform Linear Arrays of First-Order Steerable Differential Microphones. 1906-1918
- Li Chai, Jun Du, Qing-Feng Liu, Chin-Hui Lee:
Using Generalized Gaussian Distributions to Improve Regression Error Modeling for Deep Learning-Based Speech Enhancement. 1919-1931
- Jun Qi, Jun Du, Sabato Marco Siniscalchi, Chin-Hui Lee:
A Theory on Deep Neural Network Based Vector-to-Vector Regression With an Illustration of Its Expressive Power in Speech Enhancement. 1932-1943
- Xudong Dang, Qi Cheng, Hongyan Zhu:
Indoor Multiple Sound Source Localization via Multi-Dimensional Assignment Data Association. 1944-1956
- Martin Schneider, Emanuël A. P. Habets:
Iterative DFT-Domain Inverse Filter Optimization Using a Weighted Least-Squares Criterion. 1957-1969
- Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao:
Neural Machine Translation With Sentence-Level Topic Context. 1970-1984
- Alejandro Gómez Alanís, Antonio M. Peinado, José A. González, Angel M. Gomez:
A Gated Recurrent Convolutional Neural Network for Robust Spoofing Detection. 1985-1999
- Siyuan Feng, Tan Lee:
Exploiting Cross-Lingual Speaker and Phonetic Diversity for Unsupervised Subword Modeling. 2000-2011
- Wei Li, Nancy F. Chen, Sabato Marco Siniscalchi, Chin-Hui Lee:
Improving Mispronunciation Detection of Mandarin Tones for Non-Native Learners With Soft-Target Tone Labels and BLSTM-Based Deep Tone Models. 2012-2024
- Quansheng Tu, Huawei Chen:
On Mainlobe Orientation of the First- and Second-Order Differential Microphone Arrays. 2025-2040
- Jan Chorowski, Ron J. Weiss, Samy Bengio, Aäron van den Oord:
Unsupervised Speech Representation Learning Using WaveNet Autoencoders. 2041-2053
- Vishnuvardhan Varanasi, Ayushya Agarwal, Rajesh M. Hegde:
Near-Field Acoustic Source Localization Using Spherical Harmonic Features. 2054-2066
- Yibin Zheng, Jianhua Tao, Zhengqi Wen, Jiangyan Yi:
Forward-Backward Decoding Sequence for Regularizing End-to-End TTS. 2067-2079
- Yanhui Tu, Jun Du, Chin-Hui Lee:
Speech Enhancement Based on Teacher-Student Deep Learning Using Improved Speech Presence Probability for Noise-Robust Speech Recognition. 2080-2091
- Yuzhou Liu, DeLiang Wang:
Divide and Conquer: A Deep CASA Approach to Talker-Independent Monaural Speaker Separation. 2092-2102
- Xuebo Liu, Derek F. Wong, Lidia S. Chao, Yang Liu:
Latent Attribute Based Hierarchical Decoder for Neural Machine Translation. 2103-2112
- Jingyi Hu, Ning Chen:
Enhanced Feature Summarizing for Effective Cover Song Identification. 2113-2126
- Qianli Ma, Liuhong Yu, Shuai Tian, Enhuan Chen, Wing W. Y. Ng:
Global-Local Mutual Attention Model for Text Classification. 2127-2139
- Vesa Välimäki, Jussi Rämö:
Neurally Controlled Graphic Equalizer. 2140-2149
- Sean U. N. Wood, Johannes Stahl, Pejman Mowlaee:
Binaural Codebook-Based Speech Enhancement With Atomic Speech Presence Probability. 2150-2161
- Lukas Pfeifenberger, Matthias Zöhrer, Franz Pernkopf:
Eigenvector-Based Speech Mask Estimation for Multi-Channel Speech Enhancement. 2162-2172
- Marc Arnela, Saeed Dabbaghchian, Oriol Guasch, Olov Engwall:
MRI-Based Vocal Tract Representations for the Three-Dimensional Finite Element Synthesis of Diphthongs. 2173-2182
- Varun Srivastava, Mayank Mishra:
Adversarial Approximate Inference for Speech to Electroglottograph Conversion. 2183-2196
- Kouhei Sekiguchi, Yoshiaki Bando, Aditya Arie Nugraha, Kazuyoshi Yoshii, Tatsuya Kawahara:
Semi-Supervised Multichannel Speech Enhancement With a Deep Speech Prior. 2197-2212
- Qipeng Guo, Xipeng Qiu, Xiangyang Xue, Zheng Zhang:
Low-Rank and Locality Constrained Self-Attention for Sequence Modeling. 2213-2222
- Jun Yu, Qiang Ling, Changwei Luo, Chang Wen Chen:
Synthesizing 3D Trump: Predicting and Visualizing the Relationship Between Text, Speech, and Articulatory Movements. 2223-2233
- Ryosuke Sugiura, Yutaka Kamamoto, Takehiro Moriya:
Shape Control of Discrete Generalized Gaussian Distributions for Frequency-Domain Audio Coding. 2234-2248
- Zamir Ben-Hur, David Lou Alon, Ravish Mehra, Boaz Rafaely:
Efficient Representation and Sparse Sampling of Head-Related Transfer Functions Using Phase-Correction Based on Ear Alignment. 2249-2262
- Luca Remaggi, Philip J. B. Jackson, Wenwu Wang:
Modeling the Comb Filter Effect and Interaural Coherence for Binaural Source Separation. 2263-2277
- Biao Zhang, Deyi Xiong, Jinsong Su, Jiebo Luo:
Future-Aware Knowledge Distillation for Neural Machine Translation. 2278-2287
- Randall Ali, Toon van Waterschoot, Marc Moonen:
Integration of a Priori and Estimated Constraints Into an MVDR Beamformer for Speech Enhancement. 2288-2300
- Nitya Tiwari, Prem C. Pandey:
Speech Enhancement Using Noise Estimation With Dynamic Quantile Tracking. 2301-2312
- Junwen Duan, Xiao Ding, Yue Zhang, Ting Liu:
TEND: A Target-Dependent Representation Learning Framework for News Document. 2313-2325
- Lujun Zhao, Xipeng Qiu, Qi Zhang, Xuanjing Huang:
Sequence Labeling With Deep Gated Dual Path CNN. 2326-2335
- Akihiro Kato, Tomi H. Kinnunen:
Statistical Regression Models for Noise Robust F0 Estimation Using Recurrent Deep Neural Networks. 2336-2349
- Dayiheng Liu, Jie Fu, Qian Qu, Jiancheng Lv:
BFGAN: Backward and Forward Generative Adversarial Networks for Lexically Constrained Sentence Generation. 2350-2361
- Andrés Marafioti, Nathanaël Perraudin, Nicki Holighaus, Piotr Majdak:
A Context Encoder For Audio Inpainting. 2362-2372
- Jichen Yang, Rohan Kumar Das, Nina Zhou:
Extraction of Octave Spectra Information for Spoofing Attack Detection. 2373-2384
- Oren Barkan, David Tsiris, Ori Katz, Noam Koenigstein:
InverSynth: Deep Estimation of Synthesizer Parameter Configurations From Audio Signals. 2385-2396