Liang Lu
I am now a Senior Applied Scientist at Microsoft, where I work on deep learning for speech recognition. Previously, I was a Research Assistant Professor at the Toyota Technological Institute at Chicago (TTIC), a philanthropically endowed academic computer science institute located on the University of Chicago campus, where I was part of the Speech and Language Group. I received my Ph.D from the Centre for Speech Technology Research at the University of Edinburgh in 2013. During my Ph.D, I was also a Marie Curie Fellow within the EU SCALE research programme. I then worked as a postdoctoral research associate until 2016 on the EPSRC Natural Speech Technology project, a collaboration between the University of Edinburgh, the University of Sheffield and Cambridge University.
Background
- 2017 - present Senior Applied Scientist at Microsoft, Bellevue, WA, USA
- 2016 - 2017 Research Assistant Professor at the Toyota Technological Institute at Chicago, Chicago, USA
- 2015 Jul - Aug Senior team member at Jelinek Workshop on Speech and Language Technology, University of Washington, USA
- 2013 - 2016 Postdoctoral Research Associate, The University of Edinburgh, Edinburgh, UK
- 2009 - 2013 Ph.D student, The University of Edinburgh, Edinburgh, UK
- 2010 Jun - Sep Research Intern at Toshiba Research Lab at Cambridge, UK
Thesis
- Liang Lu, "Subspace Gaussian Mixture Models for Automatic Speech Recognition", Ph.D dissertation, The University of Edinburgh, 2013. (pdf)
2019
- Liang Lu, "A transformer with interleaved self-attention and convolution for hybrid acoustic models", submitted to ICASSP 2020 (pdf)
- Peidong Wang, et al., "Speech separation using speaker inventory", ASRU 2019 (to appear)
- Liang Lu, Xiong Xiao, Zhuo Chen, Yifan Gong, "PyKaldi2: Yet another speech toolkit based on Kaldi and PyTorch", arXiv, 2019 (pdf, repo)
- Liang Lu, Eric Sun, Yifan Gong, "Self-teaching networks", Interspeech, 2019 (pdf)
- Jinyu Li, Liang Lu, Changliang Liu, Yifan Gong, "Improving layer trajectory LSTM with future context frames", ICASSP, 2019
2018
- Jinyu Li, Liang Lu, Changliang Liu, Yifan Gong, "Exploring Layer Trajectory LSTM With Depth Processing Units and Attention", SLT, 2018
- Kalpesh Krishna, Liang Lu, Kevin Gimpel, Karen Livescu, "A Study of All-Convolutional Encoders for Connectionist Temporal Classification", ICASSP, 2018 (pdf)
2017
- Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Michael Mandel, Liang Lu, John R. Hershey, Michael L. Seltzer, Guoguo Chen, Yu Zhang, Dong Yu, "Discriminative Beamforming with Phase-Aware Neural Networks for Speech Enhancement and Recognition", In: Watanabe S., Delcroix M., Metze F., Hershey J. (eds) New Era for Robust Speech Recognition. Springer, 2017 (pdf).
- Liang Lu, "Toward Computation and Memory Efficient Neural Network Acoustic Models with Binary Weights and Activations", arXiv, 2017 (pdf)
- Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, Steve Renals, "End-to-End Neural Segmental Models for Speech Recognition", IEEE Journal of Selected Topics in Signal Processing, 2017 (pdf)
- Shubham Toshniwal, Hao Tang, Liang Lu and Karen Livescu, "Multitask Learning with Low-Level Auxiliary Tasks for Encoder-Decoder Based Speech Recognition", Interspeech 2017 (pdf)
- Liang Lu, Lingpeng Kong, Chris Dyer and Noah A. Smith, "Multitask Learning with CTC and Segmental CRF for Speech Recognition", Interspeech 2017 (pdf)
- Liang Lu, Michelle Guo and Steve Renals, "Knowledge Distillation for Small-footprint Highway Networks", ICASSP 2017 (pdf)
- Yin Xian, Yunchen Pu, Zhe Gan, Liang Lu, Andrew Thompson, "Adaptive DCTNet for Audio Signal Classification", ICASSP 2017 (pdf)
- Ben Krause, Liang Lu, Iain Murray, Steve Renals, "Multiplicative LSTM for Sequence Modelling", ICLR 2017, workshop track (pdf)
- Liang Lu and Steve Renals, "Small-footprint Highway Deep Neural Networks for Speech Recognition", IEEE/ACM Transactions on Audio, Speech and Language Processing, 2017 (pdf)
2016
- Liang Lu, "Sequence Training and Adaptation of Highway Deep Neural Networks", SLT 2016 (pdf)
- Liang Lu*, Lingpeng Kong*, Chris Dyer, Noah A. Smith, and Steve Renals, "Segmental Recurrent Neural Networks for End-to-end Speech Recognition", Interspeech 2016 (pdf, slides)
- Liang Lu and Steve Renals, "Small-footprint Deep Neural Networks with Highway Connections for Speech Recognition", Interspeech 2016 (pdf, slides)
- Xingxing Zhang, Liang Lu and Mirella Lapata, "Top down Tree Long Short Term Memory Networks", Proc. NAACL-HLT, 2016 (pdf, code)
- Liang Lu, Xingxing Zhang and Steve Renals, "On Training the Recurrent Neural Network Encoder-Decoder for Large Vocabulary End-to-end Speech Recognition", Proc. ICASSP, 2016 (pdf, slides)
- Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Liang Lu, John Hershey, Mike Seltzer, Guoguo Chen, Yu Zhang, Michael Mandel and Dong Yu, "Deep Beamforming Networks for Multi-channel Speech Recognition", Proc. ICASSP, 2016 (pdf)
- Tian Tan, Yanmin Qian, Dong Yu, Souvik Kundu, Liang Lu, Khe Chai Sim, Xiong Xiao, and Yu Zhang, "Speaker-aware training of LSTM-RNNs for acoustic modelling", Proc. ICASSP, 2016 (pdf)
2015
- Ben Krause, Liang Lu, Iain Murray and Steve Renals, "On the Efficiency of Recurrent Neural Network Optimization Algorithms", NIPS Optimization for Machine Learning Workshop, 2015 (pdf)
- Liang Lu, Xingxing Zhang, Kyunghyun Cho and Steve Renals, "A Study of the Recurrent Neural Network Encoder-Decoder for Large Vocabulary Speech Recognition", Proc. InterSpeech, 2015 (pdf)
- Liang Lu and Steve Renals, "Feature-space Speaker Adaptation for Probabilistic Linear Discriminant Analysis Acoustic Models", Proc. InterSpeech, 2015 (pdf)
- Liang Lu and Steve Renals, "Multi-frame factorisation for long-span acoustic modelling", Proc. ICASSP, 2015 (pdf)
2014
- Liang Lu and Steve Renals, "Probabilistic linear discriminant analysis for acoustic modelling", in IEEE Signal Processing Letters, 2014 (pdf)
- Liang Lu, Arnab Ghoshal and Steve Renals, "Cross-lingual Subspace Gaussian Mixture Models for Low-resource Speech Recognition", IEEE/ACM Transactions on Audio, Speech and Language Processing, 2014. (pdf)
- Liang Lu and Steve Renals, "Tied Probabilistic linear discriminant analysis for speech recognition", arXiv:1411.0895 [cs.CL], 2014 (pdf)
- Liang Lu, Steve Renals, "Probabilistic linear discriminant analysis with bottleneck features for speech recognition", in Proc. InterSpeech 2014 (pdf)
2013
- Liang Lu, K.K. Chin, Arnab Ghoshal and Steve Renals, "Joint Uncertainty Decoding for Noise Robust Subspace Gaussian Mixture Models", IEEE Transactions on Audio, Speech and Language Processing, 2013. (pdf)
- Liang Lu, Arnab Ghoshal, Steve Renals, "Acoustic Data-driven Pronunciation Lexicon for Large Vocabulary Speech Recognition", in Proc. ASRU, 2013 (pdf, poster) (Best Paper Award).
- Liang Lu, Arnab Ghoshal, Steve Renals, "Noise adaptive training for subspace Gaussian mixture models", in Proc. InterSpeech, 2013 (pdf)
2012
- Cheng-yu Yang, Georgina Brown, Liang Lu, Junichi Yamagishi and Simon King, "Noise-Robust Whispered Speech Recognition Using A Non-Audible-Murmur Microphone With VTS Compensation", in Proc. ISCSLP, 2012
- Liang Lu, Arnab Ghoshal and Steve Renals, "Joint Uncertainty Decoding with Unscented Transform for Noise Robust Subspace Gaussian Mixture Models", in Proc. SAPA-SCALE workshop, 2012
- Liang Lu, K.K. Chin, Arnab Ghoshal and Steve Renals, "Noise Compensation for Subspace Gaussian Mixture Models", in Proc. InterSpeech, 2012
- Liang Lu, Arnab Ghoshal and Steve Renals, "Maximum A Posteriori Adaptation of Subspace Gaussian Mixture Models for Cross-lingual Speech Recognition", in Proc. ICASSP, 2012.
2011
- Liang Lu, Arnab Ghoshal and Steve Renals, "Regularized Subspace Gaussian Mixture Models for Speech Recognition", in IEEE Signal Processing Letters, 2011.
- Liang Lu, Arnab Ghoshal and Steve Renals, "Regularized Subspace Gaussian Mixture Models for Cross-lingual Speech Recognition", in Proc. ASRU, 2011.
2010
- Kenichi Kumatani, Liang Lu, John McDonough, Arnab Ghoshal, and Dietrich Klakow, "Maximum Negentropy Beamforming with Superdirectivity", in Proc. EUSIPCO, 2010.
2009
- Liang Lu, Yuan Dong, Xianyu Zhao, Jiqing Liu, Haila Wang, "The Effect of Language Factors for Robust Speaker Recognition", in Proc. ICASSP, 2009.
- Yuan Dong, Liang Lu, Xianyu Zhao, Jian Zhao, "Studies on Model Distance Normalization Approach in Text-Independent Speaker Verification", in Journal of Acta Automatica Sinica (Chinese journal), 2009.
- Xianyu Zhao, Yuan Dong, Jian Zhao, Liang Lu, Jiqing Liu, Haila Wang, "Variational Bayesian Joint Factor Analysis for Speaker Verification", in Proc. ICASSP, 2009.
2008
- Liang Lu, Yuan Dong, Xianyu Zhao, Jian Zhao, Chengyu Dong, Haila Wang, "Analysis of Subspace Within-Class Covariance Normalization for SVM-based Speaker Verification", in Proc. InterSpeech, 2008.
- Xianyu Zhao, Yuan Dong, Jian Zhao, Liang Lu, Jiqing Liu, Haila Wang, "Comparison of Input and Feature Space Nonlinear Kernel Nuisance Attribute Projections for Speaker Verification", in Proc. InterSpeech, 2008.
- Xianyu Zhao, Yuan Dong, Hao Yang, Jian Zhao, Liang Lu, Haila Wang, "Nonlinear Kernel Nuisance Attribute Projection for Speaker Verification", in Proc. ICASSP, 2008.
- Liang Lu, Yuan Dong, Xianyu Zhao, Hao Yang, Jian Zhao, Haila Wang, "Component Score Weighting for GMM based Text-Independent Speaker Verification", in Proc. Odyssey, The Speaker and Language Recognition Workshop, 2008
- Yuan Dong, Jian Zhao, Liang Lu, Jiqing Liu, Xianyu Zhao, Haila Wang, "Eigenchannel Compensation and Symmetric Score for Robust Text-Independent Speaker Verification", in Proc. ISCSLP, 2008
2007
- Xianyu Zhao, Yuan Dong, Hao Yang, Jian Zhao, Liang Lu, Haila Wang, "Comparison of Two Kinds of Speaker Location Representation for SVM-based Speaker Verification", in Proc. InterSpeech, 2007.
- Hao Yang, Yuan Dong, Xianyu Zhao, Jian Zhao, Liang Lu, Haila Wang, "Cluster Adaptive Training Weights as Features in SVM-Based Speaker Verification", in Proc. InterSpeech, 2007.
Selected talks
- "Segmental Recurrent Neural Networks for End-to-End Speech Recognition", invited talk at MIT, Feb 2017 (slides)
- "Neural Segmental CRFs for Sequence Modelling", talk at NST final meeting, June 2016 (slides)
- "Deep learning for end-to-end speech recognition", job talk at TTI-Chicago, March 2016 (slides)
- "On Training the Recurrent Neural Network Encoder-Decoder for Large Vocabulary End-to-end Speech Recognition", in ICASSP, March 2016 (slides)
- "Recent advances in automatic speech recognition --- A brief overview", invited talk at Heriot-Watt University, 2014 (slides)
- "Probabilistic linear discriminant analysis with bottleneck features for speech recognition", in Interspeech, Singapore, 2014 (slides)
- "Noise adaptive training for subspace Gaussian mixture models", in Interspeech, France, 2013 (slides)
- "Noise Compensation for Subspace Gaussian Mixture Models", in Interspeech, USA, 2012 (slides)
- "Joint Uncertainty Decoding with Unscented Transform for Noise Robust Subspace Gaussian Mixture Models", in SAPA-SCALE workshop, USA, 2012 (slides)
Software
- PyKaldi2 speech toolkit based on Kaldi and PyTorch (repo)
- Segmental RNN based on DyNet (link)
- Open-source toolkit for beamforming and distant speech recognition (BTK)
- The recipe and source code for tied-PLDA based acoustic modelling: link (out of date)
- The recipe and source code for cross-lingual SGMM are in Kaldi (out of date)
- An implementation of L1 regularisation of a quadratic objective function is released in Kaldi (link) (out of date)
Misc
- Reviewer:
  - IEEE Transactions on Audio, Speech and Language Processing
  - IEEE Signal Processing Letters
  - IEEE Transactions on Cybernetics
  - IEEE Journal of Selected Topics in Signal Processing
  - Computer Speech and Language
  - Speech Communication
  - EURASIP Journal on Audio, Speech, and Music Processing
  - IEEE Transactions on Pattern Analysis and Machine Intelligence
  - Journal of Artificial Intelligence Research
  - ICASSP, Interspeech, ASRU, NIPS, ICLR, ICML
- Member: IEEE, ISCA
- Session chair: ICASSP 2017
- Organizer: MLSD 2017; MLSLP 2017; MLSLP 2018
- Reports on my work: IEEE SLTC newsletter 2014, 2016
Personal