Alex Graves
I'm a CIFAR Junior Fellow supervised by Geoffrey Hinton in the Department of Computer Science at the University of Toronto.
email: [email protected].
Research Interests
- Recurrent neural networks (especially LSTM)
- Supervised sequence labelling (especially speech and handwriting recognition)
- Unsupervised sequence learning
Publications
- A. Graves, N. Jaitly, A. Mohamed. Hybrid Speech Recognition with Deep Bidirectional LSTM. ASRU 2013, Olomouc, Czech Republic. [pdf, poster]
- A. Graves. Generating Sequences With Recurrent Neural Networks [arxiv, slides, demo]
- A. Graves, A. Mohamed, G. Hinton. Speech Recognition with Deep Recurrent Neural Networks. ICASSP 2013, Vancouver, Canada. [pdf]
- A. Graves. Sequence Transduction with Recurrent Neural Networks. Representation Learning Workshop, ICML 2012, Edinburgh, Scotland. [arxiv, talk, slides]
- A. Graves. Supervised Sequence Labelling with Recurrent Neural Networks. Textbook, Studies in Computational Intelligence, Springer, 2012. [web, preprint]
- A. Graves. Offline Arabic Handwriting Recognition with Multidimensional Neural Networks. Book Chapter, Guide to OCR for Arabic Scripts, Springer, 2012. [pdf]
- M. Liwicki, A. Graves, and H. Bunke. Neural Networks for Handwriting Recognition. Book chapter, Computational Intelligence Paradigms in Advanced Pattern Classification, pp. 5-24, Springer, 2012.
- A. Graves. Practical Variational Inference for Neural Networks. NIPS 2011, pp. 2348-2356. [pdf, poster, spotlight]
- J. Schmidhuber, D. Ciresan, U. Meier, J. Masci and A. Graves. On fast deep nets for AGI vision. AGI 2011, pp. 243-246.
- F. Eyben, S. Böck, B. Schuller and A. Graves. Universal Onset Detection with Bidirectional Long Short-Term Memory Neural Networks. 11th Intern. Soc. for Music Information Retrieval Conference, ISMIR, Utrecht, Holland, pp. 589-594, August 2010.
- M. Felder, A. Kaifel and A. Graves. Wind Power Prediction Using Mixture Density Recurrent Neural Networks. European Wind Energy Conference and Exhibition, April 2010.
- F. Sehnke, C. Osendorfer, T. Rückstieß, A. Graves, J. Peters and J. Schmidhuber. Parameter-exploring policy gradients. Neural Networks, 23(2), March 2010.
- F. Sehnke, A. Graves, C. Osendorfer and J. Schmidhuber. Multimodal parameter-exploring policy gradients. ICMLA 2010, pp. 113-118.
- M. Wöllmer, F. Eyben, A. Graves, B. Schuller and G. Rigoll. Improving Keyword Spotting with a Tandem BLSTM-DBN Architecture. Book chapter, Non-Linear Speech Processing, pp. 68-75, Springer Heidelberg, LNAI 5933, 2010.
- M. Wöllmer, F. Eyben, A. Graves, B. Schuller and G. Rigoll. Bidirectional LSTM networks for context-sensitive keyword detection in a cognitive virtual agent framework. Cognitive Computation, Special Issue on Non-Linear and Non-Conventional Speech Processing, 2010.
- M. Wöllmer, F. Eyben, J. Keshet, A. Graves, B. Schuller and G. Rigoll. Robust discriminative keyword spotting for emotionally colored spontaneous speech using bidirectional LSTM networks. ICASSP 2009, Taipei, Taiwan, pp. 3949-3952.
- F. Eyben, M. Wöllmer, B. Schuller and A. Graves. From speech to letters - using a novel neural network architecture for grapheme based ASR. IEEE Automatic Speech Recognition and Understanding Workshop, pp. 376-380, Merano, Italy, 2009.
- F. Eyben, M. Wöllmer, A. Graves, B. Schuller, E. Douglas-Cowie and R. Cowie. On-line emotion recognition in a 3-d activation-valence-time continuum using acoustic and linguistic cues. Journal on Multimodal User Interfaces (JMUI), Special Issue on Real-time Affect Analysis and Interpretation: Closing the Loop in Virtual Agents, 3:7-19, 2009.
- A. Graves, M. Liwicki, S. Fernández, R. Bertolami, H. Bunke, and J. Schmidhuber. A novel connectionist system for unconstrained handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 31, Issue 5, May 2009, pp. 855-868. [pdf]
- M. Wöllmer, F. Eyben, A. Graves, B. Schuller and G. Rigoll. A tandem BLSTM-DBN architecture for keyword spotting with enhanced context modeling. NOLISP 2009, ISCA Tutorial and Research Workshop on Non-Linear Speech Processing, Vic, Spain, 2009.
- A. Graves and J. Schmidhuber. Offline handwriting recognition with multidimensional recurrent neural networks. NIPS 2008, Vancouver, Canada, pp. 545-552. [pdf, poster, spotlight]
- A. Graves, C. Mayer, M. Wimmer, J. Schmidhuber, and B. Radig. Facial expression recognition with recurrent neural networks. International Workshop on Cognition for Technical Systems, Munich, Germany, October 2008. [pdf]
- A. Graves. Supervised Sequence Labelling with Recurrent Neural Networks. Dissertation, Technische Universität München, München, July 2008. See extended book version (above) for pdf.
- S. Fernández, A. Graves, and J. Schmidhuber. Phoneme recognition in TIMIT with BLSTM-CTC. [arxiv]
- F. Sehnke, C. Osendorfer, T. Rückstieß, A. Graves, J. Peters, and J. Schmidhuber. Policy gradients with parameter-based exploration for control. ICANN 2008, Prague, Czech Republic, pp. 387-396.
- A. Graves, S. Fernández, M. Liwicki, H. Bunke and J. Schmidhuber. Unconstrained online handwriting recognition with recurrent neural networks. NIPS 2007, Vancouver, Canada. [pdf, poster, spotlight]
- A. Graves, S. Fernández, J. Schmidhuber. Multi-Dimensional Recurrent Neural Networks. ICANN 2007, Porto, Portugal, pp. 549-558. [pdf]
- S. Fernández, A. Graves, J. Schmidhuber. An application of recurrent neural networks to discriminative keyword spotting. ICANN 2007, Porto, Portugal, pp. 220-229. [pdf]
- M. Liwicki, A. Graves, S. Fernández, H. Bunke, J. Schmidhuber. A Novel Approach to On-Line Handwriting Recognition Based on Bidirectional Long Short-Term Memory Networks. ICDAR 2007, Curitiba, Brazil, pp. 367-371. [pdf]
- A. Förster, A. Graves, and J. Schmidhuber. RNN-based Learning of Compact Maps for Efficient Robot Localization. ESANN 2007, Bruges, Belgium, pp. 537-542. [pdf]
- S. Fernández, A. Graves, J. Schmidhuber. Sequence labelling in structured domains with hierarchical recurrent neural networks. IJCAI 2007, Hyderabad, India, pp. 774-779. [pdf]
- A. Graves, S. Fernández, F. Gomez, J. Schmidhuber. Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks. ICML 2006, Pittsburgh, USA, pp. 369-376. [pdf]
- A. Graves, J. Schmidhuber. Framewise Phoneme Classification with Bidirectional LSTM and Other Neural Network Architectures. Neural Networks, Volume 18:5-6, pp. 602-610, 2005. [pdf]
- A. Graves, S. Fernández, J. Schmidhuber. Bidirectional LSTM Networks for Improved Phoneme Classification and Recognition. ICANN 2005, Warsaw, Poland, pp. 799-804. [pdf]
- N. Beringer, A. Graves, F. Schiel, J. Schmidhuber. Classifying unprompted speech by retraining LSTM Nets. ICANN 2005, Warsaw, Poland, pp. 575-581.
- A. Graves, J. Schmidhuber. Framewise Phoneme Classification with Bidirectional LSTM Networks. IJCNN 2005, Montreal, Canada, pp. 2047-2052. [pdf]
- A. Graves, N. Beringer, J. Schmidhuber. A Comparison Between Spiking and Differentiable Recurrent Neural Networks on Spoken Digit Recognition. NCI 2004, Grindelwald, Switzerland, pp. 164-168.
- A. Graves, D. Eck, N. Beringer, J. Schmidhuber. Biologically Plausible Speech Recognition with LSTM Neural Nets. Bio-ADIT 2004, Lausanne, Switzerland, pp. 175-184.