


Rafael Valle
2020 – today
- 2025
- [i28]Zhifeng Kong, Kevin J. Shih, Weili Nie, Arash Vahdat, Sang-gil Lee, João Felipe Santos, Ante Jukic, Rafael Valle, Bryan Catanzaro:
A2SB: Audio-to-Audio Schrodinger Bridges. CoRR abs/2501.11311 (2025) - [i27]Shehzeen Hussain, Paarth Neekhara, Xuesong Yang, Edresson Casanova, Subhankar Ghosh, Mikyas T. Desta, Roy Fejgin, Rafael Valle, Jason Li:
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance. CoRR abs/2502.05236 (2025) - 2024
- [j5]Nicollas Rodrigues de Oliveira, Yago de Rezende dos Santos, Ana Carolina Rocha Mendes, Guilherme Nunes Nasseh Barbosa, Marcela Tuler de Oliveira, Rafael Valle, Dianne Scherly Varela de Medeiros, Diogo M. F. Mattos:
Storage Standards and Solutions, Data Storage, Sharing, and Structuring in Digital Health: A Brazilian Case Study. Inf. 15(1): 20 (2024) - [c21]Akshit Arora, Rohan Badlani, Sungwon Kim, Rafael Valle, Bryan Catanzaro:
Scaling Nvidia's Multi-Speaker Multi-Lingual TTS Systems With Zero-Shot TTS to Indic Languages. ICASSP Workshops 2024: 115-116 - [c20]Zhifeng Kong, Arushi Goel, Rohan Badlani, Wei Ping, Rafael Valle, Bryan Catanzaro:
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities. ICML 2024 - [c19]Paarth Neekhara, Shehzeen Samarah Hussain, Rafael Valle, Boris Ginsburg, Rishabh Ranjan, Shlomo Dubnov, Farinaz Koushanfar, Julian J. McAuley:
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations. ICML 2024 - [c18]Shuqi Dai, Ming-Yu Liu, Rafael Valle, Siddharth Gururani:
ExpressiveSinger: Multilingual and Multi-Style Score-based Singing Voice Synthesis with Expressive Performance Control. ACM Multimedia 2024: 3229-3238 - [i26]Akshit Arora, Rohan Badlani, Sungwon Kim, Rafael Valle, Bryan Catanzaro:
Scaling NVIDIA's Multi-speaker Multi-lingual TTS Systems with Zero-Shot TTS to Indic Languages. CoRR abs/2401.13851 (2024) - [i25]Zhifeng Kong, Arushi Goel, Rohan Badlani, Wei Ping, Rafael Valle, Bryan Catanzaro:
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities. CoRR abs/2402.01831 (2024) - [i24]Arushi Goel, Zhifeng Kong, Rafael Valle, Bryan Catanzaro:
Audio Dialogues: Dialogues dataset for audio and music understanding. CoRR abs/2404.07616 (2024) - [i23]Zhifeng Kong, Sang-gil Lee, Deepanway Ghosal, Navonil Majumder, Ambuj Mehrish, Rafael Valle, Soujanya Poria, Bryan Catanzaro:
Improving Text-To-Audio Models with Synthetic Captions. CoRR abs/2406.15487 (2024) - [i22]Paarth Neekhara, Shehzeen Hussain, Subhankar Ghosh, Jason Li, Rafael Valle, Rohan Badlani, Boris Ginsburg:
Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment. CoRR abs/2406.17957 (2024) - [i21]Sreyan Ghosh, Sonal Kumar, Zhifeng Kong, Rafael Valle, Bryan Catanzaro, Dinesh Manocha:
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data. CoRR abs/2410.02056 (2024) - [i20]Arushi Goel, Karan Sapra, Matthieu Le, Rafael Valle, Andrew Tao, Bryan Catanzaro:
OMCAT: Omni Context Aware Transformer. CoRR abs/2410.12109 (2024) - [i19]Sang-gil Lee, Zhifeng Kong, Arushi Goel, Sungwon Kim, Rafael Valle, Bryan Catanzaro:
ETTA: Elucidating the Design Space of Text-to-Audio Models. CoRR abs/2412.19351 (2024) - [i18]Chia-Yu Hung, Navonil Majumder, Zhifeng Kong, Ambuj Mehrish, Rafael Valle, Bryan Catanzaro, Soujanya Poria:
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization. CoRR abs/2412.21037 (2024) - 2023
- [c17]Rohan Badlani, Akshit Arora, Subhankar Ghosh, Rafael Valle, Kevin J. Shih, João Felipe Santos, Boris Ginsburg, Bryan Catanzaro:
Vani: Very-Lightweight Accent-Controllable TTS for Native And Non-Native Speakers With Identity Preservation. ICASSP 2023: 1-2 - [c16]Sudheer Kovela, Rafael Valle, Ambrish Dantrey, Bryan Catanzaro:
Any-to-Any Voice Conversion with F0 and Timbre Disentanglement and Novel Timbre Conditioning. ICASSP 2023: 1-5 - [c15]Rafael Valle, João Felipe Santos, Kevin J. Shih, Rohan Badlani, Bryan Catanzaro:
High-Acoustic Fidelity Text To Speech Synthesis With Fine-Grained Control Of Speech Attributes. ICASSP 2023: 1-5 - [c14]Siddharth Gururani, Arun Mallya, Ting-Chun Wang, Rafael Valle, Ming-Yu Liu:
SPACE: Speech-driven Portrait Animation with Controllable Expression. ICCV 2023: 20857-20866 - [c13]Rohan Badlani, Rafael Valle, Kevin J. Shih, João Felipe Santos, Siddharth Gururani, Bryan Catanzaro:
RAD-MMM: Multilingual Multiaccented Multispeaker Text To Speech. INTERSPEECH 2023: 626-630 - [c12]Sungwon Kim, Kevin J. Shih, Rohan Badlani, João Felipe Santos, Evelina Bakhturina, Mikyas Desta, Rafael Valle, Sungroh Yoon, Bryan Catanzaro:
P-Flow: A Fast and Data-Efficient Zero-Shot TTS through Speech Prompting. NeurIPS 2023 - [i17]Rohan Badlani, Rafael Valle, Kevin J. Shih, João Felipe Santos, Siddharth Gururani, Bryan Catanzaro:
Multilingual Multiaccented Multispeaker TTS with RADTTS. CoRR abs/2301.10335 (2023) - [i16]Rohan Badlani, Akshit Arora, Subhankar Ghosh, Rafael Valle, Kevin J. Shih, João Felipe Santos, Boris Ginsburg, Bryan Catanzaro:
VANI: Very-lightweight Accent-controllable TTS for Native and Non-native speakers with Identity Preservation. CoRR abs/2303.07578 (2023) - [i15]Paarth Neekhara, Shehzeen Hussain, Rafael Valle, Boris Ginsburg, Rishabh Ranjan, Shlomo Dubnov, Farinaz Koushanfar, Julian J. McAuley:
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations. CoRR abs/2310.09653 (2023) - 2022
- [c11]Rohan Badlani, Adrian Lancucki, Kevin J. Shih, Rafael Valle, Wei Ping, Bryan Catanzaro:
One TTS Alignment to Rule Them All. ICASSP 2022: 6092-6096 - [i14]Kevin J. Shih, Rafael Valle, Rohan Badlani, João Felipe Santos, Bryan Catanzaro:
Generative Modeling for Low Dimensional Speech Attributes with Neural Spline Flows. CoRR abs/2203.01786 (2022) - [i13]Siddharth Gururani, Arun Mallya, Ting-Chun Wang, Rafael Valle, Ming-Yu Liu:
SPACEx: Speech-driven Portrait Animation with Controllable Expression. CoRR abs/2211.09809 (2022) - 2021
- [j4]Jason Poulos, Rafael Valle:
Character-based handwritten text transcription with attention networks. Neural Comput. Appl. 33(16): 10563-10573 (2021) - [c10]Rafael Valle, Kevin J. Shih, Ryan Prenger, Bryan Catanzaro:
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis. ICLR 2021 - [i12]Rohan Badlani, Adrian Lancucki, Kevin J. Shih, Rafael Valle, Wei Ping, Bryan Catanzaro:
One TTS Alignment To Rule Them All. CoRR abs/2108.10447 (2021) - 2020
- [c9]Rafael Valle, Jason Li, Ryan Prenger, Bryan Catanzaro:
Mellotron: Multispeaker Expressive Voice Synthesis by Conditioning on Rhythm, Pitch and Global Style Tokens. ICASSP 2020: 6189-6193 - [i11]Rafael Valle, Kevin J. Shih, Ryan Prenger, Bryan Catanzaro:
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis. CoRR abs/2005.05957 (2020)
2010 – 2019
- 2019
- [c8]Ryan Prenger, Rafael Valle, Bryan Catanzaro:
Waveglow: A Flow-based Generative Network for Speech Synthesis. ICASSP 2019: 3617-3621 - [i10]Rafael Valle, Jason Li, Ryan Prenger, Bryan Catanzaro:
Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens. CoRR abs/1910.11997 (2019) - [i9]Rafael Valle, Fitsum A. Reda, Mohammad Shoeybi, Patrick LeGresley, Andrew Tao, Bryan Catanzaro:
Neural ODEs for Image Segmentation with Level Sets. CoRR abs/1912.11683 (2019) - 2018
- [j3]Jason Poulos, Rafael Valle:
Missing Data Imputation for Supervised Learning. Appl. Artif. Intell. 32(2): 186-196 (2018) - [i8]Wilson Cai, Anish Doshi, Rafael Valle:
Attacking Speaker Recognition With Deep Generative Models. CoRR abs/1801.02384 (2018) - [i7]Rafael Valle, Wilson Cai, Anish Doshi:
TequilaGAN: How to easily identify GAN samples. CoRR abs/1807.04919 (2018) - [i6]Rafael Valle:
Visual Display and Retrieval of Music Information. CoRR abs/1807.10204 (2018) - [i5]Ryan Prenger, Rafael Valle, Bryan Catanzaro:
WaveGlow: A Flow-based Generative Network for Speech Synthesis. CoRR abs/1811.00002 (2018) - 2017
- [i4]Jason Poulos, Rafael Valle:
Attention networks for image-to-text. CoRR abs/1712.04046 (2017) - 2016
- [j2]Rafael Valle, Alexandre Donzé, Daniel J. Fremont, Ilge Akkaya, Sanjit A. Seshia, Adrian Freed, David Wessel:
Specification Mining for Machine Improvisation with Formal Specifications. Comput. Entertain. 14(3): 6:1-6:20 (2016) - [c7]Rafael Valle:
ABROA: Audio-Based Room-Occupancy Analysis Using Gaussian Mixtures and Hidden Markov Models. DCASE 2016: 100-104 - [c6]Ilge Akkaya, Daniel J. Fremont, Rafael Valle, Alexandre Donzé, Edward A. Lee, Sanjit A. Seshia:
Control Improvisation with Probabilistic Temporal Specifications. IoTDI 2016: 187-198 - [c5]Rafael Valle, Daniel J. Fremont, Ilge Akkaya, Alexandre Donzé, Adrian Freed, Sanjit A. Seshia:
Learning and Visualizing Music Specifications Using Pattern Graphs. ISMIR 2016: 192-198 - [i3]Rafael Valle:
ABROA : Audio-Based Room-Occupancy Analysis using Gaussian Mixtures and Hidden Markov Models. CoRR abs/1607.07801 (2016) - [i2]Jason Poulos, Rafael Valle:
Missing Data Imputation for Supervised Learning. CoRR abs/1610.09075 (2016) - 2015
- [c4]Rafael Valle, Adrian Freed:
Symbolic Music Similarity Using Neuronal Periodicity and Dynamic Programming. MCM 2015: 199-204 - [i1]Ilge Akkaya, Daniel J. Fremont, Rafael Valle, Alexandre Donzé, Edward A. Lee, Sanjit A. Seshia:
Control Improvisation with Probabilistic Temporal Specifications. CoRR abs/1511.02279 (2015) - 2014
- [c3]Alexandre Donzé, Rafael Valle, Ilge Akkaya, Sophie Libkind, Sanjit A. Seshia, David Wessel:
Machine Improvisation with Formal Specifications. ICMC 2014 - 2013
- [c2]Rafael Valle:
Towards a Dynamic, Inclusive and Equalitarian Augmented Activity Space. ICMC 2013 - [c1]Rafael Valle:
Gradual control of harmonicity in the Context of frequency modulation. ICMC 2013
1990 – 1999
- 1993
- [j1]Juan Carlos Calderón, José Ramón Salvador, Rafael Valle, Luis París, Carles Ferrer:
Multi-protocol communications controller. Microprocess. Microprogramming 39(2-5): 209-212 (1993)
last updated on 2025-03-13 20:20 CET by the dblp team
all metadata released as open data under CC0 1.0 license