Zhiyao Duan
Person information
- affiliation: University of Rochester, USA
- affiliation (former): Northwestern University, USA
2020 – today
- 2024
- [j27] Ge Zhu, Juan Pablo Cáceres, Zhiyao Duan, Nicholas J. Bryan:
MusicHiFi: Fast High-Fidelity Stereo Vocoding. IEEE Signal Process. Lett. 31: 2365-2369 (2024)
- [c72] Yongyi Zang, Yi Zhong, Frank Cwitkowitz, Zhiyao Duan:
SynthTab: Leveraging Synthesized Data for Guitar Tablature Transcription. ICASSP 2024: 1286-1290
- [c71] Enting Zhou, You Zhang, Zhiyao Duan:
Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech. ICASSP 2024: 12126-12130
- [c70] Yongyi Zang, You Zhang, Mojtaba Heydari, Zhiyao Duan:
SingFake: Singing Voice Deepfake Detection. ICASSP 2024: 12156-12160
- [i46] Ge Zhu, Zhiyao Duan:
Cacophony: An Improved Contrastive Audio-Text Model. CoRR abs/2402.06986 (2024)
- [i45] Frank Cwitkowitz, Zhiyao Duan:
Toward Fully Self-Supervised Multi-Pitch Estimation. CoRR abs/2402.15569 (2024)
- [i44] Ge Zhu, Juan Pablo Cáceres, Zhiyao Duan, Nicholas J. Bryan:
MusicHiFi: Fast High-Fidelity Stereo Vocoding. CoRR abs/2403.10493 (2024)
- [i43] Yujia Yan, Zhiyao Duan:
Scoring Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription. CoRR abs/2404.09466 (2024)
- [i42] You Zhang, Yongyi Zang, Jiatong Shi, Ryuichi Yamamoto, Jionghao Han, Yuxun Tang, Tomoki Toda, Zhiyao Duan:
SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge Evaluation Plan. CoRR abs/2405.05244 (2024)
- [i41] Yongyi Zang, Jiatong Shi, You Zhang, Ryuichi Yamamoto, Jionghao Han, Yuxun Tang, Shengyuan Xu, Wenxiao Zhao, Jing Guo, Tomoki Toda, Zhiyao Duan:
CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection. CoRR abs/2406.02438 (2024)
- [i40] Zehua Kcriss Li, Meiying Melissa Chen, Yi Zhong, Pinxin Liu, Zhiyao Duan:
GTR-Voice: Articulatory Phonetics Informed Controllable Expressive Speech Synthesis. CoRR abs/2406.10514 (2024)
- [i39] Kyungbok Lee, You Zhang, Zhiyao Duan:
A Multi-Stream Fusion Approach with One-Class Learning for Audio-Visual Deepfake Detection. CoRR abs/2406.14176 (2024)
- [i38] Samuele Cornell, Jordan Darefsky, Zhiyao Duan, Shinji Watanabe:
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition. CoRR abs/2408.09215 (2024)
- [i37] You Zhang, Yongyi Zang, Jiatong Shi, Ryuichi Yamamoto, Tomoki Toda, Zhiyao Duan:
SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge. CoRR abs/2408.16132 (2024)
- 2023
- [j26] Zhiyao Duan, Peter van Kranenburg, Juhan Nam, Preeti Rao:
Editorial for TISMIR Special Collection: Cultural Diversity in MIR Research. Trans. Int. Soc. Music. Inf. Retr. 6(1): 203-205 (2023)
- [c69] Siwen Ding, You Zhang, Zhiyao Duan:
SAMO: Speaker Attractor Multi-Center One-Class Learning For Voice Anti-Spoofing. ICASSP 2023: 1-5
- [c68] Mojtaba Heydari, Ju-Chiang Wang, Zhiyao Duan:
SingNet: a real-time Singing Voice beat and Downbeat Tracking System. ICASSP 2023: 1-5
- [c67] You Zhang, Yuxiang Wang, Zhiyao Duan:
HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields. ICASSP 2023: 1-5
- [c66] Ge Zhu, Yujia Yan, Juan Pablo Cáceres, Zhiyao Duan:
Transcription Free Filler Word Detection with Neural Semi-CRFs. ICASSP 2023: 1-5
- [c65] Meiying Chen, Zhiyao Duan:
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed. INTERSPEECH 2023: 2098-2102
- [c64] Yongyi Zang, You Zhang, Zhiyao Duan:
Phase perturbation improves channel robustness for speech spoofing countermeasures. INTERSPEECH 2023: 3162-3166
- [c63] Qiaoyu Yang, Frank Cwitkowitz, Zhiyao Duan:
Harmonic Analysis With Neural Semi-CRF. ISMIR 2023: 676-683
- [c62] Yutong Wen, You Zhang, Zhiyao Duan:
Mitigating Cross-Database Differences for Learning Unified HRTF Representation. WASPAA 2023: 1-5
- [i36] Ge Zhu, Yujia Yan, Juan Pablo Cáceres, Zhiyao Duan:
Transcription free filler word detection with Neural semi-CRFs. CoRR abs/2303.06475 (2023)
- [i35] Yongyi Zang, You Zhang, Zhiyao Duan:
Phase perturbation improves channel robustness for speech spoofing countermeasures. CoRR abs/2306.03389 (2023)
- [i34] Yutong Wen, You Zhang, Zhiyao Duan:
Mitigating Cross-Database Differences for Learning Unified HRTF Representation. CoRR abs/2307.14547 (2023)
- [i33] Yongyi Zang, You Zhang, Mojtaba Heydari, Zhiyao Duan:
SingFake: Singing Voice Deepfake Detection. CoRR abs/2309.07525 (2023)
- [i32] Yongyi Zang, Yi Zhong, Frank Cwitkowitz, Zhiyao Duan:
SynthTab: Leveraging Synthesized Data for Guitar Tablature Transcription. CoRR abs/2309.09085 (2023)
- [i31] Ge Zhu, Yutong Wen, Marc-André Carbonneau, Zhiyao Duan:
EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis. CoRR abs/2311.08667 (2023)
- 2022
- [j25] Ge Zhu, Jordan Darefsky, Fei Jiang, Anton Selitskiy, Zhiyao Duan:
Music Source Separation With Generative Flow. IEEE Signal Process. Lett. 29: 2288-2292 (2022)
- [j24] Christodoulos Benetatos, Zhiyao Duan:
Draw and Listen! A Sketch-Based System for Music Inpainting. Trans. Int. Soc. Music. Inf. Retr. 5(1): 141-155 (2022)
- [j23] Sefik Emre Eskimez, You Zhang, Zhiyao Duan:
Speech Driven Talking Face Generation From a Single Image and an Emotion Condition. IEEE Trans. Multim. 24: 3480-3490 (2022)
- [c61] Mojtaba Heydari, Matthew C. McCallum, Andreas F. Ehmann, Zhiyao Duan:
A Novel 1D State Space for Efficient Music Rhythmic Analysis. ICASSP 2022: 421-425
- [c60] Rui Lu, Baigong Zheng, Jiarui Hai, Fei Tao, Zhiyao Duan, Ji Liu:
Progressive Teacher-Student Training Framework for Music Tagging. ICASSP 2022: 3129-3133
- [c59] Ge Zhu, Frank Cwitkowitz, Zhiyao Duan:
A Study of The Robustness of Raw Waveform Based Speaker Embeddings Under Mismatched Conditions. ICASSP 2022: 7657-7661
- [c58] Mojtaba Heydari, Zhiyao Duan:
Singing beat tracking with Self-supervised front-end and linear transformers. ISMIR 2022: 617-624
- [c57] Abudukelimu Wuerkaixi, You Zhang, Zhiyao Duan, Changshui Zhang:
Rethinking Audio-Visual Synchronization for Active Speaker Detection. MLSP 2022: 1-6
- [c56] Abudukelimu Wuerkaixi, Kunda Yan, You Zhang, Zhiyao Duan, Changshui Zhang:
DyViSE: Dynamic Vision-Guided Speaker Embedding for Audio-Visual Speaker Diarization. MMSP 2022: 1-6
- [c55] You Zhang, Ge Zhu, Zhiyao Duan:
A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification. Odyssey 2022: 77-84
- [i30] You Zhang, Ge Zhu, Zhiyao Duan:
A New Fusion Strategy for Spoofing Aware Speaker Verification. CoRR abs/2202.05253 (2022)
- [i29] Frank Cwitkowitz, Jonathan Driedger, Zhiyao Duan:
A Data-Driven Methodology for Considering Feasibility and Pairwise Likelihood in Deep Learning Based Guitar Tablature Transcription Systems. CoRR abs/2204.08094 (2022)
- [i28] Ge Zhu, Jordan Darefsky, Fei Jiang, Anton Selitskiy, Zhiyao Duan:
Music Source Separation with Generative Flow. CoRR abs/2204.09079 (2022)
- [i27] Abudukelimu Wuerkaixi, You Zhang, Zhiyao Duan, Changshui Zhang:
Rethinking Audio-visual Synchronization for Active Speaker Detection. CoRR abs/2206.10421 (2022)
- [i26] Yuxiang Wang, You Zhang, Zhiyao Duan, Mark Bocko:
Predicting Global Head-Related Transfer Functions From Scanned Head Geometry Using Deep Learning and Compact Representations. CoRR abs/2207.14352 (2022)
- [i25] Meiying Chen, Zhiyao Duan:
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm. CoRR abs/2209.11866 (2022)
- [i24] You Zhang, Yuxiang Wang, Zhiyao Duan:
HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields. CoRR abs/2210.15196 (2022)
- [i23] Siwen Ding, You Zhang, Zhiyao Duan:
SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing. CoRR abs/2211.02718 (2022)
- 2021
- [j22] You Zhang, Fei Jiang, Zhiyao Duan:
One-Class Learning Towards Synthetic Voice Spoofing Detection. IEEE Signal Process. Lett. 28: 937-941 (2021)
- [j21] Bochen Li, Yuxuan Wang, Zhiyao Duan:
Audiovisual Singing Voice Separation. Trans. Int. Soc. Music. Inf. Retr. 4(1): 195-209 (2021)
- [c54] Mojtaba Heydari, Zhiyao Duan:
Don't Look Back: An Online Beat Tracking Method Using RNN and Enhanced Particle Filtering. ICASSP 2021: 236-240
- [c53] Ge Zhu, Fei Jiang, Zhiyao Duan:
Y-Vector: Multiscale Waveform Encoder for Speaker Embedding. Interspeech 2021: 96-100
- [c52] You Zhang, Ge Zhu, Fei Jiang, Zhiyao Duan:
An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems. Interspeech 2021: 4309-4313
- [c51] Mojtaba Heydari, Frank Cwitkowitz, Zhiyao Duan:
BeatNet: CRNN and Particle Filtering for Online Joint Beat, Downbeat and Meter Tracking. ISMIR 2021: 270-277
- [c50] Abudukelimu Wuerkaixi, Christodoulos Benetatos, Zhiyao Duan, Changshui Zhang:
CollageNet: Fusing arbitrary melody and accompaniment into a coherent song. ISMIR 2021: 786-793
- [c49] Yujia Yan, Frank Cwitkowitz, Zhiyao Duan:
Skipping the Frame-Level: Event-Based Piano Transcription With Neural Semi-CRFs. NeurIPS 2021: 20583-20595
- [e2] Jin Ha Lee, Alexander Lerch, Zhiyao Duan, Juhan Nam, Preeti Rao, Peter van Kranenburg, Ajay Srinivasamurthy:
Proceedings of the 22nd International Society for Music Information Retrieval Conference, ISMIR 2021, Online, November 7-12, 2021. 2021, ISBN 978-1-7327299-0-2
- [i22] You Zhang, Ge Zhu, Fei Jiang, Zhiyao Duan:
An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems. CoRR abs/2104.01320 (2021)
- [i21] Bochen Li, Yuxuan Wang, Zhiyao Duan:
Audiovisual Singing Voice Separation. CoRR abs/2107.00231 (2021)
- [i20] Xinhui Chen, You Zhang, Ge Zhu, Zhiyao Duan:
UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021. CoRR abs/2107.12018 (2021)
- [i19] Mojtaba Heydari, Frank Cwitkowitz, Zhiyao Duan:
BeatNet: CRNN and Particle Filtering for Online Joint Beat Downbeat and Meter Tracking. CoRR abs/2108.03576 (2021)
- [i18] Frank Cwitkowitz, Mojtaba Heydari, Zhiyao Duan:
Learning Sparse Analytic Filters for Piano Transcription. CoRR abs/2108.10382 (2021)
- [i17] Ge Zhu, Frank Cwitkowitz, Zhiyao Duan:
A study of the robustness of raw waveform based speaker embeddings under mismatched conditions. CoRR abs/2110.04265 (2021)
- [i16] Mojtaba Heydari, Matthew C. McCallum, Andreas F. Ehmann, Zhiyao Duan:
A Novel 1D State Space for Efficient Music Rhythmic Analysis. CoRR abs/2111.00704 (2021)
- 2020
- [j20] Fei Jiang, Zhiyao Duan:
Speaker Attractor Network: Generalizing Speech Separation to Unseen Numbers of Sources. IEEE Signal Process. Lett. 27: 1859-1863 (2020)
- [j19] Sefik Emre Eskimez, Ross K. Maddox, Chenliang Xu, Zhiyao Duan:
Noise-Resilient Training Method for Face Landmark Generation From Speech. IEEE ACM Trans. Audio Speech Lang. Process. 28: 27-38 (2020)
- [c48] Nan Jiang, Sheng Jin, Zhiyao Duan, Changshui Zhang:
RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning. AAAI 2020: 710-718
- [c47] Yichi Zhang, Junbo Hu, Yiting Zhang, Bryan Pardo, Zhiyao Duan:
Vroom!: A Search Engine for Sounds by Vocal Imitation Queries. CHIIR 2020: 23-32
- [c46] Sefik Emre Eskimez, Ross K. Maddox, Chenliang Xu, Zhiyao Duan:
End-To-End Generation of Talking Faces from Noisy Speech. ICASSP 2020: 1948-1952
- [c45] Christodoulos Benetatos, Joseph VanderStel, Zhiyao Duan:
BachDuet: A Deep Learning System for Human-Machine Counterpoint Improvisation. NIME 2020: 635-640
- [c44] Nan Jiang, Sheng Jin, Zhiyao Duan, Changshui Zhang:
When Counterpoint Meets Chinese Folk Melodies. NeurIPS 2020
- [i15] Nan Jiang, Sheng Jin, Zhiyao Duan, Changshui Zhang:
RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning. CoRR abs/2002.03082 (2020)
- [i14] Sefik Emre Eskimez, You Zhang, Zhiyao Duan:
Speech Driven Talking Face Generation from a Single Image and an Emotion Condition. CoRR abs/2008.03592 (2020)
- [i13] Runze Su, Fei Tao, Xudong Liu, Haoran Wei, Xiaorong Mei, Zhiyao Duan, Lei Yuan, Ji Liu, Yuying Xie:
Themes Inferred Audio-visual Correspondence Learning. CoRR abs/2009.06573 (2020)
- [i12] Ge Zhu, Fei Jiang, Zhiyao Duan:
Raw-x-vector: Multi-scale Time Domain Speaker Embedding Network. CoRR abs/2010.12951 (2020)
- [i11] You Zhang, Fei Jiang, Zhiyao Duan:
One-class learning towards generalized voice spoofing detection. CoRR abs/2010.13995 (2020)
- [i10] Mojtaba Heydari, Zhiyao Duan:
Do not look back: an online beat tracking method using RNN and enhanced particle filtering. CoRR abs/2011.02619 (2020)
2010 – 2019
- 2019
- [j18] Sefik Emre Eskimez, Kazuhito Koishida, Zhiyao Duan:
Adversarial Training for Speech Super-Resolution. IEEE J. Sel. Top. Signal Process. 13(2): 347-358 (2019)
- [j17] Emmanouil Benetos, Simon Dixon, Zhiyao Duan, Sebastian Ewert:
Automatic Music Transcription: An Overview. IEEE Signal Process. Mag. 36(1): 20-30 (2019)
- [j16] Zhiyao Duan, Slim Essid, Cynthia C. S. Liem, Gaël Richard, Gaurav Sharma:
Audiovisual Analysis of Music Performances: Overview of an Emerging Field. IEEE Signal Process. Mag. 36(1): 63-73 (2019)
- [j15] Yichi Zhang, Bryan Pardo, Zhiyao Duan:
Siamese Style Convolutional Neural Networks for Sound Search by Vocal Imitation. IEEE ACM Trans. Audio Speech Lang. Process. 27(2): 429-441 (2019)
- [j14] Rui Lu, Zhiyao Duan, Changshui Zhang:
Audio-Visual Deep Clustering for Speech Separation. IEEE ACM Trans. Audio Speech Lang. Process. 27(11): 1697-1712 (2019)
- [j13] Bochen Li, Karthik Dinesh, Chenliang Xu, Gaurav Sharma, Zhiyao Duan:
Online Audio-Visual Source Association for Chamber Music Performances. Trans. Int. Soc. Music. Inf. Retr. 2(1): 29-42 (2019)
- [j12] Bochen Li, Xinzhao Liu, Karthik Dinesh, Zhiyao Duan, Gaurav Sharma:
Creating a Multitrack Classical Music Performance Dataset for Multimodal Music Analysis: Challenges, Insights, and Applications. IEEE Trans. Multim. 21(2): 522-535 (2019)
- [c43] Lele Chen, Haitian Zheng, Ross K. Maddox, Zhiyao Duan, Chenliang Xu:
Sound to Visual: Hierarchical Cross-Modal Talking Face Generation. CVPR Workshops 2019: 1-4
- [c42] Yapeng Tian, Jing Shi, Bochen Li, Zhiyao Duan, Chenliang Xu:
Audio-Visual Event Localization in the Wild. CVPR Workshops 2019: 5-8
- [c41] Lele Chen, Ross K. Maddox, Zhiyao Duan, Chenliang Xu:
Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss. CVPR 2019: 7832-7841
- [i9] Lele Chen, Ross K. Maddox, Zhiyao Duan, Chenliang Xu:
Hierarchical Cross-Modal Talking Face Generation with Dynamic Pixel-Wise Loss. CoRR abs/1905.03820 (2019)
- [i8] Yichi Zhang, Yiting Zhang, Zhiyao Duan:
Sound Search by Text Description or Vocal Imitation? CoRR abs/1907.08661 (2019)
- [i7] Mingrui Yuan, Zhiyao Duan:
Spoofing Speaker Verification Systems with Deep Multi-speaker Text-to-speech Synthesis. CoRR abs/1910.13054 (2019)
- 2018
- [j11] Sefik Emre Eskimez, Peter Soufleris, Zhiyao Duan, Wendi B. Heinzelman:
Front-end speech enhancement for commercial speaker verification systems. Speech Commun. 99: 101-113 (2018)
- [j10] Rui Lu, Zhiyao Duan, Changshui Zhang:
Listen and Look: Audio-Visual Matching Assisted Speech Source Separation. IEEE Signal Process. Lett. 25(9): 1315-1319 (2018)
- [c40] Bongjun Kim, Madhav Ghei, Bryan Pardo, Zhiyao Duan:
Vocal Imitation Set: a dataset of vocally imitated sound events using the AudioSet ontology. DCASE 2018: 148-152
- [c39] Yapeng Tian, Jing Shi, Bochen Li, Zhiyao Duan, Chenliang Xu:
Audio-Visual Event Localization in Unconstrained Videos. ECCV (2) 2018: 252-268
- [c38] Lele Chen, Zhiheng Li, Ross K. Maddox, Zhiyao Duan, Chenliang Xu:
Lip Movements Generation at a Glance. ECCV (7) 2018: 538-553
- [c37] Sefik Emre Eskimez, Ross K. Maddox, Chenliang Xu, Zhiyao Duan:
Generating Talking Face Landmarks from Speech. LVA/ICA 2018: 372-381
- [c36] Rui Lu, Zhiyao Duan, Changshui Zhang:
Multi-Scale Recurrent Neural Network for Sound Event Detection. ICASSP 2018: 131-135
- [c35] Xueyang Wang, Ryan Stables, Bochen Li, Zhiyao Duan:
Score-Aligned Polyphonic Microtiming Estimation. ICASSP 2018: 361-365
- [c34] Yichi Zhang, Zhiyao Duan:
Visualization and Interpretation of Siamese Style Convolutional Neural Networks for Sound Search by Vocal Imitation. ICASSP 2018: 2406-2410
- [c33] Zhihan Zhou, Yichi Zhang, Zhiyao Duan:
Joint Speaker Diarization and Recognition Using Convolutional and Recurrent Neural Networks. ICASSP 2018: 2496-2500
- [c32] Sefik Emre Eskimez, Zhiyao Duan, Wendi B. Heinzelman:
Unsupervised Learning Approach to Feature Analysis for Automatic Speech Emotion Recognition. ICASSP 2018: 5099-5103
- [c31] Yujia Yan, Ethan Lustig, Joseph VanderStel, Zhiyao Duan:
Part-invariant Model for Music Generation and Harmonization. ISMIR 2018: 204-210
- [c30] Bochen Li, Akira Maezawa, Zhiyao Duan:
Skeleton Plays Piano: Online Generation of Pianist Body Movements from MIDI Performance. ISMIR 2018: 218-224
- [i6] Yapeng Tian, Jing Shi, Bochen Li, Zhiyao Duan, Chenliang Xu:
Audio-Visual Event Localization in Unconstrained Videos. CoRR abs/1803.08842 (2018)
- [i5] Sefik Emre Eskimez, Ross K. Maddox, Chenliang Xu, Zhiyao Duan:
Generating Talking Face Landmarks from Speech. CoRR abs/1803.09803 (2018)
- [i4] Lele Chen, Zhiheng Li, Ross K. Maddox, Zhiyao Duan, Chenliang Xu:
Lip Movements Generation at a Glance. CoRR abs/1803.10404 (2018)
- 2017
- [j9] Na Yang, Jianbo Yuan, Yun Zhou, Ilker Demirkol, Zhiyao Duan, Wendi B. Heinzelman, Melissa Sturge-Apple:
Enhanced multiclass SVM with thresholding fusion for speech-based emotion classification. Int. J. Speech Technol. 20(1): 27-41 (2017)
- [j8] Andrea Cogliati, Zhiyao Duan, Brendt Wohlberg:
Piano Transcription With Convolutional Sparse Lateral Inhibition. IEEE Signal Process. Lett. 24(4): 392-396 (2017)
- [c29] Rui Lu, Kailun Wu, Zhiyao Duan, Changshui Zhang:
Deep ranking: Triplet MatchNet for music metric learning. ICASSP 2017: 121-125
- [c28] Bochen Li, Karthik Dinesh, Zhiyao Duan, Gaurav Sharma:
See and listen: Score-informed association of sound tracks to players in chamber music performance videos. ICASSP 2017: 2906-2910
- [c27] Karthik Dinesh, Bochen Li, Xinzhao Liu, Zhiyao Duan, Gaurav Sharma:
Visually informed multi-pitch analysis of string ensembles. ICASSP 2017: 3021-3025
- [c26] Bochen Li, Karthik Dinesh, Gaurav Sharma, Zhiyao Duan:
Video-Based Vibrato Detection and Analysis for Polyphonic String Music. ISMIR 2017: 123-130
- [c25] Andrea Cogliati, Zhiyao Duan:
A Metric for Music Notation Transcription Accuracy. ISMIR 2017: 407-413
- [c24] Lele Chen, Sudhanshu Srivastava, Zhiyao Duan, Chenliang Xu:
Deep Cross-Modal Audio-Visual Generation. ACM Multimedia (Thematic Workshops) 2017: 349-357
- [c23] Rui Lu, Zhiyao Duan, Changshui Zhang:
Metric learning based data augmentation for environmental sound classification. WASPAA 2017: 1-5
- [c22] Yichi Zhang, Zhiyao Duan:
IMINET: Convolutional semi-siamese networks for sound search by vocal imitation. WASPAA 2017: 304-308
- [e1] Sally Jo Cunningham, Zhiyao Duan, Xiao Hu, Douglas Turnbull:
Proceedings of the 18th International Society for Music Information Retrieval Conference, ISMIR 2017, Suzhou, China, October 23-27, 2017. 2017, ISBN 978-981-11-5179-8
- [i3] Lele Chen, Sudhanshu Srivastava, Zhiyao Duan, Chenliang Xu:
Deep Cross-Modal Audio-Visual Generation. CoRR abs/1704.08292 (2017)
- 2016
- [j7] Andrea Cogliati, Zhiyao Duan, Brendt Wohlberg:
Context-Dependent Piano Music Transcription With Convolutional Sparse Coding. IEEE ACM Trans. Audio Speech Lang. Process. 24(12): 2218-2230 (2016)
- [j6] Bochen Li, Zhiyao Duan:
An Approach to Score Following for Piano Performances With the Sustained Effect. IEEE ACM Trans. Audio Speech Lang. Process. 24(12): 2425-2438 (2016)
- [c21] Yichi Zhang, Zhiyao Duan:
IMISOUND: An unsupervised system for sound query by vocal imitation. ICASSP 2016: 2269-2273
- [c20] Sefik Emre Eskimez, Kenneth Imade, Na Yang, Melissa Sturge-Apple, Zhiyao Duan, Wendi B. Heinzelman:
Emotion classification: How does an automated system compare to Naive human coders? ICASSP 2016: 2274-2278
- [c19] Sefik Emre Eskimez, Melissa Sturge-Apple, Zhiyao Duan, Wendi B. Heinzelman:
WISE: Web-based Interactive Speech Emotion Classification. SAAIP@IJCAI 2016: 2-7
- [c18] Andrea Cogliati, David Temperley, Zhiyao Duan:
Transcribing Human Piano Performances into Music Notation. ISMIR 2016: 758-764
- [i2] Bochen Li, Xinzhao Liu, Karthik Dinesh, Zhiyao Duan, Gaurav Sharma:
Creating A Musical Performance Dataset for Multimodal Music Analysis: Challenges, Insights, and Applications. CoRR abs/1612.08727 (2016)
- 2015
- [c17] Andrea Cogliati, Zhiyao Duan:
Piano music transcription modeling note temporal evolution. ICASSP 2015: 429-433
- [c16] Bochen Li, Zhiyao Duan:
Score Following for Piano Performances with Sustain-Pedal Effects. ISMIR 2015: 469-475
- [c15] Andrea Cogliati, Zhiyao Duan, Brendt Wohlberg:
Piano music transcription with fast convolutional sparse coding. MLSP 2015: 1-6
- [c14] Yichi Zhang, Zhiyao Duan:
Retrieving sounds by vocal imitation recognition. MLSP 2015: 1-6
- [c13] Jun Zhou, Shuo Chen, Zhiyao Duan:
Rotational reset strategy for online semi-supervised NMF-based speech enhancement for long recordings. WASPAA 2015: 1-5
- [i1] Sefik Emre Eskimez, Kenneth Imade, Na Yang, Melissa Sturge-Apple, Zhiyao Duan, Wendi B. Heinzelman:
Emotion Classification: How Does an Automated System Compare to Naive Human Coders? CoRR abs/1510.06769 (2015)
- 2014
- [j5] Zhiyao Duan, Jinyu Han, Bryan Pardo:
Multi-pitch Streaming of Harmonic Sound Mixtures. IEEE ACM Trans. Audio Speech Lang. Process. 22(1): 138-150 (2014)
- [j4] Zafar Rafii, Zhiyao Duan, Bryan Pardo:
Combining rhythm-based and pitch-based methods for background and melody separation. IEEE ACM Trans. Audio Speech Lang. Process. 22(12): 1884-1893 (2014)
- [c12] Zhiyao Duan, Bryan Pardo, Laurent Daudet:
A novel cepstral representation for timbre modeling of sound sources in polyphonic mixtures. ICASSP 2014: 7495-7499
- [c11] Zhiyao Duan, David Temperley:
Note-level Music Transcription by Maximum Likelihood Sampling. ISMIR 2014: 181-186
- 2012
- [c10] Zhiyao Duan, Gautham J. Mysore, Paris Smaragdis:
Online PLCA for Real-Time Semi-supervised Source Separation. LVA/ICA 2012: 34-41
- [c9] Zhiyao Duan, Gautham J. Mysore, Paris Smaragdis:
Speech Enhancement by Online Non-negative Spectrogram Decomposition in Non-stationary Noise Environments. INTERSPEECH 2012: 595-598
- 2011
- [j3] Zhiyao Duan, Bryan Pardo:
Soundprism: An Online System for Score-Informed Source Separation of Music Audio. IEEE J. Sel. Top. Signal Process. 5(6): 1205-1215 (2011)
- [c8] Zhiyao Duan, Bryan Pardo:
A state space model for online polyphonic audio-score alignment. ICASSP 2011: 197-200
- [c7] Zhiyao Duan, Bryan Pardo:
Aligning Semi-Improvised Music Audio with Its Lead Sheet. ISMIR 2011: 513-518
- 2010
- [j2] Zhiyao Duan, Bryan Pardo, Changshui Zhang:
Multiple Fundamental Frequency Estimation by Modeling Spectral Peaks and Non-Peak Regions. IEEE Trans. Speech Audio Process. 18(8): 2121-2133 (2010)
- [c6] Zhiyao Duan, Jinyu Han, Bryan Pardo:
Song-level multi-pitch tracking by heavily constrained clustering. ICASSP 2010: 57-60
2000 – 2009
- 2009
- [c5] Zhiyao Duan, Jinyu Han, Bryan Pardo:
Harmonically Informed Multi-Pitch Tracking. ISMIR 2009: 333-338
- 2008
- [j1] Zhiyao Duan, Yungang Zhang, Changshui Zhang, Zhenwei Shi:
Unsupervised Single-Channel Music Source Separation by Average Harmonic Structure Modeling. IEEE Trans. Speech Audio Process. 16(4): 766-778 (2008)
- [c4] Zhiyao Duan, Lie Lu, Changshui Zhang:
Audio tonality mode classification without tonic annotations. ICME 2008: 1361-1364
- [c3] Zhiyao Duan, Lie Lu, Changshui Zhang:
Collective Annotation of Music from Multiple Semantic Categories. ISMIR 2008: 237-242
- 2007
- [c2] Nelson Lee, Zhiyao Duan, Julius O. Smith III:
Excitation signal Extraction for Guitar tones. ICMC 2007
- [c1] Zhiyao Duan, Dan Zhang, Changshui Zhang, Zhenwei Shi:
Multi-Pitch Estimation Based on Partial Event and Support Transfer. ICME 2007: 216-219
last updated on 2024-10-07 21:21 CEST by the dblp team
all metadata released as open data under CC0 1.0 license