CCSRD: Content-Centric Speech Representation Disentanglement Learning for End-to-End Speech Translation
Xiaohu Zhao, Haoran Sun, Yikun Lei, Shaolin Zhu, Deyi Xiong. Findings of EMNLP 2023.

Since speech often contains multiple factors, disentangled representation learning provides a way to extract a separate representation for each of them. In this paper, we propose a content-centric speech representation disentanglement learning framework for speech translation, CCSRD, which decomposes speech representations into content and non-content representations. CCSRD consists of a content encoder that encodes linguistic content information from the speech input, a non-content encoder that models non-linguistic speech features, and a disentanglement module that learns disentangled representations with a cyclic reconstructor, a feature reconstructor and a speaker classifier.
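To make the described layout concrete, below is a minimal PyTorch sketch of such a disentanglement setup. It is not the authors' implementation: the module sizes, the pooling, the specific reconstruction and classification losses, and names such as CCSRDSketch are assumptions used only to illustrate how the named components could fit together.

```python
# Illustrative sketch (not the paper's code): a content encoder, a non-content
# encoder, and a disentanglement module with a feature reconstructor, a speaker
# classifier and a cyclic reconstruction term. All sizes and losses are assumed.
import torch
import torch.nn as nn


class CCSRDSketch(nn.Module):
    def __init__(self, feat_dim=80, hidden_dim=256, num_speakers=100):
        super().__init__()
        # Content encoder: keeps the linguistic information used for translation.
        self.content_encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        # Non-content encoder: models non-linguistic factors (e.g. speaker, style).
        self.noncontent_encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        # Feature reconstructor: rebuilds the input from both codes, so that
        # together they retain the full information of the speech signal.
        self.feature_reconstructor = nn.Linear(2 * hidden_dim, feat_dim)
        # Speaker classifier: applied to the non-content code so that speaker
        # identity is captured there rather than in the content code.
        self.speaker_classifier = nn.Linear(hidden_dim, num_speakers)

    def encode(self, feats):
        content, _ = self.content_encoder(feats)        # (B, T, H)
        noncontent, _ = self.noncontent_encoder(feats)  # (B, T, H)
        return content, noncontent

    def forward(self, feats, speaker_ids):
        content, noncontent = self.encode(feats)

        # Feature reconstruction: both codes together should reproduce the input.
        recon = self.feature_reconstructor(torch.cat([content, noncontent], dim=-1))
        recon_loss = nn.functional.mse_loss(recon, feats)

        # Speaker classification on the non-content code (utterance-level pooling).
        spk_logits = self.speaker_classifier(noncontent.mean(dim=1))
        spk_loss = nn.functional.cross_entropy(spk_logits, speaker_ids)

        # Cyclic reconstruction (one possible reading): re-encode the reconstructed
        # features and require the content code to stay unchanged, discouraging
        # content from leaking into the non-content branch.
        cyc_content, _ = self.content_encoder(recon)
        cyc_loss = nn.functional.mse_loss(cyc_content, content.detach())

        return recon_loss + spk_loss + cyc_loss


# Usage: the content representation would feed the translation decoder, and the
# auxiliary losses above would be added to the translation training objective.
model = CCSRDSketch()
feats = torch.randn(4, 50, 80)            # (batch, frames, mel bins)
speakers = torch.randint(0, 100, (4,))    # dummy speaker labels
loss = model(feats, speakers)
loss.backward()
```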