Tomoki Hayashi
2020 – today
- 2023
- [c67] Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polak, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe:
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit. ACL (demo) 2023: 400-411
- [c66] Kazuhiro Kobayashi, Tomoki Hayashi, Tomoki Toda:
Low-Latency Electrolaryngeal Speech Enhancement Based on Fastspeech2-Based Voice Conversion and Self-Supervised Speech Representation. ICASSP 2023: 1-5
- [i39] Massa Baali, Tomoki Hayashi, Hamdy Mubarak, Soumi Maiti, Shinji Watanabe, Wassim El-Hajj, Ahmed Ali:
Unsupervised Data Selection for TTS: Using Arabic Broadcast News as a Case Study. CoRR abs/2301.09099 (2023)
- 2022
- [j9] Wen-Chin Huang, Shu-Wen Yang, Tomoki Hayashi, Tomoki Toda:
A Comparative Study of Self-Supervised Speech Representation Based Voice Conversion. IEEE J. Sel. Top. Signal Process. 16(6): 1308-1318 (2022)
- [c65] Robin Karlsson, Tomoki Hayashi, Keisuke Fujii, Alexander Carballo, Kento Ohtani, Kazuya Takeda:
Improving Dense Representation Learning by Superpixelization and Contrasting Cluster Assignment. BMVC 2022: 699
- [c64] Sehun Kim, Tomoki Hayashi, Tomoki Toda:
Note-level Automatic Guitar Transcription Using Attention Mechanism. EUSIPCO 2022: 229-233
- [c63] Ibuki Kuroyanagi, Tomoki Hayashi, Kazuya Takeda, Tomoki Toda:
Improvement of Serial Approach to Anomalous Sound Detection by Incorporating Two Binary Cross-Entropies for Outlier Exposure. EUSIPCO 2022: 294-298
- [c62] Wen-Chin Huang, Shu-Wen Yang, Tomoki Hayashi, Hung-Yi Lee, Shinji Watanabe, Tomoki Toda:
S3PRL-VC: Open-Source Voice Conversion Framework with Self-Supervised Speech Representations. ICASSP 2022: 6552-6556
- [c61] Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda:
An Investigation of Streaming Non-Autoregressive sequence-to-sequence Voice Conversion. ICASSP 2022: 6802-6806
- [c60] Jiatong Shi, Shuai Guo, Tao Qian, Tomoki Hayashi, Yuning Wu, Fangzheng Xu, Xuankai Chang, Huazhe Li, Peter Wu, Shinji Watanabe, Qin Jin:
Muskits: an End-to-end Music Processing Toolkit for Singing Voice Synthesis. INTERSPEECH 2022: 4277-4281
- [c59] Takumi Yamamoto, Kento Ohtani, Tomoki Hayashi, Alexander Carballo, Kazuya Takeda:
Efficient Training Method for Point Cloud-Based Object Detection Models by Combining Environmental Transitions and Active Learning. RiTA 2022: 292-303
- [i38] Tatsuya Komatsu, Shinji Watanabe, Koichi Miyazaki, Tomoki Hayashi:
Acoustic Event Detection with Classifier Chains. CoRR abs/2202.08470 (2022)
- [i37] Jiatong Shi, Shuai Guo, Tao Qian, Nan Huo, Tomoki Hayashi, Yuning Wu, Frank Xu, Xuankai Chang, Huazhe Li, Peter Wu, Shinji Watanabe, Qin Jin:
Muskits: an End-to-End Music Processing Toolkit for Singing Voice Synthesis. CoRR abs/2205.04029 (2022)
- [i36] Ibuki Kuroyanagi, Tomoki Hayashi, Kazuya Takeda, Tomoki Toda:
Improvement of Serial Approach to Anomalous Sound Detection by Incorporating Two Binary Cross-Entropies for Outlier Exposure. CoRR abs/2206.05929 (2022)
- [i35] Wen-Chin Huang, Shu-Wen Yang, Tomoki Hayashi, Tomoki Toda:
A Comparative Study of Self-supervised Speech Representation Based Voice Conversion. CoRR abs/2207.04356 (2022)
- 2021
- [j8] Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, Tomoki Toda:
Pretraining Techniques for Sequence-to-Sequence Voice Conversion. IEEE ACM Trans. Audio Speech Lang. Process. 29: 745-755 (2021)
- [j7] Yi-Chiao Wu, Tomoki Hayashi, Takuma Okamoto, Hisashi Kawai, Tomoki Toda:
Quasi-Periodic Parallel WaveGAN: A Non-Autoregressive Raw Waveform Generative Model With Pitch-Dependent Dilated Convolution Neural Network. IEEE ACM Trans. Audio Speech Lang. Process. 29: 792-806 (2021)
- [j6] Yi-Chiao Wu, Tomoki Hayashi, Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Toda:
Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model With Pitch-Dependent Dilated Convolution Neural Network. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1134-1148 (2021)
- [c58] Wen-Chin Huang, Tomoki Hayashi, Xinjian Li, Shinji Watanabe, Tomoki Toda:
On Prosody Modeling for ASR+TTS Based Voice Conversion. ASRU 2021: 642-649
- [c57] Ibuki Kuroyanagi, Tomoki Hayashi, Yusuke Adachi, Takenori Yoshimura, Kazuya Takeda, Tomoki Toda:
An Ensemble Approach to Anomalous Sound Detection Based on Conformer-Based Autoencoder and Binary Classifier Incorporated with Metric Learning. DCASE 2021: 110-114
- [c56] Chaitanya Prasad Narisetty, Tomoki Hayashi, Ryunosuke Ishizaki, Shinji Watanabe, Kazuya Takeda:
Leveraging State-of-the-art ASR Techniques to Audio Captioning. DCASE 2021: 160-164
- [c55] Tomoki Hayashi, Takenori Yoshimura, Masaya Inuzuka, Ibuki Kuroyanagi, Osamu Segawa:
Spontaneous Speech Summarization: Transformers All The Way Through. EUSIPCO 2021: 456-460
- [c54] Ibuki Kuroyanagi, Tomoki Hayashi, Kazuya Takeda, Tomoki Toda:
Anomalous Sound Detection Using a Binary Classification Model and Class Centroids. EUSIPCO 2021: 1995-1999
- [c53] Pengcheng Guo, Florian Boyer, Xuankai Chang, Tomoki Hayashi, Yosuke Higuchi, Hirofumi Inaguma, Naoyuki Kamo, Chenda Li, Daniel Garcia-Romero, Jiatong Shi, Jing Shi, Shinji Watanabe, Kun Wei, Wangyou Zhang, Yuekai Zhang:
Recent Developments on Espnet Toolkit Boosted By Conformer. ICASSP 2021: 5874-5878
- [c52] Kazuhiro Kobayashi, Wen-Chin Huang, Yi-Chiao Wu, Patrick Lumban Tobing, Tomoki Hayashi, Tomoki Toda:
Crank: An Open-Source Software for Nonparallel Voice Conversion Based on Vector-Quantized Variational Autoencoder. ICASSP 2021: 5934-5938
- [c51] Wen-Chin Huang, Yi-Chiao Wu, Tomoki Hayashi:
Any-to-One Sequence-to-Sequence Voice Conversion Using Self-Supervised Discrete Speech Representations. ICASSP 2021: 5944-5948
- [c50] Tomoki Hayashi, Wen-Chin Huang, Kazuhiro Kobayashi, Tomoki Toda:
Non-Autoregressive Sequence-To-Sequence Voice Conversion. ICASSP 2021: 7068-7072
- [c49] Tatsuya Komatsu, Shinji Watanabe, Koichi Miyazaki, Tomoki Hayashi:
Acoustic Event Detection with Classifier Chains. Interspeech 2021: 601-605
- [c48] Chenda Li, Jing Shi, Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Naoyuki Kamo, Moto Hira, Tomoki Hayashi, Christoph Böddeker, Zhuo Chen, Shinji Watanabe:
ESPnet-SE: End-To-End Speech Enhancement and Separation Toolkit Designed for ASR Integration. SLT 2021: 785-792
- [i34] Kazuhiro Kobayashi, Wen-Chin Huang, Yi-Chiao Wu, Patrick Lumban Tobing, Tomoki Hayashi, Tomoki Toda:
crank: An Open-Source Software for Nonparallel Voice Conversion Based on Vector-Quantized Variational Autoencoder. CoRR abs/2103.02858 (2021)
- [i33] Tomoki Hayashi, Wen-Chin Huang, Kazuhiro Kobayashi, Tomoki Toda:
Non-autoregressive sequence-to-sequence voice conversion. CoRR abs/2104.06793 (2021)
- [i32] Ibuki Kuroyanagi, Tomoki Hayashi, Kazuya Takeda, Tomoki Toda:
Anomalous Sound Detection Using a Binary Classification Model and Class Centroids. CoRR abs/2106.06151 (2021)
- [i31] Wen-Chin Huang, Tomoki Hayashi, Xinjian Li, Shinji Watanabe, Tomoki Toda:
On Prosody Modeling for ASR+TTS based Voice Conversion. CoRR abs/2107.09477 (2021)
- [i30] Wen-Chin Huang, Shu-Wen Yang, Tomoki Hayashi, Hung-Yi Lee, Shinji Watanabe, Tomoki Toda:
S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations. CoRR abs/2110.06280 (2021)
- [i29] Tomoki Hayashi, Ryuichi Yamamoto, Takenori Yoshimura, Peter Wu, Jiatong Shi, Takaaki Saeki, Yooncheol Ju, Yusuke Yasuda, Shinnosuke Takamichi, Shinji Watanabe:
ESPnet2-TTS: Extending the Edge of TTS Research. CoRR abs/2110.07840 (2021)
- [i28] Robin Karlsson, Tomoki Hayashi, Keisuke Fujii, Alexander Carballo, Kento Ohtani, Kazuya Takeda:
ViCE: Self-Supervised Visual Concept Embeddings as Contextual and Pixel Appearance Invariant Semantic Representations. CoRR abs/2111.12460 (2021)
- [i27] Jing Shi, Xuankai Chang, Tomoki Hayashi, Yen-Ju Lu, Shinji Watanabe, Bo Xu:
Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem. CoRR abs/2112.09382 (2021)
- 2020
- [j5] Yi-Chiao Wu, Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Hayashi, Tomoki Toda:
Non-Parallel Voice Conversion System With WaveNet Vocoder and Collapsed Speech Suppression. IEEE Access 8: 62094-62106 (2020)
- [c47] Hirofumi Inaguma, Shun Kiyono, Kevin Duh, Shigeki Karita, Nelson Yalta, Tomoki Hayashi, Shinji Watanabe:
ESPnet-ST: All-in-One Speech Translation Toolkit. ACL (demo) 2020: 302-311
- [c46] Wen-Chin Huang, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda:
The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS. Blizzard Challenge / Voice Conversion Challenge 2020
- [c45] Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda:
Conformer-Based Sound Event Detection with Semi-Supervised Learning and Data Augmentation. DCASE 2020: 100-104
- [c44] Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda:
Weakly-Supervised Sound Event Detection with Self-Attention. ICASSP 2020: 66-70
- [c43] Takenori Yoshimura, Tomoki Hayashi, Kazuya Takeda, Shinji Watanabe:
End-to-End Automatic Speech Recognition Integrated with CTC-Based Voice Activity Detection. ICASSP 2020: 6999-7003
- [c42] Patrick Lumban Tobing, Yi-Chiao Wu, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda:
Efficient Shallow Wavenet Vocoder Using Multiple Samples Output Based on Laplacian Distribution and Linear Prediction. ICASSP 2020: 7204-7208
- [c41] Katsuki Inoue, Sunao Hara, Masanobu Abe, Tomoki Hayashi, Ryuichi Yamamoto, Shinji Watanabe:
Semi-Supervised Speaker Adaptation for End-to-End Speech Synthesis with Pretrained Models. ICASSP 2020: 7634-7638
- [c40] Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan:
Espnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit. ICASSP 2020: 7654-7658
- [c39] Yi-Chiao Wu, Tomoki Hayashi, Takuma Okamoto, Hisashi Kawai, Tomoki Toda:
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-Autoregressive Pitch-Dependent Dilated Convolution Model for Parametric Speech Generation. INTERSPEECH 2020: 3535-3539
- [c38] Shu Hikosaka, Shogo Seki, Tomoki Hayashi, Kazuhiro Kobayashi, Kazuya Takeda, Hideki Banno, Tomoki Toda:
Intelligibility Enhancement Based on Speech Waveform Modification Using Hearing Impairment. INTERSPEECH 2020: 4059-4063
- [c37] Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, Tomoki Toda:
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining. INTERSPEECH 2020: 4676-4680
- [c36] Patrick Lumban Tobing, Tomoki Hayashi, Yi-Chiao Wu, Kazuhiro Kobayashi, Tomoki Toda:
Cyclic Spectral Modeling for Unsupervised Unit Discovery into Voice Conversion with Excitation and Waveform Modeling. INTERSPEECH 2020: 4861-4865
- [i26] Takenori Yoshimura, Tomoki Hayashi, Kazuya Takeda, Shinji Watanabe:
End-to-End Automatic Speech Recognition Integrated With CTC-Based Voice Activity Detection. CoRR abs/2002.00551 (2020)
- [i25] Yi-Chiao Wu, Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Hayashi, Tomoki Toda:
Non-parallel Voice Conversion System with WaveNet Vocoder and Collapsed Speech Suppression. CoRR abs/2003.11750 (2020)
- [i24] Hirofumi Inaguma, Shun Kiyono, Kevin Duh, Shigeki Karita, Nelson Enrique Yalta Soplin, Tomoki Hayashi, Shinji Watanabe:
ESPnet-ST: All-in-One Speech Translation Toolkit. CoRR abs/2004.10234 (2020)
- [i23] Tomoki Hayashi, Shinji Watanabe:
DiscreTalk: Text-to-Speech as a Machine Translation Problem. CoRR abs/2005.05525 (2020)
- [i22] Yi-Chiao Wu, Tomoki Hayashi, Takuma Okamoto, Hisashi Kawai, Tomoki Toda:
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation. CoRR abs/2005.08654 (2020)
- [i21] Yi-Chiao Wu, Tomoki Hayashi, Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Toda:
Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network. CoRR abs/2007.05663 (2020)
- [i20] Yi-Chiao Wu, Tomoki Hayashi, Takuma Okamoto, Hisashi Kawai, Tomoki Toda:
Quasi-Periodic Parallel WaveGAN: A Non-autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network. CoRR abs/2007.12955 (2020)
- [i19] Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, Tomoki Toda:
Pretraining Techniques for Sequence-to-Sequence Voice Conversion. CoRR abs/2008.03088 (2020)
- [i18] Wen-Chin Huang, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda:
The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS. CoRR abs/2010.02434 (2020)
- [i17] Wen-Chin Huang, Yi-Chiao Wu, Tomoki Hayashi, Tomoki Toda:
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations. CoRR abs/2010.12231 (2020)
- [i16] Pengcheng Guo, Florian Boyer, Xuankai Chang, Tomoki Hayashi, Yosuke Higuchi, Hirofumi Inaguma, Naoyuki Kamo, Chenda Li, Daniel Garcia-Romero, Jiatong Shi, Jing Shi, Shinji Watanabe, Kun Wei, Wangyou Zhang, Yuekai Zhang:
Recent Developments on ESPnet Toolkit Boosted by Conformer. CoRR abs/2010.13956 (2020)
- [i15] Chenda Li, Jing Shi, Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Naoyuki Kamo, Moto Hira, Tomoki Hayashi, Christoph Böddeker, Zhuo Chen, Shinji Watanabe:
ESPnet-se: end-to-end speech enhancement and separation toolkit designed for asr integration. CoRR abs/2011.03706 (2020)
- [i14] Shinji Watanabe, Florian Boyer, Xuankai Chang, Pengcheng Guo, Tomoki Hayashi, Yosuke Higuchi, Takaaki Hori, Wen-Chin Huang, Hirofumi Inaguma, Naoyuki Kamo, Shigeki Karita, Chenda Li, Jing Shi, Aswin Shanmugam Subramanian, Wangyou Zhang:
The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans. CoRR abs/2012.13006 (2020)
2010 – 2019
- 2019
- [j4] Patrick Lumban Tobing, Yi-Chiao Wu, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda:
Voice Conversion With CycleRNN-Based Spectral Mapping and Finely Tuned WaveNet Vocoder. IEEE Access 7: 171114-171125 (2019)
- [c35] Patrick Lumban Tobing, Tomoki Hayashi, Tomoki Toda:
Investigation of Shallow Wavenet Vocoder with Laplacian Distribution Output. ASRU 2019: 176-183
- [c34] Shigeki Karita, Xiaofei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto:
A Comparative Study on Transformer vs RNN in Speech Applications. ASRU 2019: 449-456
- [c33] Osamu Segawa, Tomoki Hayashi, Kazuya Takeda:
Attention-Based Speech Recognition Using Gaze Information. ASRU 2019: 465-470
- [c32] Wen-Chin Huang, Yi-Chiao Wu, Hsin-Te Hwang, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion. EUSIPCO 2019: 1-5
- [c31] Tatsuya Komatsu, Tomoki Hayashi, Reishi Kondo, Tomoki Toda, Kazuya Takeda:
Scene-dependent Anomalous Acoustic-event Detection Based on Conditional Wavenet and I-vector. ICASSP 2019: 870-874
- [c30] Takaaki Hori, Ramón Fernandez Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, Jonathan Le Roux:
Cycle-consistency Training for End-to-end Speech Recognition. ICASSP 2019: 6271-6275
- [c29] Patrick Lumban Tobing, Yi-Chiao Wu, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda:
Voice Conversion with Cyclic Recurrent Neural Network and Fine-tuned Wavenet Vocoder. ICASSP 2019: 6815-6819
- [c28] Yi-Chiao Wu, Tomoki Hayashi, Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Toda:
Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation. INTERSPEECH 2019: 196-200
- [c27] Patrick Lumban Tobing, Yi-Chiao Wu, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda:
Non-Parallel Voice Conversion with Cyclic Variational Autoencoder. INTERSPEECH 2019: 674-678
- [c26] Wen-Chin Huang, Yi-Chiao Wu, Chen-Chou Lo, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Investigation of F0 Conditioning and Fully Convolutional Networks in Variational Autoencoder Based Voice Conversion. INTERSPEECH 2019: 709-713
- [c25] Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Shubham Toshniwal, Karen Livescu:
Pre-Trained Text Embeddings for Enhanced Text-to-Speech Synthesis. INTERSPEECH 2019: 4430-4434
- [c24] Yi-Chiao Wu, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda:
Statistical Voice Conversion with Quasi-periodic WaveNet Vocoder. SSW 2019: 63-68
- [i13] Wen-Chin Huang, Yi-Chiao Wu, Chen-Chou Lo, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Investigation of F0 conditioning and Fully Convolutional Networks in Variational Autoencoder based Voice Conversion. CoRR abs/1905.00615 (2019)
- [i12] Yi-Chiao Wu, Tomoki Hayashi, Patrick Lumban Tobing, Kazuhiro Kobayashi, Tomoki Toda:
Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation. CoRR abs/1907.00797 (2019)
- [i11] Yi-Chiao Wu, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda:
Statistical Voice Conversion with Quasi-Periodic WaveNet Vocoder. CoRR abs/1907.08940 (2019)
- [i10] Patrick Lumban Tobing, Yi-Chiao Wu, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda:
Non-Parallel Voice Conversion with Cyclic Variational Autoencoder. CoRR abs/1907.10185 (2019)
- [i9] Shigeki Karita, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto, Xiaofei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang:
A Comparative Study on Transformer vs RNN in Speech Applications. CoRR abs/1909.06317 (2019)
- [i8] Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, Xu Tan:
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit. CoRR abs/1910.10909 (2019)
- [i7] Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu, Hirokazu Kameoka, Tomoki Toda:
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining. CoRR abs/1912.06813 (2019)
- 2018
- [j3] Tomoki Hayashi, Masafumi Nishida, Norihide Kitaoka, Tomoki Toda, Kazuya Takeda:
Daily Activity Recognition with Large-Scaled Real-Life Recording Datasets Based on Deep Neural Network Using Multi-Modal Signals. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 101-A(1): 199-210 (2018)
- [c23] Koichi Miyazaki, Tomoki Hayashi, Tomoki Toda, Kazuya Takeda:
Connectionist Temporal Classification-based Sound Event Encoder for Converting Sound Events into Onomatopoeic Representations. EUSIPCO 2018: 852-856
- [c22] Tomoki Hayashi, Tatsuya Komatsu, Reishi Kondo, Tomoki Toda, Kazuya Takeda:
Anomalous Sound Event Detection Based on WaveNet. EUSIPCO 2018: 2494-2498
- [c21] Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda:
Multi-Head Decoder for End-to-End Speech Recognition. INTERSPEECH 2018: 801-805
- [c20] Yi-Chiao Wu, Kazuhiro Kobayashi, Tomoki Hayashi, Patrick Lumban Tobing, Tomoki Toda:
Collapsed Speech Segment Detection and Suppression for WaveNet Vocoder. INTERSPEECH 2018: 1988-1992
- [c19] Shinji Watanabe, Takaaki Hori, Shigeki Karita, Tomoki Hayashi, Jiro Nishitoba, Yuya Unno, Nelson Enrique Yalta Soplin, Jahn Heymann, Matthew Wiesner, Nanxin Chen, Adithya Renduchintala, Tsubasa Ochiai:
ESPnet: End-to-End Speech Processing Toolkit. INTERSPEECH 2018: 2207-2211
- [c18] Yi-Chiao Wu, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda:
The NU Non-Parallel Voice Conversion System for the Voice Conversion Challenge 2018. Odyssey 2018: 211-218
- [c17] Patrick Lumban Tobing, Yi-Chiao Wu, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda:
NU Voice Conversion System for the Voice Conversion Challenge 2018. Odyssey 2018: 219-226
- [c16] Patrick Lumban Tobing, Tomoki Hayashi, Yi-Chiao Wu, Kazuhiro Kobayashi, Tomoki Toda:
An Evaluation of Deep Spectral Mappings and WaveNet Vocoder for Voice Conversion. SLT 2018: 297-303
- [c15] Tomoki Hayashi, Shinji Watanabe, Yu Zhang, Tomoki Toda, Takaaki Hori, Ramón Fernandez Astudillo, Kazuya Takeda:
Back-Translation-Style Data Augmentation for end-to-end ASR. SLT 2018: 426-433
- [i6] Shinji Watanabe, Takaaki Hori, Shigeki Karita, Tomoki Hayashi, Jiro Nishitoba, Yuya Unno, Nelson Enrique Yalta Soplin, Jahn Heymann, Matthew Wiesner, Nanxin Chen, Adithya Renduchintala, Tsubasa Ochiai:
ESPnet: End-to-End Speech Processing Toolkit. CoRR abs/1804.00015 (2018)
- [i5] Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda:
Multi-Head Decoder for End-to-End Speech Recognition. CoRR abs/1804.08050 (2018)
- [i4] Yi-Chiao Wu, Kazuhiro Kobayashi, Tomoki Hayashi, Patrick Lumban Tobing, Tomoki Toda:
Collapsed speech segment detection and suppression for WaveNet vocoder. CoRR abs/1804.11055 (2018)
- [i3] Tomoki Hayashi, Shinji Watanabe, Yu Zhang, Tomoki Toda, Takaaki Hori, Ramón Fernandez Astudillo, Kazuya Takeda:
Back-Translation-Style Data Augmentation for End-to-End ASR. CoRR abs/1807.10893 (2018)
- [i2] Takaaki Hori, Ramón Fernandez Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, Jonathan Le Roux:
Cycle-consistency training for end-to-end speech recognition. CoRR abs/1811.01690 (2018)
- [i1] Wen-Chin Huang, Yi-Chiao Wu, Hsin-Te Hwang, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion. CoRR abs/1811.11078 (2018)
- 2017
- [j2] Shinji Watanabe, Takaaki Hori, Suyoun Kim, John R. Hershey, Tomoki Hayashi:
Hybrid CTC/Attention Architecture for End-to-End Speech Recognition. IEEE J. Sel. Top. Signal Process. 11(8): 1240-1253 (2017)
- [j1] Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Takaaki Hori, Jonathan Le Roux, Kazuya Takeda:
Duration-Controlled LSTM for Polyphonic Sound Event Detection. IEEE ACM Trans. Audio Speech Lang. Process. 25(11): 2059-2070 (2017)
- [c14] Akira Tamamori, Tomoki Hayashi, Tomoki Toda, Kazuya Takeda:
An investigation of recurrent neural network for daily activity recognition using multi-modal signals. APSIPA 2017: 1334-1340
- [c13] Tomoki Hayashi, Akira Tamamori, Kazuhiro Kobayashi, Kazuya Takeda, Tomoki Toda:
An investigation of multi-speaker training for wavenet vocoder. ASRU 2017: 712-718
- [c12] Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Takaaki Hori, Jonathan Le Roux, Kazuya Takeda:
BLSTM-HMM hybrid system combined with sound activity detection network for polyphonic Sound Event Detection. ICASSP 2017: 766-770
- [c11] Akira Tamamori, Tomoki Hayashi, Kazuhiro Kobayashi, Kazuya Takeda, Tomoki Toda:
Speaker-Dependent WaveNet Vocoder. INTERSPEECH 2017: 1118-1122
- [c10] Kazuhiro Kobayashi, Tomoki Hayashi, Akira Tamamori, Tomoki Toda:
Statistical Voice Conversion with WaveNet-Based Waveform Generation. INTERSPEECH 2017: 1138-1142
- 2016
- [c9] Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Takaaki Hori, Jonathan Le Roux, Kazuya Takeda:
Bidirectional LSTM-HMM Hybrid System for Polyphonic Sound Event Detection. DCASE 2016: 35-39
- 2015
- [c8] Tomoki Hayashi, Masafumi Nishida, Norihide Kitaoka, Kazuya Takeda:
Daily activity recognition based on DNN using environmental sound and acceleration signals. EUSIPCO 2015: 2306-2310
- [c7] Shoko Araki, Tomoki Hayashi, Marc Delcroix, Masakiyo Fujimoto, Kazuya Takeda, Tomohiro Nakatani:
Exploring multi-channel features for denoising-autoencoder-based speech enhancement. ICASSP 2015: 116-120
- 2014
- [c6] Norihide Kitaoka, Tomoki Hayashi, Kazuya Takeda:
Noisy speech recognition using blind spatial subtraction array technique and deep bottleneck features. APSIPA 2014: 1-5
- 2013
- [c5] Tomomi Hatanaka, Tomoki Hayashi, Keita Suzuki, Hiroaki Sawano, Takeshi Tsuchiya, Kei'ichi Koyanagi:
Dream board: a visualization system by handwriting recognition. SIGGRAPH ASIA Posters 2013: 22
- [c4] Naoki Shimizu, Takumi Yoshida, Tomoki Hayashi, François de Sorbier, Hideo Saito:
Non-rigid Surface Tracking for Virtual Fitting System. VISAPP (2) 2013: 12-18
- 2012
- [c3] Tomoki Hayashi, François de Sorbier, Hideo Saito:
Texture Overlay onto Non-rigid Surface using Commodity Depth Camera. VISAPP (2) 2012: 66-71
- 2011
- [c2] Tomoki Hayashi, Benjamin Raynal, Vincent Nozick, Hideo Saito:
Skeleton Features Distribution for 3D Object Retrieval. MVA 2011: 377-380
- 2010
- [c1] Tomoki Hayashi, Hideaki Uchiyama, Julien Pilet, Hideo Saito:
An Augmented Reality Setup with an Omnidirectional Camera Based on Multiple Object Detection. ICPR 2010: 3171-3174
last updated on 2024-10-07 21:24 CEST by the dblp team
all metadata released as open data under CC0 1.0 license