ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis | J Xue, Y Deng, Y Han, Y Li, J Sun, J Liang | 2022 13th International Symposium on Chinese Spoken Language Processing …, 2022 | 13 | 2022 |
M2-CTTS: End-to-End Multi-Scale Multi-Modal Conversational Text-to-Speech Synthesis | J Xue, Y Deng, F Wang, Y Li, Y Gao, J Tao, J Sun, J Liang | ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 10 | 2023 |
Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation | J Xue, Y Deng, Y Gao, Y Li | arXiv preprint arXiv:2401.01044, 2024 | 1 | 2024 |
Frame-level emotional state alignment method for speech emotion recognition | Q Li, Y Gao, C Wang, Y Deng, J Xue, Y Han, Y Li | ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | | 2024 |
Concss: Contrastive-based Context Comprehension for Dialogue-Appropriate Prosody in Conversational Speech Synthesis | Y Deng, J Xue, Y Jia, Q Li, Y Han, F Wang, Y Gao, D Ke, Y Li | ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | | 2024 |
CMCU-CSS: Enhancing Naturalness via Commonsense-based Multi-modal Context Understanding in Conversational Speech Synthesis | Y Deng, J Xue, F Wang, Y Gao, Y Li | Proceedings of the 31st ACM International Conference on Multimedia, 6081-6089, 2023 | | 2023 |
Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis | D Ke, Y Deng, Y Jia, J Xue, Q Luo, Y Li, J Sun, J Liang, B Lin | 2022 13th International Symposium on Chinese Spoken Language Processing …, 2022 | | 2022 |
A Keypoint Based Enhancement Method for Audio Driven Free View Talking Head Synthesis | Y Han, Y Li, Y Gao, J Xue, S Wang, L Yang | 2022 IEEE 24th International Workshop on Multimedia Signal Processing (MMSP …, 2022 | | 2022 |