Yuancheng Wang
The Chinese University of Hong Kong, Shenzhen
Verified email at link.cuhk.edu.cn
Title
Cited by
Year
NaturalSpeech 3: Zero-shot speech synthesis with factorized codec and diffusion models
Z Ju*, Y Wang*, K Shen*, X Tan*, D Xin, D Yang, Y Liu, Y Leng, K Song, ...
International Conference on Machine Learning (ICML 2024), 2024
Cited by: 109; Year: 2024
AUDIT: Audio editing by following instructions with latent diffusion models
Y Wang, Z Ju, X Tan, L He, Z Wu, J Bian
Advances in Neural Information Processing Systems (NeurIPS 2023), 2023
Cited by: 44; Year: 2023
Automated testing of image captioning systems
B Yu, Z Zhong, X Qin, J Yao, Y Wang, P He
Proceedings of the 31st ACM SIGSOFT International Symposium on Software …, 2022
Cited by: 26; Year: 2022
Amphion: An open-source audio, music and speech generation toolkit
X Zhang*, L Xue*, Y Gu*, Y Wang*, J Li*, H He, C Wang, S Liu, X Chen, ...
IEEE Spoken Language Technology Workshop (SLT 2024), 2023
Cited by: 21; Year: 2023
FoleyCrafter: Bring silent videos to life with lifelike and synchronized sounds
Y Zhang, Y Gu, Y Zeng, Z Xing, Y Wang, Z Wu, K Chen
arXiv preprint arXiv:2407.01494, 2024
Cited by: 19; Year: 2024
RALL-E: Robust codec language modeling with chain-of-thought prompting for text-to-speech synthesis
D Xin, X Tan, K Shen, Z Ju, D Yang, Y Wang, S Takamichi, H Saruwatari, ...
arXiv preprint arXiv:2404.03204, 2024
Cited by: 19; Year: 2024
Emilia: An extensive, multilingual, and diverse speech dataset for large-scale speech generation
H He, Z Shang, C Wang, X Li, Y Gu, H Hua, L Liu, C Yang, J Li, P Shi, ...
IEEE Spoken Language Technology Workshop (SLT 2024), 2024
Cited by: 14; Year: 2024
MaskGCT: Zero-shot text-to-speech with masked generative codec transformer
Y Wang, H Zhan, L Liu, R Zeng, H Guo, J Zheng, Q Zhang, X Zhang, ...
arXiv preprint arXiv:2409.00750, 2024
Cited by: 8; Year: 2024
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words
J Ao*, Y Wang*, X Tian, D Chen, J Zhang, L Lu, Y Wang, H Li, Z Wu
Advances in Neural Information Processing Systems (NeurIPS 2024), 2024
Cited by: 4; Year: 2024
Debatts: Zero-Shot Debating Text-to-Speech Synthesis
Y Huang, Y Wang, J Li, H Guo, H He, S Zhang, Z Wu
arXiv preprint arXiv:2411.06540, 2024
Cited by: 1; Year: 2024
Noro: A Noise-Robust One-shot Voice Conversion System with Hidden Speaker Representation Capabilities
H He, Y Song, Y Wang, H Li, X Zhang, L Wang, G Huang, ES Chng, Z Wu
arXiv preprint arXiv:2411.19770, 2024
Year: 2024