SimulLR: Simultaneous lip reading transducer with attention-guided adaptive memory Z Lin, Z Zhao, H Li, J Liu, M Zhang, X Zeng, X He Proceedings of the 29th ACM International Conference on Multimedia, 1359-1367, 2021 | 15 | 2021 |
Video-guided curriculum learning for spoken video grounding Y Xia, Z Zhao, S Ye, Y Zhao, H Li, Y Ren Proceedings of the 30th ACM International Conference on Multimedia, 5191-5200, 2022 | 6 | 2022 |
Towards effective multi-modal interchanges in zero-resource sounding object localization Y Zhao, C Zhang, H Huang, H Li, Z Zhao Advances in Neural Information Processing Systems 35, 38089-38102, 2022 | 5 | 2022 |
DATE: Domain adaptive product seeker for e-commerce H Li, H Jiang, T Jin, M Li, Y Chen, Z Lin, Y Zhao, Z Zhao Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 4 | 2023 |
HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models W Zhang, T Lin, J Liu, F Shu, H Li, L Zhang, H Wanggui, H Zhou, Z Lv, ... arXiv preprint arXiv:2403.13447, 2024 | 1 | 2024 |
TrainerAgent: Customizable and Efficient Model Training through LLM-Powered Multi-Agent System H Li, H Jiang, T Zhang, Z Yu, A Yin, H Cheng, S Fu, Y Zhang, W He arXiv preprint arXiv:2311.06622, 2023 | 1 | 2023 |
Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback W Xiao, Z Huang, L Gan, W He, H Li, Z Yu, H Jiang, F Wu, L Zhu arXiv preprint arXiv:2404.14233, 2024 | | 2024 |
Language Model is a Branch Predictor for Simultaneous Machine Translation A Yin, T Zhong, H Li, S Tang, Z Zhao ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024 | | 2024 |
Weakly-Supervised Video Moment Retrieval via Regularized Two-Branch Proposal Networks with Erasing Mechanism H Li, Z Zhao, Z Zhang, Z Lin arXiv preprint arXiv:2311.13946, 2023 | | 2023 |