Haoyu Lu
Title
Cited by
Year
Towards artificial general intelligence via a multimodal foundation model
N Fei, Z Lu, Y Gao, G Yang, Y Huo, J Wen, H Lu, R Song, X Gao, T Xiang, ...
Nature Communications 13 (1), 3094, 2022
Cited by: 259, Year: 2022
DeepSeek LLM: Scaling open-source language models with longtermism
X Bi, D Chen, G Chen, S Chen, D Dai, C Deng, H Ding, K Dong, Q Du, ...
arXiv preprint arXiv:2401.02954, 2024
Cited by: 206, Year: 2024
DeepSeek-VL: Towards Real-world Vision-language Understanding
H Lu, W Liu, B Zhang, B Wang, K Dong, B Liu, J Sun, T Ren, Z Li, Y Sun, ...
arXiv preprint arXiv:2403.05525, 2024
Cited by: 201, Year: 2024
WenLan: Bridging vision and language by large-scale multi-modal pre-training
Y Huo, M Zhang, G Liu, H Lu, Y Gao, G Yang, J Wen, H Zhang, B Xu, ...
arXiv preprint arXiv:2103.06561, 2021
Cited by: 183*, Year: 2021
COTS: Collaborative two-stream vision-language pre-training model for cross-modal retrieval
H Lu, N Fei, Y Huo, Y Gao, Z Lu, JR Wen
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 80, Year: 2022
VDT: General-purpose Video Diffusion Transformers via Mask Modeling
H Lu, G Yang, N Fei, Y Huo, Z Lu, P Luo, M Ding
The Twelfth International Conference on Learning Representations, 2024
Cited by: 59*, Year: 2024
UniAdapter: Unified parameter-efficient transfer learning for cross-modal modeling
H Lu, Y Huo, G Yang, Z Lu, W Zhan, M Tomizuka, M Ding
The Twelfth International Conference on Learning Representations, 2024
Cited by: 30, Year: 2024
Self-supervised video representation learning with constrained spatiotemporal jigsaw
Y Huo, M Ding, H Lu, Z Lu, T Xiang, JR Wen, Z Huang, J Jiang, S Zhang, ...
Cited by: 22, Year: 2021
Learning versatile neural architectures by propagating network codes
M Ding, Y Huo, H Lu, L Yang, Z Wang, Z Lu, J Wang, P Luo
arXiv preprint arXiv:2103.13253, 2021
Cited by: 16, Year: 2021
LGDN: Language-Guided Denoising Network for Video-Language Modeling
H Lu, M Ding, N Fei, Y Huo, Z Lu
Advances in Neural Information Processing Systems, 2022
Cited by: 12, Year: 2022
Compressed video contrastive learning
Y Huo, M Ding, H Lu, N Fei, Z Lu, JR Wen, P Luo
Advances in Neural Information Processing Systems 34, 14176-14187, 2021
Cited by: 10, Year: 2021
Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs
Z Zhao, H Lu, Y Huo, Y Du, T Yue, L Guo, B Wang, W Chen, J Liu
arXiv preprint arXiv:2406.09367, 2024
Cited by: 8, Year: 2024
Towards Event-oriented Long Video Understanding
Y Du, K Zhou, Y Huo, Y Li, WX Zhao, H Lu, Z Zhao, B Wang, W Chen, ...
arXiv preprint arXiv:2406.14129, 2024
Cited by: 6, Year: 2024
BMU-MoCo: Bidirectional momentum update for continual video-language modeling
Y Gao, N Fei, H Lu, Z Lu, H Jiang, Y Li, Z Cao
Advances in Neural Information Processing Systems 35, 22699-22712, 2022
Cited by: 5, Year: 2022
Kimi k1.5: Scaling reinforcement learning with LLMs
K Team, A Du, B Gao, B Xing, C Jiang, C Chen, C Li, C Xiao, C Du, C Liao, ...
arXiv preprint arXiv:2501.12599, 2025
Cited by: 2, Year: 2025
Exploring the design space of visual context representation in video MLLMs
Y Du, Y Huo, K Zhou, Z Zhao, H Lu, H Huang, WX Zhao, B Wang, W Chen, ...
arXiv preprint arXiv:2410.13694, 2024
Cited by: 1, Year: 2024
Beyond Filtering: Adaptive Image-Text Quality Enhancement for MLLM Pretraining
H Huang, Y Huo, Z Zhao, H Lu, S Wu, B Wang, Q Liu, W Chen, L Wang
arXiv preprint arXiv:2410.16166, 2024
Year: 2024