Video-LLaVA: Learning United Visual Representation by Alignment Before Projection B Lin, Y Ye, B Zhu, M Ning, P Jin, L Yuan EMNLP 2024, 2023 | 324 | 2023 |
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models B Lin, Z Tang, Y Ye, J Cui, B Zhu, P Jin, J Zhang, M Ning, L Yuan arXiv preprint arXiv:2401.15947, 2024 | 129 | 2024 |
LanguageBind: Extending video-language pretraining to n-modality by language-based semantic alignment B Zhu*, B Lin*, M Ning, Y Yan, J Cui, HF Wang, Y Pang, W Jiang, J Zhang, ... ICLR 2024, 2023 | 120 | 2023 |
ShareGPT4Video: Improving video understanding and generation with better captions L Chen, X Wei, J Li, X Dong, P Zhang, Y Zang, Z Chen, H Duan, B Lin, ... NeurIPS 2024, 2024 | 52 | 2024 |
Video-Bench: A comprehensive benchmark and toolkit for evaluating video-based large language models M Ning, B Zhu, Y Xie, B Lin, J Cui, L Yuan, D Chen, L Yuan arXiv preprint arXiv:2311.16103, 2023 | 31 | 2023 |
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators S Yuan, J Huang, Y Shi, Y Xu, R Zhu, B Lin, X Cheng, L Yuan, J Luo arXiv preprint arXiv:2404.05014, 2024 | 24 | 2024 |
BASALT refines binning from metagenomic data and increases resolution of genome-resolved metagenomic analysis Z Qiu, L Yuan, CA Lian, B Lin, J Chen, R Mu, X Qiao, L Zhang, Z Xu, L Fan, ... Nature Communications 15 (1), 2179, 2024 | 6 | 2024 |
LLMBind: A unified modality-task integration framework B Zhu, P Jin, M Ning, B Lin, J Huang, Q Song, M Pan, L Yuan arXiv preprint arXiv:2402.14891, 2024 | 6 | 2024 |
OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model L Chen, Z Li, B Lin, B Zhu, Q Wang, S Yuan, X Zhou, X Cheng, L Yuan arXiv preprint arXiv:2409.01199, 2024 | 4 | 2024 |
Cycle3D: High-quality and consistent image-to-3D generation via generation-reconstruction cycle Z Tang, J Zhang, X Cheng, W Yu, C Feng, Y Pang, B Lin, L Yuan arXiv preprint arXiv:2407.19548, 2024 | 4 | 2024 |
UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark Z Zhou, Q Wang, B Lin, Y Su, R Chen, X Tao, A Zheng, L Yuan, P Wan, ... arXiv preprint arXiv:2404.09619, 2024 | 4 | 2024 |
Open-Sora Plan: Open-Source Large Video Generation Model B Lin, Y Ge, X Cheng, Z Li, B Zhu, S Wang, X He, Y Ye, S Yuan, L Chen, ... arXiv preprint arXiv:2412.00131, 2024 | | 2024 |
WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model Z Li*, B Lin*, Y Ye, L Chen, X Cheng, S Yuan, L Yuan arXiv preprint arXiv:2411.17459, 2024 | | 2024 |