Ji Zhang
Alibaba Group
Verified email at alibaba-inc.com
Title
Cited by
Year
mPLUG-Owl: Modularization empowers large language models with multimodality
Q Ye, H Xu, G Xu, J Ye, M Yan, Y Zhou, J Wang, A Hu, P Shi, Y Shi, C Li, ...
arXiv preprint arXiv:2304.14178, 2023
Cited by: 895, Year: 2023
mPLUG-Owl2: Revolutionizing multi-modal large language model with modality collaboration
Q Ye, H Xu, J Ye, M Yan, A Hu, H Liu, Q Qian, J Zhang, F Huang
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 381, Year: 2024
X-CLIP: End-to-end multi-grained contrastive learning for video-text retrieval
Y Ma, G Xu, X Sun, M Yan, J Zhang, R Ji
Proceedings of the 30th ACM international conference on multimedia, 638-647, 2022
Cited by: 281, Year: 2022
mPLUG: Effective and efficient vision-language learning by cross-modal skip-connections
C Li, H Xu, J Tian, W Wang, M Yan, B Bi, J Ye, H Chen, G Xu, Z Cao, ...
arXiv preprint arXiv:2205.12005, 2022
Cited by: 139, Year: 2022
mPLUG-2: A modularized multi-modal foundation model across text, image and video
H Xu, Q Ye, M Yan, Y Shi, J Ye, Y Xu, C Li, B Bi, Q Qian, W Wang, G Xu, ...
International Conference on Machine Learning, 38728-38748, 2023
Cited by: 138, Year: 2023
UReader: Universal OCR-free visually-situated language understanding with multimodal large language model
J Ye, A Hu, H Xu, Q Ye, M Yan, G Xu, C Li, J Tian, Q Qian, J Zhang, Q Jin, ...
arXiv preprint arXiv:2310.05126, 2023
Cited by: 132, Year: 2023
Evaluation and analysis of hallucination in large vision-language models
J Wang, Y Zhou, G Xu, P Shi, C Zhao, H Xu, Q Ye, M Yan, J Zhang, J Zhu, ...
arXiv preprint arXiv:2308.15126, 2023
Cited by: 121, Year: 2023
mPLUG-DocOwl: Modularized multimodal large language model for document understanding
J Ye, A Hu, H Xu, Q Ye, M Yan, Y Dan, C Zhao, G Xu, C Li, J Tian, Q Qi, ...
arXiv preprint arXiv:2307.02499, 2023
Cited by: 120, Year: 2023
Semi-autoregressive neural machine translation
C Wang, J Zhang, H Chen
arXiv preprint arXiv:1808.08583, 2018
Cited by: 100, Year: 2018
mPLUG-DocOwl 1.5: Unified structure learning for OCR-free document understanding
A Hu, H Xu, J Ye, M Yan, L Zhang, B Zhang, C Li, J Zhang, Q Jin, F Huang, ...
arXiv preprint arXiv:2403.12895, 2024
Cited by: 97, Year: 2024
Mobile-Agent: Autonomous multi-modal mobile device agent with visual perception
J Wang, H Xu, J Ye, M Yan, W Shen, J Zhang, F Huang, J Sang
arXiv preprint arXiv:2401.16158, 2024
Cited by: 94, Year: 2024
Hallucination augmented contrastive learning for multimodal large language model
C Jiang, H Xu, M Dong, J Chen, W Ye, M Yan, Q Ye, J Zhang, F Huang, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 91, Year: 2024
An LLM-free multi-dimensional benchmark for MLLMs hallucination evaluation
J Wang, Y Wang, G Xu, J Zhang, Y Gu, H Jia, M Yan, J Zhang, J Sang
CoRR, 2023
Cited by: 91, Year: 2023
HiTeA: Hierarchical temporal-aware video-language pre-training
Q Ye, G Xu, M Yan, H Xu, Q Qian, J Zhang, F Huang
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 82, Year: 2023
Shifting more attention to visual backbone: Query-modulated refinement networks for end-to-end visual grounding
J Ye, J Tian, M Yan, X Yang, X Wang, J Zhang, L He, X Lin
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022
Cited by: 79, Year: 2022
AliMeKG: Domain knowledge graph construction and application in e-commerce
FL Li, H Chen, G Xu, T Qiu, F Ji, J Zhang, H Chen
Proceedings of the 29th ACM International Conference on Information …, 2020
Cited by: 77, Year: 2020
mPLUG-Owl3: Towards long image-sequence understanding in multi-modal large language models
J Ye, H Xu, H Liu, A Hu, M Yan, Q Qian, J Zhang, F Huang, J Zhou
The Thirteenth International Conference on Learning Representations, 2024
Cited by: 74, Year: 2024
CValues: Measuring the values of Chinese large language models from safety to responsibility
G Xu, J Liu, M Yan, H Xu, J Si, Z Zhou, P Yi, X Gao, J Sang, R Zhang, ...
arXiv preprint arXiv:2307.09705, 2023
Cited by: 67, Year: 2023
ROSITA: Enhancing vision-and-language semantic alignments via cross- and intra-modal knowledge integration
Y Cui, Z Yu, C Wang, Z Zhao, J Zhang, M Wang, J Yu
Proceedings of the 29th ACM International Conference on Multimedia, 797-806, 2021
Cited by: 64, Year: 2021
A deep cascade model for multi-document reading comprehension
M Yan, J Xia, C Wu, B Bi, Z Zhao, J Zhang, L Si, R Wang, W Wang, ...
Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 7354-7361, 2019
Cited by: 62, Year: 2019