Dynamic demonstrations controller for in-context learning F Zhao, T Pang, Z Wu, Z Ma, S Huang, X Dai | 2 | 2023 |
Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model K Cheng, W Song, Z Ma, W Zhu, Z Zhu, J Zhang In Proceedings of the 31st ACM International Conference on Multimedia, 2023 | 2 | 2023 |
Cobra Effect in Reference-Free Image Captioning Metrics Z Ma, C Wang, Y Ouyang, F Zhao, J Zhang, S Huang, J Chen arXiv preprint arXiv:2402.11572, 2024 | 1 | 2024 |
Bounding and Filling: A Fast and Flexible Framework for Image Captioning Z Ma, C Wang, B Huang, Z Zhu, J Zhang NLPCC 2023 (best paper), 469-481, 2023 | 1 | 2023 |
Probing Cross-modal Semantics Alignment Capability from the Textual Perspective Z Ma, S Zong, M Pan, J Zhang, S Huang, X Dai, J Chen Findings of the Association for Computational Linguistics: EMNLP 2022, 2022 | 1 | 2022 |
ADS-Cap: A Framework for Accurate and Diverse Stylized Captioning with Unpaired Stylistic Corpora K Cheng, Z Ma, S Zong, J Zhang, X Dai, J Chen CCF International Conference on Natural Language Processing and Chinese …, 2022 | 1 | 2022 |
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Z Chen, W Wang, H Tian, S Ye, Z Gao, E Cui, W Tong, K Hu, J Luo, Z Ma, ... arXiv preprint arXiv:2404.16821, 2024 | | 2024 |
MixRED: A Mix-lingual Relation Extraction Dataset L Kong, Y Chu, Z Ma, J Zhang, L He, J Chen arXiv preprint arXiv:2403.15696, 2024 | | 2024 |
Probing Commonsense Reasoning Capability of Text-to-Image Generative Models via Non-visual Description M Pan, J Li, M Yu, Z Ma, K Cheng, J Zhang, J Chen arXiv preprint arXiv:2312.07294, 2023 | | 2023 |
Food-500 Cap: A Fine-Grained Food Caption Benchmark for Evaluating Vision-Language Models Z Ma, M Pan, W Wu, K Cheng, J Zhang, S Huang, J Chen In Proceedings of the 31st ACM International Conference on Multimedia, 2023 | | 2023 |