Yuanhan Zhang
PhD Candidate, MMLab@NTU
Verified email at e.ntu.edu.sg
Title
Cited by
Year
MIMIC-IT: Multi-modal in-context instruction tuning
B Li, Y Zhang, L Chen, J Wang, F Pu, J Yang, C Li, Z Liu
arXiv preprint arXiv:2306.05425, 2023
Cited by: 313 | Year: 2023
CelebA-Spoof: Large-scale face anti-spoofing dataset with rich annotations
Y Zhang, ZF Yin, Y Li, G Yin, J Yan, J Shao, Z Liu
Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020
Cited by: 159 | Year: 2020
MMBench: Is your multi-modal model an all-around player?
Y Liu, H Duan, Y Zhang, B Li, S Zhang, W Zhao, Y Yuan, J Wang, C He, ...
arXiv preprint arXiv:2307.06281, 2023
Cited by: 158 | Year: 2023
Neural Prompt Search
Y Zhang, K Zhou, Z Liu
arXiv preprint arXiv:2206.04673, 2022
Cited by: 90 | Year: 2022
What makes good examples for visual in-context learning?
Y Zhang, K Zhou, Z Liu
Advances in Neural Information Processing Systems 36, 2024
Cited by: 44 | Year: 2024
Octopus: Embodied vision-language programmer from environmental feedback
J Yang, Y Dong, S Liu, B Li, Z Wang, C Jiang, H Tan, J Kang, Y Zhang, ...
arXiv preprint arXiv:2310.08588, 2023
Cited by: 24 | Year: 2023
Benchmarking omni-vision representation through the lens of visual realms
Y Zhang, Z Yin, J Shao, Z Liu
European Conference on Computer Vision, 594-611, 2022
Cited by: 19 | Year: 2022
CelebA-Spoof challenge 2020 on face anti-spoofing: Methods and results
Y Zhang, Z Yin, J Shao, Z Liu, S Yang, Y Xiong, W Xia, Y Xu, M Luo, J Liu, ...
arXiv preprint arXiv:2102.12642, 2021
Cited by: 14 | Year: 2021
Learning without forgetting for vision-language models
DW Zhou, Y Zhang, J Ning, HJ Ye, DC Zhan, Z Liu
arXiv preprint arXiv:2305.19270, 2023
Cited by: 13 | Year: 2023
Bamboo: Building mega-scale vision dataset continually with human-machine synergy
Y Zhang, Q Sun, Y Zhou, Z He, Z Yin, K Wang, L Sheng, Y Qiao, J Shao, ...
arXiv preprint arXiv:2203.07845, 2022
Cited by: 13 | Year: 2022
3D point cloud pre-training with knowledge distillation from 2D images
Y Yao, Y Zhang, Z Yin, J Luo, W Ouyang, X Huang
arXiv preprint arXiv:2212.08974, 2022
Cited by: 7 | Year: 2022
On-device domain generalization
K Zhou, Y Zhang, Y Zang, J Yang, CC Loy, Z Liu
arXiv preprint arXiv:2209.07521, 2022
Cited by: 5 | Year: 2022
FunQA: Towards surprising video comprehension
B Xie, S Zhang, Z Zhou, B Li, Y Zhang, J Hessel, J Yang, Z Liu
arXiv preprint arXiv:2306.14899, 2023
Cited by: 4 | Year: 2023
Robust face anti-spoofing with dual probabilistic modeling
Y Zhang, Y Wu, Z Yin, J Shao, Z Liu
arXiv preprint arXiv:2204.12685, 2022
Cited by: 3 | Year: 2022
Multimodal foundation models for zero-shot animal species recognition in camera trap images
Z Fabian, Z Miao, C Li, Y Zhang, Z Liu, A Hernández, A Montes-Rojas, ...
arXiv preprint arXiv:2311.01064, 2023
Cited by: 2 | Year: 2023
Otter: A multi-modal model with in-context instruction tuning
B Li, Y Zhang, L Chen, J Wang, J Yang, Z Liu
arXiv preprint arXiv:2305.03726, 2023
Year: 2023