PointCLIP: Point Cloud Understanding by CLIP R Zhang*, Z Guo*, W Zhang, K Li, X Miao, B Cui, Y Qiao, P Gao, H Li CVPR 2022, 8552-8562, 2022 | 428 | 2022 |
Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training R Zhang, Z Guo, P Gao, R Fang, B Zhao, D Wang, Y Qiao, H Li NeurIPS 2022, 2022 | 244 | 2022 |
Personalize Segment Anything Model with One Shot R Zhang, Z Jiang, Z Guo, S Yan, J Pan, H Dong, P Gao, H Li ICLR 2024, 2023 | 170 | 2023 |
MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection R Zhang, H Qiu, T Wang, Z Guo, Z Cui, Y Qiao, H Li, P Gao ICCV 2023, 9155-9166, 2023 | 136 | 2023 |
PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning X Zhu, R Zhang, B He, Z Guo, Z Zeng, Z Qin, S Zhang, P Gao ICCV 2023, 2023 | 113 | 2023 |
ImageBind-LLM: Multi-modality Instruction Tuning J Han, R Zhang, W Shao, P Gao, P Xu, H Xiao, K Zhang, C Liu, S Wen, ... arXiv preprint arXiv:2309.03905, 2023 | 99 | 2023 |
CALIP: Zero-Shot Enhancement of CLIP with Parameter-free Attention Z Guo, R Zhang, L Qiu, X Ma, X Miao, X He, B Cui AAAI 2023 Oral, 2022 | 99 | 2022 |
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following Z Guo, R Zhang, X Zhu, Y Tang, X Ma, J Han, K Chen, P Gao, X Li, H Li, ... arXiv preprint arXiv:2309.00615, 2023 | 93 | 2023 |
Parameter is Not All You Need: Starting from Non-parametric Networks for 3D Point Cloud Analysis R Zhang, L Wang, Z Guo, Y Wang, P Gao, H Li, J Shi CVPR 2023, 2023 | 92* | 2023 |
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? R Zhang, D Jiang, Y Zhang, H Lin, Z Guo, P Qiu, A Zhou, P Lu, KW Chang, ... ECCV 2024, 2024 | 82 | 2024 |
Can Language Understand Depth? R Zhang, Z Zeng, Z Guo, Y Li ACM MM 2022, 6868-6874, 2022 | 69 | 2022 |
ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance Z Guo, Y Tang, R Zhang, D Wang, Z Wang, B Zhao, X Li ICCV 2023, 15372-15383, 2023 | 50* | 2023 |
VT-CLIP: Enhancing Vision-Language Models with Visual-guided Texts L Qiu, R Zhang, Z Guo, Z Zeng, Y Li, G Zhang arXiv preprint arXiv:2112.02399, 2021 | 49 | 2021 |
LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding S Yang, J Liu, R Zhang, M Pan, Z Guo, X Li, Z Chen, P Gao, Y Guo, ... AAAI 2025, 2023 | 47 | 2023 |
Joint-MAE: 2D-3D Joint Masked Autoencoders for 3D Point Cloud Pre-training Z Guo, R Zhang, L Qiu, X Li, PA Heng IJCAI 2023, 2023 | 46 | 2023 |
DS-Point: A Dual-Scale 3D Framework for Point Cloud Understanding R Zhang*, Z Zeng*, Z Guo*, B Chen, G Zhang, X Liu SMC 2023, 5046-5051, 2023 | 31* | 2023 |
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation S Yan*, R Zhang*, Z Guo*, W Chen, W Zhang, H Li, Y Qiao, Z He, P Gao AAAI 2024, 2023 | 20 | 2023 |
MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine R Zhang, X Wei, D Jiang, Z Guo, S Li, Y Zhang, C Tong, J Liu, A Zhou, ... arXiv preprint arXiv:2407.08739, 2024 | 19* | 2024 |
Nearest Neighbors Meet Deep Neural Networks for Point Cloud Analysis R Zhang, L Wang, Z Guo, J Shi WACV 2023, 1246-1255, 2023 | 18 | 2023 |
No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation X Zhu, R Zhang, B He, Z Guo, J Liu, H Xiao, C Fu, H Dong, P Gao CVPR 2024, 2024 | 9* | 2024 |