Learning deep representation for imbalanced classification C Huang, Y Li, CC Loy, X Tang Proceedings of the IEEE conference on computer vision and pattern …, 2016 | 1250 | 2016 |
Deep imbalanced learning for face recognition and attribute prediction C Huang, Y Li, CC Loy, X Tang IEEE transactions on pattern analysis and machine intelligence 42 (11), 2781 …, 2019 | 391 | 2019 |
OpenMMLab pose estimation toolbox and benchmark MMPose Contributors | 320 | 2020 |
Dense intrinsic appearance flow for human pose transfer Y Li, C Huang, CC Loy Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019 | 205 | 2019 |
Human attribute recognition by deep hierarchical contexts Y Li, C Huang, CC Loy, X Tang Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The …, 2016 | 198 | 2016 |
InternLM-XComposer2: Mastering free-form text-image composition and comprehension in vision-language large model X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, X Wei, S Zhang, ... arXiv preprint arXiv:2401.16420, 2024 | 116 | 2024 |
RTMPose: Real-time multi-person pose estimation based on MMPose T Jiang, P Lu, L Zhang, N Ma, R Han, C Lyu, Y Li, K Chen arXiv preprint arXiv:2303.07399, 2023 | 102 | 2023 |
InternLM2 technical report Z Cai, M Cao, H Chen, K Chen, K Chen, X Chen, X Chen, Z Chen, Z Chen, ... arXiv preprint arXiv:2403.17297, 2024 | 93 | 2024 |
InternLM-XComposer2-4KHD: A pioneering large vision-language model handling resolutions from 336 pixels to 4K HD X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, S Zhang, H Duan, ... arXiv preprint arXiv:2404.06512, 2024 | 52 | 2024 |
Learning to disambiguate by asking discriminative questions Y Li, C Huang, X Tang, CC Loy Proceedings of the IEEE International Conference on Computer Vision, 3419-3428, 2017 | 29 | 2017 |
OMG-Seg: Is one model good enough for all segmentation? X Li, H Yuan, W Li, H Ding, S Wu, W Zhang, Y Li, K Chen, CC Loy Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 26 | 2024 |
Open-vocabulary SAM: Segment and recognize twenty-thousand classes interactively H Yuan, X Li, C Zhou, Y Li, K Chen, CC Loy arXiv preprint arXiv:2401.02955, 2024 | 18 | 2024 |
An open and comprehensive pipeline for unified object grounding and detection X Zhao, Y Chen, S Xu, X Li, X Wang, Y Li, H Huang arXiv preprint arXiv:2401.02361, 2024 | 14 | 2024 |
InternLM-XComposer-2.5: A versatile large vision language model supporting long-contextual input and output P Zhang, X Dong, Y Zang, Y Cao, R Qian, L Chen, Q Guo, H Duan, ... arXiv preprint arXiv:2407.03320, 2024 | 13 | 2024 |
Towards language-driven video inpainting via multimodal large language models J Wu, X Li, C Si, S Zhou, J Yang, J Zhang, Y Li, K Chen, Y Tong, Z Liu, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 12 | 2024 |
RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation P Lu, T Jiang, Y Li, X Li, K Chen, W Yang Proceedings of the IEEE conference on computer vision and pattern recognition, 2024 | 12 | 2024 |
DST-Det: Simple dynamic self-training for open-vocabulary object detection S Xu, X Li, S Wu, W Zhang, Y Li, G Cheng, Y Tong, K Chen, CC Loy arXiv preprint arXiv:2310.01393, 2023 | 9 | 2023 |
MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding X Fang, K Mao, H Duan, X Zhao, Y Li, D Lin, K Chen arXiv preprint arXiv:2406.14515, 2024 | 7 | 2024 |
MotionBooth: Motion-Aware Customized Text-to-Video Generation J Wu, X Li, Y Zeng, J Zhang, Q Zhou, Y Li, Y Tong, K Chen arXiv preprint arXiv:2406.17758, 2024 | 3 | 2024 |
RAP-SAM: Towards Real-Time All-Purpose Segment Anything S Xu, H Yuan, Q Shi, L Qi, J Wang, Y Yang, Y Li, K Chen, Y Tong, ... arXiv preprint arXiv:2401.10228, 2024 | 3 | 2024 |