VOS: Learning What You Don't Know by Virtual Outlier Synthesis X Du, Z Wang, M Cai, Y Li Proceedings of the International Conference on Learning Representations 1 (4), 8, 2022 | 209 | 2022 |
Masked Discrimination for Self-Supervised Learning on Point Clouds H Liu, M Cai, YJ Lee Proceedings of the European Conference on Computer Vision (ECCV), 2022 | 104 | 2022 |
Frequency domain image translation: More photo-realistic, better identity-preserving M Cai, H Zhang, H Huang, Q Geng, Y Li, G Huang IEEE International Conference on Computer Vision (ICCV), 13930-13940, 2021 | 63 | 2021 |
Investigating the catastrophic forgetting in multimodal large language models Y Zhai, S Tong, X Li, M Cai, Q Qu, YJ Lee, Y Ma Conference on Parsimony and Learning (CPAL) 2023, 2023 | 41 | 2023 |
Out-of-distribution Detection via Frequency-regularized Generative Models M Cai, Y Li WACV (Spotlight), 2023, 2022 | 20 | 2022 |
A Game-Theoretic Strategy-Aware Interaction Algorithm with Validation on Real Traffic Data L Sun*, M Cai*, W Zhan, M Tomizuka The 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems …, 2020 | 14 | 2020 |
Making large multimodal models understand arbitrary visual prompts M Cai, H Liu, SK Mustikovela, GP Meyer, Y Chai, D Park, YJ Lee CVPR 2024, 2024 | 9 | 2024 |
A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance Z Huang, A Zhou, Z Lin, M Cai, H Wang, YJ Lee ICCV 2023, 2023 | 4 | 2023 |
Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding M Cai, Z Huang, Y Li, H Wang, YJ Lee arXiv preprint arXiv:2306.06094, 2023 | 4 | 2023 |
Investigating the Catastrophic Forgetting in Multimodal Large Language Model Fine-Tuning Y Zhai, S Tong, X Li, M Cai, Q Qu, YJ Lee, Y Ma Conference on Parsimony and Learning, 202-227, 2024 | 3 | 2024 |
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models Y Shang, M Cai, B Xu, YJ Lee, Y Yan arXiv preprint arXiv:2403.15388, 2024 | | 2024 |
CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples J Zhang*, M Cai*, T Xie, YJ Lee arXiv preprint arXiv:2402.13254, 2024 | | 2024 |
Delving into LLMs’ visual understanding ability using SVG to bridge image and text M Cai, Z Huang, Y Li, H Wang, YJ Lee | | 2023 |
Causal inference can prevent computer vision from falling into black-box deep learning M Cai | | |