GiraffeDet: A Heavy-Neck Paradigm for Object Detection Y Jiang, Z Tan, J Wang, X Sun, M Lin, H Li International Conference on Learning Representations (ICLR 2022), 2022 | 146* | 2022 |
Entroformer: A Transformer-based Entropy Model for Learned Image Compression Y Qian, M Lin, X Sun, Z Tan, R Jin International Conference on Learning Representations (ICLR 2022), 2022 | 127 | 2022 |
Learning Accurate Entropy Model with Global Reference for Image Compression Y Qian, Z Tan, X Sun, M Lin, D Li, Z Sun, H Li, R Jin ICLR 2021, 2020 | 72 | 2020 |
Learning to rank proposals for object detection Z Tan, X Nie, Q Qian, N Li, H Li Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019 | 61 | 2019 |
Image co-saliency detection by propagating superpixel affinities Z Tan, L Wan, W Feng, CM Pun 2013 IEEE International Conference on Acoustics, Speech and Signal …, 2013 | 45 | 2013 |
MAE-DET: Revisiting maximum entropy principle in zero-shot NAS for efficient object detection Z Sun, M Lin, X Sun, Z Tan, H Li, R Jin International Conference on Machine Learning (ICML 2022), 20810-20826, 2021 | 43* | 2021 |
Interpolation variable rate image compression Z Sun, Z Tan, X Sun, F Zhang, Y Qian, D Li, H Li Proceedings of the 29th ACM International Conference on Multimedia, 5574-5582, 2021 | 16 | 2021 |
Spatiotemporal entropy model is all you need for learned video compression Z Sun, Z Tan, X Sun, F Zhang, D Li, Y Qian, H Li arXiv preprint arXiv:2104.06083, 2021 | 16 | 2021 |
OVO: Open-vocabulary occupancy Z Tan, Z Dong, C Zhang, W Zhang, H Ji, H Li arXiv preprint arXiv:2305.16133, 2023 | 12 | 2023 |
VidGen-1M: A large-scale dataset for text-to-video generation Z Tan, X Yang, L Qin, H Li arXiv preprint arXiv:2408.02629, 2024 | 8 | 2024 |
EvalAlign: Evaluating Text-to-Image Models through Precision Alignment of Multimodal Large Models with Supervised Fine-Tuning to Human Annotations Z Tan, X Yang, L Qin, M Yang, C Zhang, H Li arXiv preprint arXiv:2406.16562, 2024 | 6* | 2024 |
An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation Z Tan, M Yang, L Qin, H Yang, Y Qian, Q Zhou, C Zhang, H Li European Conference on Computer Vision, 472-489, 2024 | 2 | 2024 |
Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation Junyan Wang, Zhenhong Sun, Zhiyu Tan, Xuanbai Chen, Weihua Chen, Hao Li ... IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024), 2024 | 2* | 2024 |
JMPNet: Joint motion prediction for learning-based video compression D Li, Z Sun, Z Tan, X Sun, F Zhang, Y Qian, H Li ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 2 | 2022 |
LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment Y Wang, Z Tan, J Wang, X Yang, C Jin, H Li arXiv preprint arXiv:2412.04814, 2024 | | 2024 |
ReToMe-VA: Recursive Token Merging for Video Diffusion-based Unrestricted Adversarial Attack Z Gao, K Chen, Z Wei, T Mou, J Chen, Z Tan, H Li, YG Jiang Proceedings of the 32nd ACM International Conference on Multimedia, 4485-4494, 2024 | | 2024 |
EGGen: Image Generation with Multi-entity Prior Learning through Entity Guidance Z Sun, J Wang, Z Tan, D Dong, H Ma, H Li, D Gong ACM Multimedia 2024, 2024 | | 2024 |