Zhan Tong
Verified email at kuleuven.be
Title
Cited by
Year
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Z Tong, Y Song, J Wang, L Wang
36th Conference on Neural Information Processing Systems (NeurIPS), 2022
Cited by: 1037; Year: 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
S Chen, C Ge, Z Tong, J Wang, Y Song, J Wang, P Luo
36th Conference on Neural Information Processing Systems (NeurIPS), 2022
Cited by: 550; Year: 2022
TDN: Temporal Difference Networks for Efficient Action Recognition
L Wang, Z Tong, B Ji, G Wu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 1895-1904, 2021
Cited by: 482; Year: 2021
VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
L Wang, B Huang, Z Zhao, Z Tong, Y He, Y Wang, Y Wang, Y Qiao
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Cited by: 332; Year: 2023
Not All Patches are What You Need: Expediting Vision Transformers via Token Reorganizations
Y Liang, C Ge, Z Tong, Y Song, J Wang, P Xie
International Conference on Learning Representations (ICLR), 2022
Cited by: 312; Year: 2022
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
MMA Contributors
https://github.com/open-mmlab/mmaction2, 2020
Cited by: 202; Year: 2020
MGSampler: An Explainable Sampling Strategy for Video Action Recognition
Y Zhi, Z Tong, L Wang, G Wu
IEEE/CVF International Conference on Computer Vision (ICCV), 1513-1522, 2021
Cited by: 78; Year: 2021
Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning
C Ge, J Wang, Z Tong, S Chen, Y Song, P Luo
International Conference on Learning Representations (ICLR), 2023
Cited by: 30; Year: 2023
Advancing Vision Transformers with Group-Mix Attention
C Ge, X Ding, Z Tong, L Yuan, J Wang, Y Song, P Luo
arXiv preprint arXiv:2311.15157, 2023
Cited by: 16; Year: 2023
Efficient Video Action Detection with Token Dropout and Context Refinement
L Chen, Z Tong, Y Song, G Wu, L Wang
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
Cited by: 16; Year: 2023
TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale
Z Zeng, Z Tong, X Liu, B Chen, ST Xia, Y Ge
arXiv preprint arXiv:2305.14173, 2023
Cited by: 8; Year: 2023
SparseFormer: Sparse Visual Recognition via Limited Latent Tokens
Z Gao, Z Tong, L Wang, MZ Shou
International Conference on Learning Representations (ICLR), 2024
Cited by: 7; Year: 2024
CycleACR: Cycle Modeling of Actor-Context Relations for Video Action Detection
L Chen, Z Tong, Y Song, G Wu, L Wang
arXiv preprint arXiv:2303.16118, 2023
Cited by: 5; Year: 2023
Contextual AD Narration with Interleaved Multimodal Sequence
H Wang, Z Tong, K Zheng, Y Shen, L Wang
arXiv preprint arXiv:2403.12922, 2024
Cited by: 3; Year: 2024
TagAlign: Improving Vision-Language Alignment with Multi-Tag Classification
Q Liu, K Zheng, W Wu, Z Tong, Y Liu, W Chen, Z Wang, Y Shen
arXiv preprint arXiv:2312.14149, 2023
Cited by: 3; Year: 2023
Bootstrapping SparseFormers from Vision Foundation Models
Z Gao, Z Tong, KQ Lin, J Chen, MZ Shou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Year: 2024
SpeedAug: A Simple Co-Augmentation Method for Unsupervised Audio-Visual Pre-training
J Wang, J Jiao, Y Song, S James, Z Tong, C Ge, P Abbeel, YH Liu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Sight …, 2023
Year: 2023
Articles 1–17