Xiaohua Zhai

Cited by

	All	Since 2019
Citations	54346	53984
h-index	40	38
i10-index	50	48

Citations per year
Year	2019	2020	2021	2022	2023	2024
Citations	303	727	3746	11447	21025	16640

Public access

0 articles

1 article

available

not available

Based on funding mandates

Co-authors

Lucas Beyer	Google DeepMind, Google Brain, RWTH Aachen	Verified email at google.com
Neil Houlsby	Google	Verified email at google.com
Alexander Kolesnikov	Research Scientist, Google DeepMind	Verified email at google.com
Sylvain Gelly	Google Brain Zurich	Verified email at m4x.org
Alexey Dosovitskiy	Inceptive	Verified email at inceptive.team
Matthias Minderer	Senior Research Scientist, Google DeepMind	Verified email at google.com
Mostafa Dehghani	Research Scientist, Google DeepMind	Verified email at google.com
Mario Lučić	Research Scientist, Google DeepMind	Verified email at google.com
Andreas Peter Steiner	Software engineer, Google Research	Verified email at google.com
Daniel Keysers	Google	Verified email at google.com
Jessica Yung	Google Brain	Verified email at google.com
Yuxin Peng	Peking University	Verified email at pku.edu.cn
Michael Tschannen	Google DeepMind	Verified email at google.com
Joan Puigcerver	Google Brain	Verified email at google.com
Xiao Wang	Google DeepMind	Verified email at google.com
André Susano Pinto	Google DeepMind	Verified email at google.com
Ilya Tolstikhin	Google DeepMind	Verified email at google.com
Basil Mustafa	Google DeepMind	Verified email at google.com
Xiao Jianguo	WangXuan Institute of Computer Technology, Peking University	Verified email at pku.edu.cn
Maxim Neumann	Google	Verified email at google.com

Xiaohua Zhai

Research Scientist, Google DeepMind

Verified email at google.com - Homepage

Representation Learning, Vision and Language, Computer Vision


Title	Cited by	Year
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale A Dosovitskiy, L Beyer, A Kolesnikov, D Weissenborn, X Zhai*, ... International Conference on Learning Representations (ICLR), 2021	40119	2021
MLP-Mixer: An all-MLP Architecture for Vision I Tolstikhin, N Houlsby, A Kolesnikov, L Beyer, X Zhai, T Unterthiner, ... Advances in Neural Information Processing Systems (NeurIPS), 2021	2415	2021
Big Transfer (BiT): General Visual Representation Learning A Kolesnikov, L Beyer, X Zhai*, J Puigcerver, J Yung, S Gelly, ... European Conference on Computer Vision (ECCV), 2020	1333	2020
S4L: Self-supervised semi-supervised learning X Zhai, A Oliver, A Kolesnikov, L Beyer, *equal contribution International Conference on Computer Vision (ICCV), 1476-1485, 2019	1000	2019
Scaling Vision Transformers X Zhai, A Kolesnikov, N Houlsby, L Beyer, equal contribution Computer Vision and Pattern Recognition (CVPR), 2022	988	2022
Revisiting Self-Supervised Visual Representation Learning A Kolesnikov, X Zhai, L Beyer, equal contribution Computer Vision and Pattern Recognition (CVPR), 2019	842	2019
Underspecification Presents Challenges for Credibility in Modern Machine Learning A D'Amour, K Heller, D Moldovan*, B Adlam, B Alipanahi, A Beutel, ... Journal of Machine Learning Research (JMLR), 2020	717	2020
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers A Steiner, A Kolesnikov, X Zhai, R Wightman, J Uszkoreit, L Beyer, ... Transactions on Machine Learning Research (TMLR), 2022	562	2022
PaLI: A jointly-scaled multilingual language-image model X Chen, X Wang, S Changpinyo, AJ Piergiovanni, P Padlewski, D Salz, ... International Conference on Learning Representations (ICLR), 2022	487	2022
LiT: Zero-Shot Transfer with Locked-image Text Tuning X Zhai, X Wang, B Mustafa, A Steiner, D Keysers, A Kolesnikov, ... Computer Vision and Pattern Recognition (CVPR), 2022	454	2022
A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark X Zhai, J Puigcerver, A Kolesnikov*, P Ruyssen, C Riquelme, M Lucic, ... arXiv preprint arXiv:1910.04867, 2019	366*	2019
Self-Supervised GANs via Auxiliary Rotation Loss T Chen, X Zhai, M Ritter, M Lucic, N Houlsby Computer Vision and Pattern Recognition (CVPR), 12154-12163, 2019	357	2019
Are we done with ImageNet? L Beyer, OJ Hénaff, A Kolesnikov, X Zhai, A Oord, equal contribution arXiv preprint arXiv:2006.07159, 2020	353	2020
Scaling vision transformers to 22 billion parameters M Dehghani, J Djolonga, B Mustafa, P Padlewski, J Heek, J Gilmer, ... International Conference on Machine Learning (ICML), 7480-7512, 2023	341	2023
Simple Open-Vocabulary Object Detection with Vision Transformers M Minderer, A Gritsenko, A Stone, M Neumann, D Weissenborn, ... European Conference on Computer Vision (ECCV), 2022	332*	2022
An image is worth 16x16 words: Transformers for image recognition at scale A Dosovitskiy, L Beyer, A Kolesnikov, D Weissenborn, X Zhai, ... arXiv preprint arXiv:2010.11929, 2020	323	2020
Learning cross-media joint representation with sparse and semisupervised regularization X Zhai, Y Peng, J Xiao IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) 24 (6 …, 2013	296	2013
Revisiting the Calibration of Modern Neural Networks M Minderer, J Djolonga, R Romijnders, F Hubis, X Zhai, N Houlsby, ... Advances in Neural Information Processing Systems (NeurIPS), 2021	285	2021
Knowledge distillation: A good teacher is patient and consistent L Beyer, X Zhai, A Royer, L Markeeva, R Anil, A Kolesnikov*, ... Computer Vision and Pattern Recognition (CVPR), 2022	262	2022
A simple single-scale vision transformer for object localization and instance segmentation W Chen, X Du, F Yang, L Beyer, X Zhai, TY Lin, H Chen, J Li, X Song, ... European Conference on Computer Vision (ECCV), 2022	203	2022

Articles 1–20