Xiaohua Zhai
Research Scientist, Google DeepMind
Verified email at google.com
Title
Cited by
Year
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
A Dosovitskiy*, L Beyer*, A Kolesnikov*, D Weissenborn*, X Zhai*, ...
International Conference on Learning Representations (ICLR), 2021
Cited by: 45645 | Year: 2021
MLP-Mixer: An all-MLP Architecture for Vision
I Tolstikhin*, N Houlsby*, A Kolesnikov*, L Beyer*, X Zhai, T Unterthiner, ...
Advances in Neural Information Processing Systems (NeurIPS), 2021
Cited by: 2674 | Year: 2021
Big Transfer (BiT): General Visual Representation Learning
A Kolesnikov*, L Beyer*, X Zhai*, J Puigcerver, J Yung, S Gelly, ...
European Conference on Computer Vision (ECCV), 2020
Cited by: 1416 | Year: 2020
Scaling Vision Transformers
X Zhai*, A Kolesnikov*, N Houlsby, L Beyer*, *equal contribution
Computer Vision and Pattern Recognition (CVPR), 2022
Cited by: 1094 | Year: 2022
S4L: Self-Supervised Semi-Supervised Learning
X Zhai*, A Oliver*, A Kolesnikov*, L Beyer*, *equal contribution
International Conference on Computer Vision (ICCV), 1476-1485, 2019
Cited by: 1047 | Year: 2019
Revisiting Self-Supervised Visual Representation Learning
A Kolesnikov*, X Zhai*, L Beyer*, *equal contribution
Computer Vision and Pattern Recognition (CVPR), 2019
Cited by: 872 | Year: 2019
Underspecification Presents Challenges for Credibility in Modern Machine Learning
A D'Amour*, K Heller*, D Moldovan*, B Adlam, B Alipanahi, A Beutel, ...
Journal of Machine Learning Research (JMLR), 2020
Cited by: 756 | Year: 2020
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
A Steiner*, A Kolesnikov*, X Zhai*, R Wightman, J Uszkoreit, L Beyer*, ...
Transactions on Machine Learning Research (TMLR), 2022
Cited by: 623 | Year: 2022
PaLI: A jointly-scaled multilingual language-image model
X Chen, X Wang, S Changpinyo, AJ Piergiovanni, P Padlewski, D Salz, ...
International Conference on Learning Representations (ICLR), 2022
Cited by: 548 | Year: 2022
LiT: Zero-Shot Transfer with Locked-image Text Tuning
X Zhai*, X Wang*, B Mustafa*, A Steiner*, D Keysers, A Kolesnikov, ...
Computer Vision and Pattern Recognition (CVPR), 2022
Cited by: 503 | Year: 2022
A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark
X Zhai*, J Puigcerver*, A Kolesnikov*, P Ruyssen, C Riquelme, M Lucic, ...
arXiv preprint arXiv:1910.04867, 2019
Cited by: 402* | Year: 2019
Scaling vision transformers to 22 billion parameters
M Dehghani, J Djolonga, B Mustafa, P Padlewski, J Heek, J Gilmer, ...
International Conference on Machine Learning (ICML), 7480-7512, 2023
Cited by: 397 | Year: 2023
Simple Open-Vocabulary Object Detection with Vision Transformers
M Minderer, A Gritsenko, A Stone, M Neumann, D Weissenborn, ...
European Conference on Computer Vision (ECCV), 2022
Cited by: 389* | Year: 2022
Are we done with ImageNet?
L Beyer*, OJ Hénaff*, A Kolesnikov*, X Zhai*, A Oord*, *equal contribution
arXiv preprint arXiv:2006.07159, 2020
Cited by: 389 | Year: 2020
An image is worth 16x16 words: Transformers for image recognition at scale
A Dosovitskiy, L Beyer, A Kolesnikov, D Weissenborn, X Zhai, ...
arXiv preprint arXiv:2010.11929, 2020
Cited by: 374 | Year: 2020
Self-Supervised GANs via Auxiliary Rotation Loss
T Chen, X Zhai, M Ritter, M Lucic, N Houlsby
Computer Vision and Pattern Recognition (CVPR), 12154-12163, 2019
Cited by: 364 | Year: 2019
Revisiting the Calibration of Modern Neural Networks
M Minderer, J Djolonga, R Romijnders, F Hubis, X Zhai, N Houlsby, ...
Advances in Neural Information Processing Systems (NeurIPS), 2021
Cited by: 317 | Year: 2021
Learning cross-media joint representation with sparse and semisupervised regularization
X Zhai, Y Peng, J Xiao
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) 24 (6 …, 2013
Cited by: 301 | Year: 2013
Sigmoid loss for language image pre-training
X Zhai*, B Mustafa, A Kolesnikov, L Beyer*, *equal contribution
International Conference on Computer Vision (ICCV), 2023
Cited by: 290 | Year: 2023
Knowledge distillation: A good teacher is patient and consistent
L Beyer*, X Zhai*, A Royer*, L Markeeva*, R Anil, A Kolesnikov*, ...
Computer Vision and Pattern Recognition (CVPR), 2022
Cited by: 289 | Year: 2022