Shaohuai Shi
Title · Cited by · Year
Benchmarking state-of-the-art deep learning software tools
S Shi, Q Wang, P Xu, X Chu
2016 7th International Conference on Cloud Computing and Big Data (CCBD), 99-104, 2016
292 · 2016
Highly scalable deep learning training system with mixed-precision: Training imagenet in four minutes
X Jia, S Song, W He, Y Wang, H Rong, F Zhou, L Xie, Z Guo, Y Yang, L Yu, ...
NeurIPS Workshop on Systems for ML and Open Source Software, 2018
226 · 2018
Performance modeling and evaluation of distributed deep learning frameworks on GPUs
S Shi, Q Wang, X Chu
2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th …, 2018
54 · 2018
A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks
S Shi, Q Wang, K Zhao, Z Tang, Y Wang, X Huang, X Chu
IEEE ICDCS 2019, 2019
39 · 2019
MG-WFBP: Efficient data communication for distributed synchronous SGD algorithms
S Shi, X Chu, B Li
IEEE INFOCOM 2019-IEEE International Conference on Computer Communications …, 2019
36 · 2019
Speeding up convolutional neural networks by exploiting the sparsity of rectifier units
S Shi, X Chu
arXiv preprint arXiv:1704.07724, 2017
30 · 2017
The GPU-based string matching system in advanced AC algorithm
J Peng, H Chen, S Shi
2010 10th IEEE International Conference on Computer and Information …, 2010
27 · 2010
Performance evaluation of deep learning tools in Docker containers
P Xu, S Shi, X Chu
2017 3rd International Conference on Big Data Computing and Communications …, 2017
25 · 2017
A DAG Model of Synchronous Stochastic Gradient Descent in Distributed Deep Learning
S Shi, Q Wang, X Chu, B Li
2018 IEEE 24th International Conference on Parallel and Distributed Systems …, 2018
18* · 2018
A Convergence Analysis of Distributed SGD with Communication-Efficient Gradient Sparsification
S Shi, K Zhao, Q Wang, Z Tang, X Chu
IJCAI, 3411-3417, 2019
17 · 2019
Understanding top-k sparsification in distributed deep learning
S Shi, X Chu, KC Cheung, S See
arXiv preprint arXiv:1911.08772, 2019
15 · 2019
Communication-efficient distributed deep learning: A comprehensive survey
Z Tang, S Shi, X Chu, W Wang, B Li
arXiv preprint arXiv:2003.06307, 2020
14 · 2020
Benchmarking the performance and energy efficiency of AI accelerators for AI training
Y Wang, Q Wang, S Shi, X He, Z Tang, K Zhao, X Chu
2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet …, 2020
13* · 2020
Communication-efficient distributed deep learning with merged gradient sparsification on GPUs
S Shi, Q Wang, X Chu, B Li, Y Qin, R Liu, X Zhao
IEEE INFOCOM 2020-IEEE International Conference on Computer Communications, 2020
13 · 2020
Benchmarking deep learning models and automated model design for COVID-19 detection with chest CT scans
X He, S Wang, S Shi, X Chu, J Tang, X Liu, C Yan, J Zhang, G Ding
medRxiv, 2020
8 · 2020
FADNet: A Fast and Accurate Network for Disparity Estimation
Q Wang, S Shi, S Zheng, K Zhao, X Chu
International Conference on Robotics and Automation (ICRA) 2020, 2020
7 · 2020
Mixed precision method for GPU-based FFT
S Qi, X Wang, S Shi
2011 14th IEEE International Conference on Computational Science and …, 2011
7 · 2011
Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees
S Shi, Z Tang, Q Wang, K Zhao, X Chu
ECAI 2020-European Conference on Artificial Intelligence (ECAI), 2020
5 · 2020
Communication-efficient decentralized learning with sparsification and adaptive peer selection
Z Tang, S Shi, X Chu
arXiv preprint arXiv:2002.09692, 2020
5 · 2020
Towards Scalable Distributed Training of Deep Learning on Public Cloud Clusters
S Shi, X Zhou, S Song, X Wang, Z Zhu, X Huang, X Jiang, F Zhou, Z Guo, ...
Fourth Conference on Machine Learning and Systems (MLSys 2021), 2021
3 · 2021