Shaohuai Shi
Benchmarking state-of-the-art deep learning software tools
S Shi, Q Wang, P Xu, X Chu
2016 7th International Conference on Cloud Computing and Big Data (CCBD), 99-104, 2016
Highly scalable deep learning training system with mixed-precision: Training imagenet in four minutes
X Jia, S Song, W He, Y Wang, H Rong, F Zhou, L Xie, Z Guo, Y Yang, L Yu, ...
NeurIPS Workshop on Systems for ML and Open Source Software, 2018
Performance modeling and evaluation of distributed deep learning frameworks on gpus
S Shi, Q Wang, X Chu
2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th …, 2018
A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks
S Shi, Q Wang, K Zhao, Z Tang, Y Wang, X Huang, X Chu
IEEE ICDCS 2019, 2019
MG-WFBP: Efficient data communication for distributed synchronous SGD algorithms
S Shi, X Chu, B Li
IEEE INFOCOM 2019-IEEE International Conference on Computer Communications …, 2019
Communication-efficient distributed deep learning: A comprehensive survey
Z Tang, S Shi, X Chu, W Wang, B Li
arXiv preprint arXiv:2003.06307, 2020
Performance evaluation of deep learning tools in docker containers
P Xu, S Shi, X Chu
2017 3rd International Conference on Big Data Computing and Communications …, 2017
FADNet: A Fast and Accurate Network for Disparity Estimation
Q Wang, S Shi, S Zheng, K Zhao, X Chu
International Conference on Robotics and Automation (ICRA) 2020, 2020
A Convergence Analysis of Distributed SGD with Communication-Efficient Gradient Sparsification
S Shi, K Zhao, Q Wang, Z Tang, X Chu
IJCAI, 3411-3417, 2019
Speeding up convolutional neural networks by exploiting the sparsity of rectifier units
S Shi, X Chu
arXiv preprint arXiv:1704.07724, 2017
Understanding top-k sparsification in distributed deep learning
S Shi, X Chu, KC Cheung, S See
arXiv preprint arXiv:1911.08772, 2019
Benchmarking the performance and energy efficiency of AI accelerators for AI training
Y Wang, Q Wang, S Shi, X He, Z Tang, K Zhao, X Chu
2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet …, 2020
Communication-efficient distributed deep learning with merged gradient sparsification on gpus
S Shi, Q Wang, X Chu, B Li, Y Qin, R Liu, X Zhao
IEEE INFOCOM 2020-IEEE International Conference on Computer Communications, 2020
Benchmarking deep learning models and automated model design for COVID-19 detection with chest CT scans
X He, S Wang, S Shi, X Chu, J Tang, X Liu, C Yan, J Zhang, G Ding
MedRxiv, 2020.06.08.20125963, 2021
The GPU-based string matching system in advanced AC algorithm
J Peng, H Chen, S Shi
2010 10th IEEE International Conference on Computer and Information …, 2010
Communication-efficient decentralized learning with sparsification and adaptive peer selection
Z Tang, S Shi, X Chu
2020 IEEE 40th International Conference on Distributed Computing Systems …, 2020
A DAG Model of Synchronous Stochastic Gradient Descent in Distributed Deep Learning
S Shi, Q Wang, X Chu, B Li
2018 IEEE 24th International Conference on Parallel and Distributed Systems …, 2018
Towards Scalable Distributed Training of Deep Learning on Public Cloud Clusters
S Shi, X Zhou, S Song, X Wang, Z Zhu, X Huang, X Jiang, F Zhou, Z Guo, ...
Fourth Conference on Machine Learning and Systems (MLSys 2021), 2021
Automated Model Design and Benchmarking of 3D Deep Learning Models for COVID-19 Detection with Chest CT Scans
X He, S Wang, X Chu, S Shi, J Tang, X Liu, C Yan, J Zhang, G Ding
AAAI 2021, 2021
A Quantitative Survey of Communication Optimizations in Distributed Deep Learning
S Shi, Z Tang, X Chu, C Liu, W Wang, B Li
IEEE Network, 2020