Yinmin Zhong

Cited by

	All	Since 2019
Citations	181	181
h-index	6	6
i10-index	5	5

Citations per year: 2023: 29; 2024: 152

Public access

2 articles available, 0 articles not available (based on funding mandates)

Co-authors

Xin Jin - Peking University - Verified email at pku.edu.cn
Xuanzhe Liu - Boya Distinguished Professor of Computer Science, Peking University, ACM Distinguished Scientist - Verified email at pku.edu.cn
Hao Zhang - UC San Diego - Verified email at ucsd.edu
Bingyang Wu - Peking University - Verified email at pku.edu.cn
Zhuohan Li - UC Berkeley - Verified email at berkeley.edu
Lianmin Zheng - UC Berkeley - Verified email at berkeley.edu
Ying Sheng - PhD student of Stanford University - Verified email at stanford.edu
Ion Stoica - Professor of Computer Science, UC Berkeley - Verified email at cs.berkeley.edu
Vincent Liu - University of Pennsylvania - Verified email at seas.upenn.edu
Joseph E. Gonzalez - Professor of Computer Science, UC Berkeley - Verified email at berkeley.edu
Yanping Huang - Google Brain - Verified email at google.com
Zhifeng Chen - Google Inc. - Verified email at google.com
Shengyu Liu - Peking University - Verified email at stu.pku.edu.cn
Zili Zhang - Peking University - Verified email at pku.edu.cn
Fan Yang - Microsoft Research - Verified email at microsoft.com
Peng Cheng - Microsoft Research - Verified email at microsoft.com
Zhenhua Han - Microsoft Research Asia - Verified email at microsoft.com
Diandian Gu - Peking University - Verified email at pku.edu.cn
Yihao Zhao - Peking University - Verified email at pku.edu.cn
Haibin Lin - ByteDance - Verified email at bytedance.com

Yinmin Zhong

Peking University

Verified email at pku.edu.cn

Machine Learning System, Distributed System


Title	Cited by	Year
AlpaServe: Statistical multiplexing with model parallelism for deep learning serving - Z Li, L Zheng, Y Zhong, V Liu, Y Sheng, X Jin, Y Huang, Z Chen, H Zhang, ... - 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023	80	2023
Fast distributed inference serving for large language models - B Wu, Y Zhong, Z Zhang, G Huang, X Liu, X Jin - arXiv preprint arXiv:2305.05920, 2023	33	2023
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs - Z Jiang, H Lin, Y Zhong, Q Huang, Y Chen, Z Zhang, Y Peng, X Li, C Xie, ... - 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24), 2024	26	2024
DistServe: Disaggregating prefill and decoding for goodput-optimized large language model serving - Y Zhong, S Liu, J Chen, J Hu, Y Zhu, X Liu, X Jin, H Zhang - 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI 24), 2024	20	2024
ElasticFlow: An elastic serverless training platform for distributed deep learning - D Gu, Y Zhao, Y Zhong, Y Xiong, Z Han, P Cheng, F Yang, G Huang, X Jin, ... - Proceedings of the 28th ACM International Conference on Architectural …, 2023	14	2023
LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism - B Wu, S Liu, Y Zhong, P Sun, X Liu, X Jin - arXiv preprint arXiv:2404.09526, 2024	6	2024
DistMind: Efficient Resource Disaggregation for Deep Learning Workloads - X Jin, Z Bai, Z Zhang, Y Zhu, Y Zhong, X Liu - IEEE/ACM Transactions on Networking, 2024	2	2024
FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion - L Chang, W Bao, Q Hou, C Jiang, N Zheng, Y Zhong, X Zhang, Z Song, ... - arXiv preprint arXiv:2406.06858, 2024		2024
