Qizhen Weng
Title · Cited by · Year
MLaaS in the Wild: Workload Analysis and Scheduling in Large-Scale Heterogeneous GPU Clusters
Q Weng, W Xiao, Y Yu, W Wang, C Wang, J He, Y Li, L Zhang, W Lin, ...
19th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2022
Cited by 161 · 2022
Metis: Learning to schedule long-running applications in shared container clusters at scale
L Wang, Q Weng, W Wang, C Chen, B Li
SC20: International Conference for High Performance Computing, Networking …, 2020
Cited by 40 · 2020
Fast distributed deep learning via worker-adaptive batch sizing
C Chen, Q Weng, W Wang, B Li, B Li
Proceedings of the ACM Symposium on Cloud Computing, 521-521, 2018
Cited by 28 · 2018
Semi-dynamic load balancing: Efficient distributed learning in non-dedicated environments
C Chen, Q Weng, W Wang, B Li, B Li
Proceedings of the 11th ACM Symposium on Cloud Computing, 431-446, 2020
Cited by 21 · 2020
Opus: Fair and efficient cache sharing for in-memory data analytics
Y Yu, W Wang, J Zhang, Q Weng, KB Letaief
2018 IEEE 38th International Conference on Distributed Computing Systems …, 2018
Cited by 14 · 2018
Beware of Fragmentation: Scheduling GPU-Sharing Workloads with Fragmentation Gradient Descent
Q Weng, L Yang, Y Yu, W Wang, X Tang, G Yang, L Zhang
2023 USENIX Annual Technical Conference (USENIX ATC 23), 995-1008, 2023
Cited by 7 · 2023
Workload consolidation in Alibaba clusters: the good, the bad, and the ugly
Y Zhang, Y Yu, W Wang, Q Chen, J Wu, Z Zhang, J Zhong, T Ding, ...
Proceedings of the 13th Symposium on Cloud Computing, 210-225, 2022
Cited by 6 · 2022
Accelerating distributed learning in non-dedicated environments
C Chen, Q Weng, W Wang, B Li, B Li
IEEE Transactions on Cloud Computing 11 (1), 515-531, 2021
Cited by 6 · 2021
InternLM2 technical report
Z Cai, M Cao, H Chen, K Chen, K Chen, X Chen, X Chen, Z Chen, Z Chen, ...
arXiv preprint arXiv:2403.17297, 2024
Cited by 4 · 2024
Towards framework-independent, non-intrusive performance characterization for dataflow computation
H Tian, Q Weng, W Wang
Proceedings of the 10th ACM SIGOPS Asia-Pacific Workshop on Systems, 54-60, 2019
Cited by 3 · 2019
CaraServe: CPU-Assisted and Rank-Aware LoRA Serving for Generative LLM Inference
S Li, H Lu, T Wu, M Yu, Q Weng, X Chen, Y Shan, B Yuan, W Wang
arXiv preprint arXiv:2401.11240, 2024
Cited by 1 · 2024