DeepSpeed-Chat: Easy, fast and affordable RLHF training of ChatGPT-like models at all scales Z Yao, RY Aminabadi, O Ruwase, S Rajbhandari, X Wu, AA Awan, ... arXiv preprint arXiv:2308.01320, 2023 | 53 | 2023 |
Flash-LLM: Enabling cost-effective and highly-efficient large generative model inference with unstructured sparsity H Xia, Z Zheng, Y Li, D Zhuang, Z Zhou, X Qiu, Y Li, W Lin, SL Song arXiv preprint arXiv:2309.10285, 2023 | 39 | 2023 |
Renaissance: A survey into AI text-to-image generation in the era of large model F Bie, Y Yang, Z Zhou, A Ghanem, M Zhang, Z Yao, X Wu, C Holmes, ... arXiv preprint arXiv:2309.00810, 2023 | 19 | 2023 |
FP6-LLM: Efficiently serving large language models through FP6-centric algorithm-system co-design H Xia, Z Zheng, X Wu, S Chen, Z Yao, S Youn, A Bakhtiari, M Wyatt, ... arXiv preprint arXiv:2401.14112, 2024 | 11 | 2024 |
DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies SL Song, B Kruft, M Zhang, C Li, S Chen, C Zhang, M Tanaka, X Wu, ... arXiv preprint arXiv:2310.04610, 2023 | 6 | 2023 |
Binary neural network for automated visual surface defect detection W Liu, J Zhang, Z Su, Z Zhou, L Liu Sensors 21 (20), 6868, 2021 | 6 | 2021 |
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models Y Yang, X Li, Z Zhou, SL Song, J Wu, L Nie, B Ghanem arXiv preprint arXiv:2406.05223, 2024 | 3 | 2024 |
Quant-LLM: Accelerating the Serving of Large Language Models via FP6-Centric Algorithm-System Co-Design on Modern GPUs H Xia, Z Zheng, X Wu, S Chen, Z Yao, S Youn, A Bakhtiari, M Wyatt, ... 2024 USENIX Annual Technical Conference (USENIX ATC 24), 699-713, 2024 | 3 | 2024 |
JSidentify: A hybrid framework for detecting plagiarism among JavaScript code in online mini games Q Xia, Z Zhou, Z Li, B Xu, W Zou, Z Chen, H Ma, G Liang, H Lu, S Guo, ... Proceedings of the ACM/IEEE 42nd International Conference on Software …, 2020 | 3 | 2020 |
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity H Xia, Z Zheng, Y Li, D Zhuang, Z Zhou, X Qiu, Y Li, W Lin, SL Song | | |