Follow
Ying Sheng
Title
Cited by
Cited by
Year
Vicuna: An open-source chatbot impressing gpt-4 with 90%* chatgpt quality
WL Chiang, Z Li, Z Lin, Y Sheng, Z Wu, H Zhang, L Zheng, S Zhuang, ...
See https://vicuna. lmsys. org (accessed 14 April 2023) 2 (3), 6, 2023
1536*2023
Judging llm-as-a-judge with mt-bench and chatbot arena
L Zheng, WL Chiang, Y Sheng, S Zhuang, Z Wu, Y Zhuang, Z Lin, Z Li, ...
Advances in Neural Information Processing Systems 36, 2024
1443*2024
Efficient memory management for large language model serving with pagedattention
W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng, CH Yu, J Gonzalez, H Zhang, ...
Proceedings of the 29th Symposium on Operating Systems Principles, 611-626, 2023
4972023
cvc5: A versatile and industrial-strength SMT solver
H Barbosa, C Barrett, M Brain, G Kremer, H Lachnitt, M Mann, ...
International Conference on Tools and Algorithms for the Construction and …, 2022
3822022
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Y Sheng, L Zheng, B Yuan, Z Li, M Ryabinin, B Chen, P Liang, C Re, ...
International Conference on Machine Learning, 2023
1902023
H2o: Heavy-hitter oracle for efficient generative inference of large language models
Z Zhang, Y Sheng, T Zhou, T Chen, L Zheng, R Cai, Z Song, Y Tian, C Ré, ...
Advances in Neural Information Processing Systems 36, 2024
962024
How Long Can Context Length of Open-Source LLMs truly Promise?
D Li, R Shao, A Xie, Y Sheng, L Zheng, J Gonzalez, I Stoica, X Ma, ...
NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023
80*2023
{AlpaServe}: Statistical multiplexing with model parallelism for deep learning serving
Z Li, L Zheng, Y Zhong, V Liu, Y Sheng, X Jin, Y Huang, Z Chen, H Zhang, ...
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023
802023
Chatbot arena: An open platform for evaluating llms by human preference
WL Chiang, L Zheng, Y Sheng, AN Angelopoulos, T Li, D Li, H Zhang, ...
arXiv preprint arXiv:2403.04132, 2024
722024
Lmsys-chat-1m: A large-scale real-world llm conversation dataset
L Zheng, WL Chiang, Y Sheng, T Li, S Zhuang, Z Wu, Y Zhuang, Z Li, ...
arXiv preprint arXiv:2309.11998, 2023
492023
SLoRA: Scalable Serving of Thousands of LoRA Adapters
Y Sheng, S Cao, D Li, C Hooper, N Lee, S Yang, C Chou, B Zhu, L Zheng, ...
Proceedings of Machine Learning and Systems 6, 296-311, 2024
39*2024
Subspace embedding and linear regression with orlicz norm
A Andoni, C Lin, Y Sheng, P Zhong, R Zhong
International Conference on Machine Learning, 224-233, 2018
382018
Distribution-free junta testing
X Chen, Z Liu, RA Servedio, Y Sheng, J Xie
STOC 2018, 2018
30*2018
Efficiently programming large language models using sglang
L Zheng, L Yin, Z Xie, J Huang, C Sun, CH Yu, S Cao, C Kozyrakis, ...
arXiv preprint arXiv:2312.07104, 2023
292023
Towards Optimal Caching and Model Selection for Large Model Inference
B Zhu, Y Sheng, L Zheng, C Barrett, M Jordan, J Jiao
Advances in Neural Information Processing Systems 36, 2024
12*2024
Clover: Closed-loop verifiable code generation
C Sun, Y Sheng, O Padon, C Barrett
AI Verification: First International Symposium, SAIV 2024, Montreal, QC …, 2024
122024
Fairness in serving large language models
Y Sheng, S Cao, D Li, B Zhu, Z Li, D Zhuo, JE Gonzalez, I Stoica
18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024
112024
Politeness for the theory of algebraic datatypes
Y Sheng, Y Zohar, C Ringeissen, J Lange, P Fontaine, C Barrett
International Joint Conference on Automated Reasoning, 238-255, 2020
11*2020
On the approximation of Nash equilibria in sparse win-lose games
Z Liu, Y Sheng
Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018
92018
Reasoning about vectors using an SMT theory of sequences
Y Sheng, A Nötzli, A Reynolds, Y Zohar, D Dill, W Grieskamp, J Park, ...
International Joint Conference on Automated Reasoning, 125-143, 2022
8*2022
The system can't perform the operation now. Try again later.
Articles 1–20