Spatio-temporal graph dual-attention network for multi-agent prediction and tracking J Li, H Ma, Z Zhang, J Li, M Tomizuka IEEE Transactions on Intelligent Transportation Systems 23 (8), 10556-10569, 2021 | 72 | 2021 |
Social-WaGDAT: Interaction-aware trajectory prediction via Wasserstein graph double-attention network J Li, H Ma, Z Zhang, M Tomizuka arXiv preprint arXiv:2002.06241, 2020 | 70 | 2020 |
SpecInfer: Accelerating generative LLM serving with speculative inference and token tree verification X Miao, G Oliaro, Z Zhang, X Cheng, Z Wang, RYY Wong, Z Chen, ... arXiv preprint arXiv:2305.09781 1 (2), 4, 2023 | 50 | 2023 |
GradSign: Model Performance Inference with Theoretical Insights Z Zhang, Z Jia arXiv preprint arXiv:2110.08616, 2021 | 21 | 2021 |
Towards efficient generative large language model serving: A survey from algorithms to systems X Miao, G Oliaro, Z Zhang, X Cheng, H Jin, T Chen, Z Jia arXiv preprint arXiv:2312.15234, 2023 | 20 | 2023 |
Accelerating retrieval-augmented language model serving with speculation Z Zhang, A Zhu, L Yang, Y Xu, L Li, PM Phothilimthana, Z Jia arXiv preprint arXiv:2401.14021, 2024 | 2 | 2024 |