FA2: Fast, accurate autoscaling for serving deep learning inference with SLA guarantees | K Razavi, M Luthra, B Koldehofe, M Mühlhäuser, L Wang | 2022 IEEE 28th Real-Time and Embedded Technology and Applications Symposium …, 2022 | 16* | 2022 |
Operator as a service: Stateful serverless complex event processing | M Luthra, S Hennig, K Razavi, L Wang, B Koldehofe | 2020 IEEE International Conference on Big Data (Big Data), 1964-1973, 2020 | 8 | 2020 |
Reconciling high accuracy, cost-efficiency, and low latency of inference serving systems | M Salmani, S Ghafouri, A Sanaee, K Razavi, M Mühlhäuser, J Doyle, ... | Proceedings of the 3rd Workshop on Machine Learning and Systems, 78-86, 2023 | 7 | 2023 |
Distributed DNN serving in the network data plane | K Razavi, G Karlos, V Nigade, M Mühlhäuser, L Wang | Proceedings of the 5th International Workshop on P4 in Europe, 67-70, 2022 | 6 | 2022 |
[Solution] IPA: Inference Pipeline Adaptation to achieve high accuracy and cost-efficiency | S Ghafouri, K Razavi, M Salmani, A Sanaee, TL Botran, L Wang, J Doyle, ... | Journal of Systems Research 4 (1), 2024 | 1* | 2024 |
Sponge: Inference Serving with Dynamic SLOs Using In-Place Vertical Scaling | K Razavi, S Ghafouri, M Mühlhäuser, P Jamshidi, L Wang | Proceedings of the 4th Workshop on Machine Learning and Systems, 184-191, 2024 | | 2024 |