Ningxin Zheng

Cited by

	All	Since 2019
Citations	338	338
h-index	8	8
i10-index	8	8

Citations per year: 2020 (5), 2021 (12), 2022 (72), 2023 (166), 2024 (83)

Public access

8 articles available
3 articles not available

Based on funding mandates

Co-authors

Yuqing Yang, Microsoft (verified email at microsoft.com)
Quan Chen, Professor, Shanghai Jiao Tong University (verified email at sjtu.edu.cn)
Minyi Guo, IEEE Fellow, Chair Professor, Shanghai Jiao Tong University (verified email at cs.sjtu.edu.cn)
Weihao Cui, Shanghai Jiao Tong University (verified email at sjtu.edu.cn)

Ningxin Zheng

Microsoft Research Asia

Verified email at microsoft.com


Title	Cited by	Year
Nn-meter: Towards accurate latency prediction of deep-learning model inference on diverse edge devices. LL Zhang, S Han, J Wei, N Zheng, T Cao, Y Yang, Y Liu. Proceedings of the 19th Annual International Conference on Mobile Systems …, 2021	84	2021
Efficientvit: Memory efficient vision transformer with cascaded group attention. X Liu, H Peng, N Zheng, Y Yang, H Hu, Y Yuan. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023	79	2023
Enable simultaneous dnn services based on deterministic operator overlap and precise latency prediction. W Cui, H Zhao, Q Chen, N Zheng, J Leng, J Zhao, Z Song, T Ma, Y Yang, ... Proceedings of the International Conference for High Performance Computing …, 2021	40	2021
SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute. N Zheng, B Lin, Q Zhang, L Ma, Y Yang, F Yang, Y Wang, M Yang, L Zhou. 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2022	24	2022
Toward qos-awareness and improved utilization of spatial multitasking gpus. W Zhang, Q Chen, N Zheng, W Cui, K Fu, M Guo. IEEE Transactions on Computers 71 (4), 866-879, 2021	16	2021
URSA: Precise capacity planning and fair scheduling based on low-level statistics for public clouds. W Zhang, N Zheng, Q Chen, Y Yang, Z Song, T Ma, J Leng, M Guo. Proceedings of the 49th International Conference on Parallel Processing, 1-11, 2020	16	2020
Online video super-resolution with convolutional kernel bypass grafts. J Xiao, X Jiang, N Zheng, H Yang, Y Yang, Y Yang, D Li, KM Lam. IEEE Transactions on Multimedia, 2023	13	2023
Astraea: towards QoS-aware and resource-efficient multi-stage GPU services. W Zhang, Q Chen, K Fu, N Zheng, Z Huang, J Leng, M Guo. Proceedings of the 27th ACM International Conference on Architectural …, 2022	13	2022
Online video streaming super-resolution with adaptive look-up table fusion. G Yin, X Jiang, S Jiang, Z Han, N Zheng, H Yang, D Bai, H Tan, S Sun, ... arXiv e-prints, arXiv: 2303.00334, 2023	7	2023
Optimizing dynamic neural networks with brainstorm. W Cui, Z Han, L Ouyang, Y Wang, N Zheng, L Ma, Y Yang, F Yang, J Xue, ... 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023	7	2023
QoS-aware irregular collaborative inference for improving throughput of DNN services. K Fu, J Shi, Q Chen, N Zheng, W Zhang, D Zeng, M Guo. SC22: International Conference for High Performance Computing, Networking …, 2022	7	2022
Charm: Collaborative host and accelerator resource management for gpu datacenters. W Zhang, K Fu, N Zheng, Q Chen, C Li, W Zheng, M Guo. 2021 IEEE 39th International Conference on Computer Design (ICCD), 307-315, 2021	7	2021
Full-cycle energy consumption benchmark for low-carbon computer vision. B Li, X Jiang, D Bai, Y Zhang, N Zheng, X Dong, L Liu, Y Yang, D Li. arXiv preprint arXiv:2108.13465, 2021	7	2021
Pit: Optimization of dynamic sparse deep learning models via permutation invariant transformation. N Zheng, H Jiang, Q Zhang, Z Han, L Ma, Y Yang, F Yang, C Zhang, L Qiu, ... Proceedings of the 29th Symposium on Operating Systems Principles, 331-347, 2023	4	2023
nn-METER: Towards accurate latency prediction of DNN inference on diverse edge devices. LL Zhang, S Han, J Wei, N Zheng, T Cao, Y Liu. GetMobile: Mobile Computing and Communications 25 (4), 19-23, 2022	4	2022
Poster: Precise capacity planning for database public clouds. N Zheng, Q Chen, Y Yang, J Li, W Zheng, M Guo. 2019 28th International Conference on Parallel Architectures and Compilation …, 2019	4	2019
Towards QoS-aware and resource-efficient GPU microservices based on spatial multitasking GPUs in datacenters. W Zhang, Q Chen, K Fu, N Zheng, Z Huang, J Leng, C Li, W Zheng, ... arXiv preprint arXiv:2005.02088, 2020	3	2020
Efficient gpu kernels for n: m-sparse weights in deep learning. B Lin, N Zheng, L Wang, S Cao, L Ma, Q Zhang, Y Zhu, T Cao, J Xue, ... Proceedings of Machine Learning and Systems 5, 2023	2	2023
Online Streaming Video Super-Resolution with Convolutional Look-Up Table. G Yin, Z Qu, X Jiang, S Jiang, Z Han, N Zheng, X Liu, H Yang, Y Yang, ... arXiv preprint arXiv:2303.00334, 2023	1	2023
Online Streaming Video Super-Resolution With Convolutional Look-Up Table. G Yin, Z Qu, X Jiang, S Jiang, Z Han, N Zheng, H Yang, X Liu, Y Yang, ... IEEE Transactions on Image Processing 33, 2305-2317, 2024		2024
