Ningxin Zheng
Microsoft Research Asia
Verified email at microsoft.com
Title
Cited by
Year
nn-Meter: Towards accurate latency prediction of deep-learning model inference on diverse edge devices
LL Zhang, S Han, J Wei, N Zheng, T Cao, Y Yang, Y Liu
Proceedings of the 19th Annual International Conference on Mobile Systems …, 2021
Cited by: 84; Year: 2021
EfficientViT: Memory efficient vision transformer with cascaded group attention
X Liu, H Peng, N Zheng, Y Yang, H Hu, Y Yuan
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 79; Year: 2023
Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction
W Cui, H Zhao, Q Chen, N Zheng, J Leng, J Zhao, Z Song, T Ma, Y Yang, ...
Proceedings of the International Conference for High Performance Computing …, 2021
Cited by: 40; Year: 2021
SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute
N Zheng, B Lin, Q Zhang, L Ma, Y Yang, F Yang, Y Wang, M Yang, L Zhou
16th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2022
Cited by: 24; Year: 2022
Toward QoS-awareness and improved utilization of spatial multitasking GPUs
W Zhang, Q Chen, N Zheng, W Cui, K Fu, M Guo
IEEE Transactions on Computers 71 (4), 866-879, 2021
Cited by: 16; Year: 2021
URSA: Precise capacity planning and fair scheduling based on low-level statistics for public clouds
W Zhang, N Zheng, Q Chen, Y Yang, Z Song, T Ma, J Leng, M Guo
Proceedings of the 49th International Conference on Parallel Processing, 1-11, 2020
Cited by: 16; Year: 2020
Online video super-resolution with convolutional kernel bypass grafts
J Xiao, X Jiang, N Zheng, H Yang, Y Yang, Y Yang, D Li, KM Lam
IEEE Transactions on Multimedia, 2023
Cited by: 13; Year: 2023
Astraea: towards QoS-aware and resource-efficient multi-stage GPU services
W Zhang, Q Chen, K Fu, N Zheng, Z Huang, J Leng, M Guo
Proceedings of the 27th ACM International Conference on Architectural …, 2022
Cited by: 13; Year: 2022
Online video streaming super-resolution with adaptive look-up table fusion
G Yin, X Jiang, S Jiang, Z Han, N Zheng, H Yang, D Bai, H Tan, S Sun, ...
arXiv e-prints, arXiv:2303.00334, 2023
Cited by: 7; Year: 2023
Optimizing dynamic neural networks with Brainstorm
W Cui, Z Han, L Ouyang, Y Wang, N Zheng, L Ma, Y Yang, F Yang, J Xue, ...
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023
Cited by: 7; Year: 2023
QoS-aware irregular collaborative inference for improving throughput of DNN services
K Fu, J Shi, Q Chen, N Zheng, W Zhang, D Zeng, M Guo
SC22: International Conference for High Performance Computing, Networking …, 2022
Cited by: 7; Year: 2022
CHARM: Collaborative host and accelerator resource management for GPU datacenters
W Zhang, K Fu, N Zheng, Q Chen, C Li, W Zheng, M Guo
2021 IEEE 39th International Conference on Computer Design (ICCD), 307-315, 2021
Cited by: 7; Year: 2021
Full-cycle energy consumption benchmark for low-carbon computer vision
B Li, X Jiang, D Bai, Y Zhang, N Zheng, X Dong, L Liu, Y Yang, D Li
arXiv preprint arXiv:2108.13465, 2021
Cited by: 7; Year: 2021
PIT: Optimization of dynamic sparse deep learning models via permutation invariant transformation
N Zheng, H Jiang, Q Zhang, Z Han, L Ma, Y Yang, F Yang, C Zhang, L Qiu, ...
Proceedings of the 29th Symposium on Operating Systems Principles, 331-347, 2023
Cited by: 4; Year: 2023
nn-METER: Towards accurate latency prediction of DNN inference on diverse edge devices
LL Zhang, S Han, J Wei, N Zheng, T Cao, Y Liu
GetMobile: Mobile Computing and Communications 25 (4), 19-23, 2022
Cited by: 4; Year: 2022
Poster: Precise capacity planning for database public clouds
N Zheng, Q Chen, Y Yang, J Li, W Zheng, M Guo
2019 28th International Conference on Parallel Architectures and Compilation …, 2019
Cited by: 4; Year: 2019
Towards QoS-aware and resource-efficient GPU microservices based on spatial multitasking GPUs in datacenters
W Zhang, Q Chen, K Fu, N Zheng, Z Huang, J Leng, C Li, W Zheng, ...
arXiv preprint arXiv:2005.02088, 2020
Cited by: 3; Year: 2020
Efficient GPU kernels for N:M-sparse weights in deep learning
B Lin, N Zheng, L Wang, S Cao, L Ma, Q Zhang, Y Zhu, T Cao, J Xue, ...
Proceedings of Machine Learning and Systems 5, 2023
Cited by: 2; Year: 2023
Online Streaming Video Super-Resolution with Convolutional Look-Up Table
G Yin, Z Qu, X Jiang, S Jiang, Z Han, N Zheng, X Liu, H Yang, Y Yang, ...
arXiv preprint arXiv:2303.00334, 2023
Cited by: 1; Year: 2023
Online Streaming Video Super-Resolution With Convolutional Look-Up Table
G Yin, Z Qu, X Jiang, S Jiang, Z Han, N Zheng, H Yang, X Liu, Y Yang, ...
IEEE Transactions on Image Processing 33, 2305-2317, 2024
Year: 2024