Marat Dukhan
Title
Cited by
Year
Machine learning at Facebook: Understanding inference at the edge
CJ Wu, D Brooks, K Chen, D Chen, S Choudhury, M Dukhan, ...
2019 IEEE international symposium on high performance computer architecture …, 2019
Cited by: 510, Year: 2019
ChamNet: Towards efficient network design through platform-aware model adaptation
X Dai, P Zhang, B Wu, H Yin, F Sun, Y Wang, M Dukhan, Y Hu, Y Wu, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019
Cited by: 302, Year: 2019
Fast sparse convnets
E Elsen, M Dukhan, T Gale, K Simonyan
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020
Cited by: 151, Year: 2020
Algorithmic time, energy, and power on candidate HPC compute building blocks
J Choi, M Dukhan, X Liu, R Vuduc
2014 IEEE 28th international parallel and distributed processing symposium …, 2014
Cited by: 101, Year: 2014
The indirect convolution algorithm
M Dukhan
arXiv preprint arXiv:1907.02129, 2019
Cited by: 48, Year: 2019
QNNPACK: Open source library for optimized mobile deep learning
M Dukhan, Y Wu, H Lu
Cited by: 38, Year: 2018
NNPACK: Acceleration package for neural networks on multi-core CPUs
M Dukhan
Cited by: 29*, Year: 2016
Branch-avoiding graph algorithms
O Green, M Dukhan, R Vuduc
Proceedings of the 27th ACM symposium on Parallelism in Algorithms and …, 2015
Cited by: 29, Year: 2015
Scaling up Hartree–Fock calculations on Tianhe-2
E Chow, X Liu, S Misra, M Dukhan, M Smelyanskiy, JR Hammond, Y Du, ...
The International Journal of High Performance Computing Applications 30 (1 …, 2016
Cited by: 27, Year: 2016
Methods for high-throughput computation of elementary functions
M Dukhan, R Vuduc
Parallel Processing and Applied Mathematics: 10th International Conference …, 2014
Cited by: 12, Year: 2014
Optimizing the computation of n-point correlations on large-scale astronomical data
WB March, K Czechowski, M Dukhan, T Benson, D Lee, AJ Connolly, ...
SC'12: Proceedings of the International Conference on High Performance …, 2012
Cited by: 12, Year: 2012
Two-pass softmax algorithm
M Dukhan, A Ablavatski
2020 IEEE International Parallel and Distributed Processing Symposium …, 2020
Cited by: 10, Year: 2020
PeachPy: A Python framework for developing high-performance assembly kernels
M Dukhan
Python for High Performance and Scientific Computing, 2013
Cited by: 8, Year: 2013
Indirect deconvolution algorithm
M Dukhan
2020 IEEE International Parallel and Distributed Processing Symposium …, 2020
Cited by: 3, Year: 2020
Wanted: Floating-point add round-off error instruction
M Dukhan, R Vuduc, J Riedy
arXiv preprint arXiv:1603.00491, 2016
Cited by: 3, Year: 2016
PeachPy meets Opcodes: direct machine code generation from Python
M Dukhan
Proceedings of the 5th Workshop on Python for High-Performance and …, 2015
Cited by: 2, Year: 2015
Fast Sparse Neural Networks
EK Elsen, TJ Gale, M Dukhan
US Patent US-20220335272-A1, 2022
Cited by: 1, Year: 2022
What a fast FPU means for algorithms: A story of vector elementary functions
M Dukhan
2013 IEEE Hot Chips 25 Symposium (HCS), 1-1, 2013
Year: 2013
Furious.js: a Model for Offloading Compute-Intensive JavaScript Applications
M Dukhan, R Taylor, R Guthrie, R Vuduc