Mikhail Smelyanskiy

Citations

	Total	Since 2019
Citations	11846	8061
h-index	41	33
i10-index	92	68

Citations per year

Year	Citations
2009	51
2010	90
2011	178
2012	221
2013	278
2014	434
2015	474
2016	473
2017	545
2018	722
2019	990
2020	1316
2021	1528
2022	1764
2023	1913
2024	542

Open-access publications

Available: 13 articles
Not available: 1 article

Based on funders' open-access mandates

Mikhail Smelyanskiy

Facebook

Verified email at intel.com

Deep learning HPC SW/HW co-design


Title	Authors	Venue	Cited by	Year
On large-batch training for deep learning: Generalization gap and sharp minima	NS Keskar, D Mudigere, J Nocedal, M Smelyanskiy, PTP Tang	arXiv preprint arXiv:1609.04836, 2016	3300	2016
Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU	VW Lee, C Kim, J Chhugani, M Deisher, D Kim, AD Nguyen, N Satish, ...	Proceedings of the 37th annual international symposium on Computer …, 2010	1197	2010
Applied machine learning at Facebook: A datacenter infrastructure perspective	K Hazelwood, S Bird, D Brooks, S Chintala, U Diril, D Dzhulgakov, ...	2018 IEEE International Symposium on High Performance Computer Architecture …, 2018	695	2018
Deep learning recommendation model for personalization and recommendation systems	M Naumov, D Mudigere, HJM Shi, J Huang, N Sundaraman, J Park, ...	arXiv preprint arXiv:1906.00091, 2019	621	2019
Efficient sparse matrix-vector multiplication on x86-based many-core processors	X Liu, M Smelyanskiy, E Chow, P Dubey	Proceedings of the 27th international ACM conference on International …, 2013	329	2013
Glow: Graph lowering compiler techniques for neural networks	N Rotem, J Fix, S Abdulrasool, G Catron, S Deng, R Dzhabarov, N Gibson, ...	arXiv preprint arXiv:1805.00907, 2018	310	2018
A study of BFLOAT16 for deep learning training	D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...	arXiv preprint arXiv:1905.12322, 2019	296	2019
The architectural implications of Facebook's DNN-based personalized recommendation	U Gupta, CJ Wu, X Wang, M Naumov, B Reagen, D Brooks, B Cottel, ...	2020 IEEE International Symposium on High Performance Computer Architecture …, 2020	272	2020
Design and implementation of the Linpack benchmark for single and multi-node systems based on Intel® Xeon Phi coprocessor	A Heinecke, K Vaidyanathan, M Smelyanskiy, A Kobotov, R Dubtsov, ...	2013 IEEE 27th International Symposium on Parallel and Distributed …, 2013	215	2013
Exploring SIMD for molecular dynamics, using Intel® Xeon® processors and Intel® Xeon Phi coprocessors	SJ Pennycook, CJ Hughes, M Smelyanskiy, SA Jarvis	2013 IEEE 27th International Symposium on Parallel and Distributed …, 2013	211	2013
Deep learning inference in Facebook data centers: Characterization, performance optimizations and hardware implications	J Park, M Naumov, P Basu, S Deng, A Kalaiah, D Khudia, J Law, P Malani, ...	arXiv preprint arXiv:1811.09886, 2018	199	2018
RecNMP: Accelerating personalized recommendation with near-memory processing	L Ke, U Gupta, BY Cho, D Brooks, V Chandra, U Diril, A Firoozshahian, ...	2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture …, 2020	188	2020
Petascale high order dynamic rupture earthquake simulations on heterogeneous supercomputers	A Heinecke, A Breuer, S Rettenberger, M Bader, AA Gabriel, C Pelties, ...	SC'14: Proceedings of the International Conference for High Performance …, 2014	167	2014
Anatomy of high-performance many-threaded matrix multiplication	TM Smith, R Van De Geijn, M Smelyanskiy, JR Hammond, FG Van Zee	2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014	163	2014
qHiPSTER: The quantum high performance software testing environment	M Smelyanskiy, NPD Sawaya, A Aspuru-Guzik	arXiv preprint arXiv:1601.07195, 2016	152	2016
Convergence of recognition, mining, and synthesis workloads and its implications	YK Chen, J Chhugani, P Dubey, CJ Hughes, D Kim, S Kumar, VW Lee, ...	Proceedings of the IEEE 96 (5), 790-807, 2008	149	2008
Practical optimization for hybrid quantum-classical algorithms	GG Guerreschi, M Smelyanskiy	arXiv preprint arXiv:1701.01450, 2017	148	2017
Can traditional programming bridge the ninja performance gap for parallel computing applications?	N Satish, C Kim, J Chhugani, H Saito, R Krishnaiyer, M Smelyanskiy, ...	ACM SIGARCH Computer Architecture News 40 (3), 440-451, 2012	146	2012
The BLIS framework: Experiments in portability	FG Van Zee, TM Smith, B Marker, TM Low, RAVD Geijn, FD Igual, ...	ACM Transactions on Mathematical Software (TOMS) 42 (2), 1-19, 2016	126	2016
Mapping high-fidelity volume rendering for medical imaging to CPU, GPU and many-core architectures	M Smelyanskiy, D Holmes, J Chhugani, A Larson, DM Carmean, ...	IEEE Transactions on Visualization and Computer Graphics 15 (6), 1563-1570, 2009	112	2009
