Accurate, large minibatch SGD: training ImageNet in 1 hour P Goyal arXiv preprint arXiv:1706.02677, 2017 | 4217 | 2017 |
Machine learning at Facebook: Understanding inference at the edge CJ Wu, D Brooks, K Chen, D Chen, S Choudhury, M Dukhan, ... 2019 IEEE international symposium on high performance computer architecture …, 2019 | 572 | 2019 |
Deep learning inference in Facebook data centers: Characterization, performance optimizations and hardware implications J Park, M Naumov, P Basu, S Deng, A Kalaiah, D Khudia, J Law, P Malani, ... arXiv preprint arXiv:1811.09886, 2018 | 228 | 2018 |
Accurate, large minibatch SGD: Training ImageNet in 1 hour. arXiv 2017 P Goyal, P Dollár, R Girshick, P Noordhuis, L Wesolowski, A Kyrola, ... arXiv preprint arXiv:1706.02677, 2019 | 152 | 2019 |
Software-hardware co-design for fast and scalable training of deep learning recommendation models D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ... Proceedings of the 49th Annual International Symposium on Computer …, 2022 | 115 | 2022 |
M. Khorashadi, P D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ... Bhattacharya, P. Lapukhov, M. Naumov, L. Qiao, M. Smelyanskiy, B. Jia, and V …, 2021 | 48 | 2021 |
Allocating information for content selection among computing resources of an online system U Pashkevich, AJ Tulloch, D Dzhulgakov, LS Backstrom US Patent 10,083,465, 2018 | 43 | 2018 |
High performance ultra-low-precision convolutions on mobile devices A Tulloch, Y Jia arXiv preprint arXiv:1712.02427, 2017 | 38 | 2017 |
High-performance, distributed training of large-scale deep learning recommendation models D Mudigere, Y Hao, J Huang, A Tulloch, S Sridharan, X Liu, M Ozdal, ... arXiv preprint arXiv:2104.05158, 2021 | 35 | 2021 |
Mixed-precision embedding using a cache JA Yang, J Huang, J Park, PTP Tang, A Tulloch arXiv preprint arXiv:2010.11305, 2020 | 31 | 2020 |
On periodic functions as regularizers for quantization of neural networks M Naumov, U Diril, J Park, B Ray, J Jablonski, A Tulloch arXiv preprint arXiv:1811.09862, 2018 | 29 | 2018 |
Accurate low-rank approximations via a few iterations of alternating least squares A Szlam, A Tulloch, M Tygert SIAM Journal on Matrix Analysis and Applications 38 (2), 425-433, 2017 | 14 | 2017 |
Jie (Amy) Yang, Leon Gao, Dmytro Ivchenko, Aarti Basant, Yuxi Hu, Jiyan Yang, Ehsan K D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ... | 12 | 2022 |
Systems and methods for quantizing neural networks via periodic regularization functions M Naumov, AU Diril, JS Park, B Ray, J Jablonski, AJ Tulloch US Patent 11,468,313, 2022 | 11 | 2022 |
Accurate, large minibatch SGD: training ImageNet in 1 hour. CoRR abs/1706.02677 P Goyal, P Dollár, RB Girshick, P Noordhuis, L Wesolowski, A Kyrola, ... arXiv preprint arXiv:1706.02677, 2017 | 11 | 2017 |
Dynamically allocating computing resources to identify advertisements for presentation AJ Tulloch, SM Bowers, JIQ Candela US Patent 10,438,235, 2019 | 9 | 2019 |
Hybrid composition with IdleBlock: more efficient networks for image recognition B Xu, A Tulloch, Y Chen, X Yang, L Qiao arXiv preprint arXiv:1911.08609, 2019 | 5 | 2019 |
MTrainS: Improving DLRM training efficiency using heterogeneous memories HT Kassa, P Johnson, J Akers, M Ghosh, A Tulloch, D Mudigere, J Park, ... arXiv preprint arXiv:2305.01515, 2023 | 2 | 2023 |
MTrainS: Improving DLRM training efficiency using heterogeneous memories H Tadese Kassa, P Johnson, J Akers, M Ghosh, A Tulloch, D Mudigere, ... arXiv e-prints, arXiv:2305.01515, 2023 | | 2023 |
Dynamically allocating computing resources to identify advertisements for presentation AJ Tulloch, SM Bowers, JIQ Candela US Patent 11,386,451, 2022 | | 2022 |