Megatron-LM: Training multi-billion parameter language models using model parallelism M Shoeybi, M Patwary, R Puri, P LeGresley, J Casper, B Catanzaro arXiv preprint arXiv:1909.08053, 2019 | 487 | 2019 |
Evaluating large language models trained on code M Chen, J Tworek, H Jun, Q Yuan, HPO Pinto, J Kaplan, H Edwards, ... arXiv preprint arXiv:2107.03374, 2021 | 173 | 2021 |
Training question answering models from synthetic data R Puri, R Spring, M Patwary, M Shoeybi, B Catanzaro arXiv preprint arXiv:2002.09599, 2020 | 60 | 2020 |
MEGATRON-CNTRL: Controllable story generation with external knowledge using large-scale language models P Xu, M Patwary, M Shoeybi, R Puri, P Fung, A Anandkumar, B Catanzaro arXiv preprint arXiv:2010.00840, 2020 | 49 | 2020 |
Practical text classification with large pre-trained language models N Kant, R Puri, N Yakovenko, B Catanzaro arXiv preprint arXiv:1812.01207, 2018 | 39 | 2018 |
Zero-shot text classification with generative language models R Puri, B Catanzaro arXiv preprint arXiv:1912.10165, 2019 | 34 | 2019 |
BioMegatron: Larger biomedical domain language model HC Shin, Y Zhang, E Bakhturina, R Puri, M Patwary, M Shoeybi, R Mani arXiv preprint arXiv:2010.06060, 2020 | 30 | 2020 |
Large scale language modeling: Converging on 40GB of text in four hours R Puri, R Kirby, N Yakovenko, B Catanzaro 2018 30th International Symposium on Computer Architecture and High …, 2018 | 20 | 2018 |
Training multi-billion parameter language models using model parallelism M Shoeybi, M Patwary, R Puri, P LeGresley, J Casper, BMLM Catanzaro arXiv preprint cs.CL/1909.08053, 2019 | 19 | 2019 |
Text and code embeddings by contrastive pre-training A Neelakantan, T Xu, R Puri, A Radford, JM Han, J Tworek, Q Yuan, ... arXiv preprint arXiv:2201.10005, 2022 | 16 | 2022 |
Large scale multi-actor generative dialog modeling A Boyd, R Puri, M Shoeybi, M Patwary, B Catanzaro arXiv preprint arXiv:2005.06114, 2020 | 16 | 2020 |
Evaluating large language models trained on code M Chen, J Tworek, H Jun, Q Yuan, HP de Oliveira Pinto, J Kaplan, ... arXiv preprint arXiv:2107.03374, 2021 | 9 | 2021 |
Training multi-billion parameter language models using model parallelism M Shoeybi, M Patwary, R Puri, P LeGresley, J Casper, BMLM Catanzaro arXiv preprint arXiv:1909.08053, 2019 | 9 | 2019 |
Transferability of adversarial attacks in model-agnostic meta-learning R Edmunds, N Golmant, V Ramasesh, P Kuznetsov, P Patil, R Puri Deep Learning and Security Workshop (DLSW) in Singapore, 2017 | 7 | 2017 |
Few shot learning for point cloud data using model agnostic meta learning R Puri, A Zakhor, R Puri 2020 IEEE International Conference on Image Processing (ICIP), 1906-1910, 2020 | 3 | 2020 |
Local knowledge powered conversational agents S Santhanam, W Ping, R Puri, M Shoeybi, M Patwary, B Catanzaro arXiv preprint arXiv:2010.10150, 2020 | 3 | 2020 |
Model Agnostic Contrastive Explanations for Structured Data A Dhurandhar, T Pedapati, A Balakrishnan, P Chen, K Shanmugam, ... CoRR abs/1906.00117, 2019 | 3 | 2019 |
Frame rate upscaling with deep neural networks T Xiao, R Puri, G Kesineni Term Paper for CS294-129 Deep Neural Networks, Fall, 2016 | 2 | 2016 |
Adversarial Machine Learning P Kuznetsov, R Edmunds, T Xiao, H Iqbal, R Puri, N Golmant, S Shih Artificial Intelligence Safety and Security, 235-248, 2018 | 1 | 2018 |
Kontradyktoryjne uczenie maszynowe [Adversarial machine learning] P Kuznetsov, R Edmunds, T Xiao, H Iqbal, R Puri, N Golmant, S Shih Napędy i Sterowanie 23, 2021 | | 2021 |