Bloom: A 176b-parameter open-access multilingual language model T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ... | 1150 | 2023 |
Starcoder: may the source be with you! R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ... arXiv preprint arXiv:2305.06161, 2023 | 287 | 2023 |
Natural language processing with transformers L Tunstall, L Von Werra, T Wolf O'Reilly Media, Inc., 2022 | 250 | 2022 |
Zephyr: Direct distillation of lm alignment L Tunstall, E Beeching, N Lambert, N Rajani, K Rasul, Y Belkada, ... arXiv preprint arXiv:2310.16944, 2023 | 139 | 2023 |
The stack: 3 tb of permissively licensed source code D Kocetkov, R Li, LB Allal, J Li, C Mou, CM Ferrandis, Y Jernite, M Mitchell, ... arXiv preprint arXiv:2211.15533, 2022 | 126 | 2022 |
SantaCoder: don't reach for the stars! LB Allal, R Li, D Kocetkov, C Mou, C Akiki, CM Ferrandis, N Muennighoff, ... arXiv preprint arXiv:2301.03988, 2023 | 115 | 2023 |
The bigscience roots corpus: A 1.6 tb composite multilingual dataset H Laurençon, L Saulnier, T Wang, C Akiki, A Villanova del Moral, ... Advances in Neural Information Processing Systems 35, 31809-31826, 2022 | 102 | 2022 |
Illustrating reinforcement learning from human feedback (rlhf) N Lambert, L Castricato, L von Werra, A Havrilla Hugging Face Blog 9, 2022 | 70 | 2022 |
Octopack: Instruction tuning code large language models N Muennighoff, Q Liu, A Zebaze, Q Zheng, B Hui, TY Zhuo, S Singh, ... arXiv preprint arXiv:2308.07124, 2023 | 55 | 2023 |
Trl: Transformer reinforcement learning L von Werra, Y Belkada, L Tunstall, E Beeching, T Thrush, N Lambert, ... GitHub. Available online at: https://github.com/lvwerra/trl, 2020 | 55 | 2020 |
Design and performance of two orthogonal extraction time-of-flight secondary ion mass spectrometers for focused ion beam instruments D Alberts, L von Werra, F Oestlund, U Rohner, M Hohl, J Michler, ... Instrumentation Science & Technology 42 (4), 432-445, 2014 | 38 | 2014 |
Evaluate & evaluation on the hub: Better best practices for data and model measurements L Von Werra, L Tunstall, A Thakur, S Luccioni, T Thrush, A Piktus, F Marty, ... Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022 | 13 | 2022 |
Loubna Ben allal H Laurençon, L Saulnier, T Wang, C Akiki, AV del Moral, T Le Scao, ... | 10 | 2022 |
StarCoder 2 and The Stack v2: The Next Generation A Lozhkov, R Li, LB Allal, F Cassano, J Lamy-Poirier, N Tazi, A Tang, ... arXiv preprint arXiv:2402.19173, 2024 | 9 | 2024 |
Unsupervised anomaly detection for seasonal time series L von Werra, L Tunstall, S Hofer 2019 6th Swiss Conference on Data Science (SDS), 136-137, 2019 | 8 | 2019 |
Radiometric characterization of a water-based conical blackbody calibration target for millimeter-wave remote sensing K Jacob, A Schröder, L von Werra, F Reinhard, P Raisin, A Murk IEEE Journal of Selected Topics in Applied Earth Observations and Remote …, 2019 | 8 | 2019 |
Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models TY Zhuo, A Zebaze, N Suppattarachai, L von Werra, H de Vries, Q Liu, ... arXiv preprint arXiv:2401.00788, 2024 | 7 | 2024 |
Generative adversarial networks in precision oncology L von Werra, M Schöngens, ED Gamsiz Uzun, C Eickhoff Proceedings of the 2019 ACM SIGIR International Conference on Theory of …, 2019 | 4 | 2019 |
Natural Language Processing mit Transformern: Sprachanwendungen mit Hugging Face erstellen L Tunstall, L von Werra, T Wolf O'Reilly, 2023 | 2 | 2023 |
A water-based conical blackbody concept for millimeter-wave remote sensing A Schröder, A Murk, L von Werra, F Reinhard, P Raisin, K Jacob 2016 41st International Conference on Infrared, Millimeter, and Terahertz …, 2016 | 2 | 2016 |