Yacine Jernite
Research Scientist, HuggingFace
Verified email at cs.nyu.edu
Title
Cited by
Year
Transformers: State-of-the-art natural language processing
T Wolf, L Debut, V Sanh, J Chaumond, C Delangue, A Moi, P Cistac, ...
Proceedings of the 2020 conference on empirical methods in natural language …, 2020
5682
2020
Huggingface's transformers: State-of-the-art natural language processing
T Wolf
arXiv preprint arXiv:1910.03771, 2019
3378
2019
Character-aware neural language models
Y Kim, Y Jernite, D Sontag, A Rush
Proceedings of the AAAI conference on artificial intelligence 30 (1), 2016
2213
2016
Bloom: A 176b-parameter open-access multilingual language model
T Le Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
1626
2023
Starcoder: may the source be with you!
R Li, LB Allal, Y Zi, N Muennighoff, D Kocetkov, C Mou, M Marone, C Akiki, ...
arXiv preprint arXiv:2305.06161, 2023
717
2023
ELI5: Long form question answering
A Fan, Y Jernite, E Perez, D Grangier, J Weston, M Auli
arXiv preprint arXiv:1907.09190, 2019
532
2019
KILT: a benchmark for knowledge intensive language tasks
F Petroni, A Piktus, A Fan, P Lewis, M Yazdani, N De Cao, J Thorne, ...
arXiv preprint arXiv:2009.02252, 2020
500
2020
Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning
S Horng, DA Sontag, Y Halpern, Y Jernite, NI Shapiro, LA Nathanson
PloS one 12 (4), e0174708, 2017
306
2017
Datasets: A community library for natural language processing
Q Lhoest, AV Del Moral, Y Jernite, A Thakur, P Von Platen, S Patil, ...
arXiv preprint arXiv:2109.02846, 2021
275
2021
Datasets: A community library for …
Q Lhoest, AV Del Moral, Y Jernite, A Thakur, P Von Platen, S Patil, ...
Proceedings of the 2021 Conference on Empirical Methods in Natural Language …, 2021
259
2021
The stack: 3 tb of permissively licensed source code
D Kocetkov, R Li, LB Allal, J Li, C Mou, CM Ferrandis, Y Jernite, M Mitchell, ...
arXiv preprint arXiv:2211.15533, 2022
241
2022
SantaCoder: don't reach for the stars!
LB Allal, R Li, D Kocetkov, C Mou, C Akiki, CM Ferrandis, N Muennighoff, ...
arXiv preprint arXiv:2301.03988, 2023
207
2023
The bigscience roots corpus: A 1.6 tb composite multilingual dataset
H Laurençon, L Saulnier, T Wang, C Akiki, A Villanova del Moral, ...
Advances in Neural Information Processing Systems 35, 31809-31826, 2022
170
2022
The gem benchmark: Natural language generation, its evaluation and metrics
S Gehrmann, T Adewumi, K Aggarwal, PS Ammanamanchi, ...
arXiv preprint arXiv:2102.01672, 2021
152
2021
Starcoder 2 and the stack v2: The next generation
A Lozhkov, R Li, LB Allal, F Cassano, J Lamy-Poirier, N Tazi, A Tang, ...
arXiv preprint arXiv:2402.19173, 2024
141
2024
Stable bias: Analyzing societal representations in diffusion models
AS Luccioni, C Akiki, M Mitchell, Y Jernite
arXiv preprint arXiv:2303.11408, 2023
139
2023
Quality at a glance: An audit of web-crawled multilingual datasets
J Kreutzer, I Caswell, L Wang, A Wahab, D van Esch, N Ulzii-Orshikh, ...
Transactions of the Association for Computational Linguistics 10, 50-72, 2022
136
2022
Power hungry processing: Watts driving the cost of AI deployment?
S Luccioni, Y Jernite, E Strubell
The 2024 ACM Conference on Fairness, Accountability, and Transparency, 85-99, 2024
125
2024
Discourse-based objectives for fast unsupervised sentence representation learning
Y Jernite, SR Bowman, D Sontag
arXiv preprint arXiv:1705.00557, 2017
114
2017
Quality at a glance: An audit of web-crawled multilingual datasets
J Kreutzer, I Caswell, L Wang, A Wahab, D Van Esch, N Ulzii-Orshikh, ...
Sakine Çabuk Ballı, Stella Biderman, Alessia Battisti, Ahmed Baruwa, Ankur …, 2022
102
2022