Daan van Esch
Title
Cited by
Year
Quality at a glance: An audit of web-crawled multilingual datasets
J Kreutzer, I Caswell, L Wang, A Wahab, D van Esch, N Ulzii-Orshikh, ...
Transactions of the Association for Computational Linguistics 10, 50-72, 2022
Cited by: 88; Year: 2022
Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus
I Caswell, T Breiner, D van Esch, A Bapna
arXiv preprint arXiv:2010.14571, 2020
Cited by: 67; Year: 2020
Building Speech Recognition Systems for Language Documentation: The CoEDL Endangered Language Pipeline and Inference System
B Foley, J Arnold, R Coto-Solano, G Durantin, TM Ellison, D van Esch, ...
Proceedings of the 6th International Workshop on Spoken Language …, 2018
Cited by: 67; Year: 2018
Building Machine Translation Systems for the Next Thousand Languages
A Bapna, I Caswell, J Kreutzer, O Firat, D van Esch, A Siddhant, M Niu, ...
arXiv preprint arXiv:2205.03983, 2022
Cited by: 55; Year: 2022
How Might We Create Better Benchmarks for Speech Recognition?
A Aksënova, D van Esch, J Flynn, P Golik
Proceedings of the 1st Workshop on Benchmarking: Past, Present and Future, 22-34, 2021
Cited by: 31; Year: 2021
Future directions in technological support for language documentation
D van Esch, B Foley, N San
Proceedings of the Workshop on Computational Methods for Endangered Languages 1, 2019
Cited by: 19; Year: 2019
Leiden Weibo Corpus
D van Esch
Cited by: 19; Year: 2012
Writing Across the World's Languages: Deep Internationalization for Gboard, the Google Keyboard
D van Esch, E Sarbar, T Lucassen, J O'Brien, T Breiner, M Prasad, E Crew, ...
arXiv preprint arXiv:1912.01218, 2019
Cited by: 18; Year: 2019
An Expanded Taxonomy of Semiotic Classes for Text Normalization
D van Esch, R Sproat
Proceedings of Interspeech 2017, 4016-4020, 2017
Cited by: 18; Year: 2017
Building Large-Vocabulary ASR Systems for Languages Without Any Audio Training Data
M Prasad, D van Esch, S Ritchie, JF Mortensen
Proc. Interspeech 2019, 271-275, 2019
Cited by: 17; Year: 2019
Writing system and speaker metadata for 2,800+ language varieties
D van Esch, T Lucassen, S Ruder, I Caswell, C Rivera
Proceedings of the Thirteenth Language Resources and Evaluation Conference …, 2022
Cited by: 16; Year: 2022
Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data
A Aksënova, Z Chen, CC Chiu, D van Esch, P Golik, W Han, L King, ...
arXiv preprint arXiv:2205.08014, 2022
Cited by: 16; Year: 2022
Text Normalization Infrastructure that Scales to Hundreds of Language Varieties
M Chua, D van Esch, N Coccaro, E Cho, S Bhandari, L Jia
Proceedings of the 11th edition of the Language Resources and Evaluation …, 2018
Cited by: 16; Year: 2018
Predicting Pronunciations with Syllabification and Stress with Recurrent Neural Networks
D van Esch, M Chua, K Rao
Proceedings of Interspeech 2016, 2841-2845, 2016
Cited by: 15; Year: 2016
Xtreme-s: Evaluating cross-lingual speech representations
A Conneau, A Bapna, Y Zhang, M Ma, P von Platen, A Lozhkov, C Cherry, ...
arXiv preprint arXiv:2203.10752, 2022
Cited by: 13; Year: 2022
Mining Training Data for Language Modeling across the World’s Languages
M Prasad, T Breiner, D van Esch
Proceedings of the 6th International Workshop on Spoken Language …, 2018
Cited by: 11; Year: 2018
Unified Verbalization for Speech Recognition & Synthesis Across Languages
S Ritchie, R Sproat, K Gorman, D van Esch, C Schallhart, N Bampounis, ...
Proc. Interspeech 2019, 3530-3534, 2019
Cited by: 9; Year: 2019
Large vocabulary speech recognition for languages of Africa: multilingual modeling and self-supervised learning
S Ritchie, YC Cheng, M Chen, R Mathews, D van Esch, B Li, KC Sim
arXiv preprint arXiv:2208.03067, 2022
Cited by: 6; Year: 2022
Developing Pronunciation Models in New Languages Faster by Exploiting Common Grapheme-to-Phoneme Correspondences Across Languages
H Bleyan, S Ritchie, JF Mortensen, D van Esch
Proc. Interspeech 2019, 2100-2104, 2019
Cited by: 6; Year: 2019
Data-Driven Parametric Text Normalization: Rapidly Scaling Finite-State Transduction Verbalizers to New Languages
S Ritchie, E Mahon, K Heiligenstein, N Bampounis, D van Esch, ...
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for …, 2020
Cited by: 4; Year: 2020