Factual probing is [MASK]: Learning vs. learning to recall. Z Zhong, D Friedman, D Chen. arXiv preprint arXiv:2104.05240, 2021 | 398 | 2021 |
Scisummnet: A large annotated corpus and content-impact models for scientific paper summarization with citation networks. M Yasunaga, J Kasai, R Zhang, AR Fabbri, I Li, D Friedman, DR Radev. Proceedings of the AAAI conference on artificial intelligence 33 (01), 7386-7393, 2019 | 233 | 2019 |
Embers of autoregression: Understanding large language models through the problem they are trained to solve. RT McCoy, S Yao, D Friedman, M Hardy, TL Griffiths. arXiv preprint arXiv:2309.13638, 2023 | 112 | 2023 |
The vendi score: A diversity evaluation metric for machine learning. D Friedman, AB Dieng. arXiv preprint arXiv:2210.02410, 2022 | 78 | 2022 |
Syntax-aware neural semantic role labeling with supertags. J Kasai, D Friedman, R Frank, D Radev, O Rambow. arXiv preprint arXiv:1903.05260, 2019 | 43 | 2019 |
Learning transformer programs. D Friedman, A Wettig, D Chen. Advances in Neural Information Processing Systems 36, 2024 | 35 | 2024 |
Measuring inductive biases of in-context learning with underspecified demonstrations. C Si, D Friedman, N Joshi, S Feng, D Chen, H He. arXiv preprint arXiv:2305.13299, 2023 | 31 | 2023 |
Single-dataset experts for multi-dataset question answering. D Friedman, B Dodge, D Chen. arXiv preprint arXiv:2109.13880, 2021 | 26 | 2021 |
Finding dataset shortcuts with grammar induction. D Friedman, A Wettig, D Chen. arXiv preprint arXiv:2210.11560, 2022 | 11 | 2022 |
Embers of autoregression show how large language models are shaped by the problem they are trained to solve. RT McCoy, S Yao, D Friedman, MD Hardy, TL Griffiths. Proceedings of the National Academy of Sciences 121 (41), e2322420121, 2024 | 10 | 2024 |
Interpretability illusions in the generalization of simplified models. D Friedman, A Lampinen, L Dixon, D Chen, A Ghandeharioun. arXiv preprint arXiv:2312.03656, 2023 | 8 | 2023 |
The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models. A Bhaskar, D Friedman, D Chen. arXiv preprint arXiv:2403.03942, 2024 | 7 | 2024 |
Linguistically rich vector representations of supertags for TAG parsing. D Friedman, J Kasai, RT McCoy, R Frank, F Davis, O Rambow. Proceedings of the 13th International Workshop on Tree Adjoining Grammars …, 2017 | 4 | 2017 |
What Spurious Features Can Pretrained Language Models Combat? C Si, D Friedman, N Joshi, S Feng, D Chen, H He | 3 | 2023 |
Finding transformer circuits with edge pruning. A Bhaskar, A Wettig, D Friedman, D Chen. arXiv preprint arXiv:2406.16778, 2024 | 2 | 2024 |
Comparing Representational and Functional Similarity in Small Transformer Language Models. D Friedman, AK Lampinen, L Dixon, D Chen, A Ghandeharioun. UniReps: the First Workshop on Unifying Representations in Neural Models, 2023 | 2 | 2023 |
When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1. RT McCoy, S Yao, D Friedman, MD Hardy, TL Griffiths. arXiv preprint arXiv:2410.01792, 2024 | 1 | 2024 |
Representing rule-based chatbots with transformers. D Friedman, A Panigrahi, D Chen. arXiv preprint arXiv:2407.10949, 2024 | 1 | 2024 |
Continual Memorization of Factoids in Large Language Models. H Chen, J Geng, A Bhaskar, D Friedman, D Chen. arXiv preprint arXiv:2411.07175, 2024 | | 2024 |
A Neural Network Approach to Value-at-Risk Forecasting. D Friedman, A Matell | | 2024 |