Peter Hase

Cited by

	All	Since 2019
Citations	1098	1097
h-index	13	13
i10-index	15	15

Citations per year:
Year	Citations
2019	3
2020	22
2021	108
2022	237
2023	428
2024	295

Public access

3 articles available, 0 articles not available (based on funding mandates)

Co-authors

Mohit Bansal - Parker Distinguished Professor, Computer Science, UNC Chapel Hill - Verified email at cs.unc.edu
Cynthia Rudin - Professor of Computer Science, ECE, Statistics, and Biostatistics & Bioinformatics, Duke University - Verified email at cs.duke.edu
Swarnadeep Saha - PhD Student, University of North Carolina at Chapel Hill - Verified email at cs.unc.edu
Shiyue Zhang - UNC Chapel Hill - Verified email at cs.unc.edu
Srini Iyer - FAIR - Verified email at fb.com
Asma Ghandeharioun - Research Scientist, Google Research - Verified email at google.com
Been Kim - Google DeepMind - Verified email at csail.mit.edu
Zhuofan Ying - Columbia University - Verified email at columbia.edu
Peter Clark - Allen Institute for Artificial Intelligence (AI2) - Verified email at allenai.org
Sarah Wiegreffe - Allen Institute for AI & University of Washington - Verified email at allenai.org

Peter Hase

PhD Student, University of North Carolina at Chapel Hill

Verified email at cs.unc.edu - Homepage

Interpretable Machine Learning, Natural Language Processing


Title	Cited by	Year
Evaluating explainable AI: Which algorithmic explanations help users predict model behavior? - P Hase, M Bansal - arXiv preprint arXiv:2005.01831, 2020	256	2020
Open problems and fundamental limitations of reinforcement learning from human feedback - S Casper, X Davies, C Shi, TK Gilbert, J Scheurer, J Rando, R Freedman, ... - arXiv preprint arXiv:2307.15217, 2023	154	2023
Interpretable image recognition with hierarchical prototypes - P Hase, C Chen, O Li, C Rudin - Proceedings of the AAAI Conference on Human Computation and Crowdsourcing 7 …, 2019	104	2019
GrIPS: Gradient-free, edit-based instruction search for prompting large language models - A Prasad, P Hase, X Zhou, M Bansal - arXiv preprint arXiv:2203.07281, 2022	97	2022
Do language models have beliefs? Methods for detecting, updating, and visualizing model beliefs - P Hase, M Diab, A Celikyilmaz, X Li, Z Kozareva, V Stoyanov, M Bansal, ... - arXiv preprint arXiv:2111.13654, 2021	78*	2021
Leakage-adjusted simulatability: Can models generate non-trivial explanations of their behavior in natural language? - P Hase, S Zhang, H Xie, M Bansal - arXiv preprint arXiv:2010.04119, 2020	76	2020
FastIF: Scalable influence functions for efficient model interpretation and debugging - H Guo, NF Rajani, P Hase, M Bansal, C Xiong - arXiv preprint arXiv:2012.15781, 2020	73	2020
The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations - P Hase, H Xie, M Bansal - Advances in Neural Information Processing Systems 34, 2021	64	2021
When can models learn from explanations? A formal framework for understanding the roles of explanation data - P Hase, M Bansal - arXiv preprint arXiv:2102.02201, 2021	61	2021
Does localization inform editing? Surprising differences in causality-based localization vs. knowledge editing in language models - P Hase, M Bansal, B Kim, A Ghandeharioun - Advances in Neural Information Processing Systems 36, 2024	50	2024
Summarization programs: Interpretable abstractive summarization with neural modular trees - S Saha, S Zhang, P Hase, M Bansal - arXiv preprint arXiv:2209.10492, 2022	15	2022
Can Language Models Teach? Teacher Explanations Improve Student Performance via Personalization - S Saha, P Hase, M Bansal - Advances in Neural Information Processing Systems 36, 2024	13*	2024
Low-cost algorithmic recourse for users with uncertain cost functions - P Yadav, P Hase, M Bansal - arXiv preprint arXiv:2111.01235, 2021	13	2021
Can sensitive information be deleted from LLMs? Objectives for defending against extraction attacks - V Patil, P Hase, M Bansal - arXiv preprint arXiv:2309.17410, 2023	12	2023
Rethinking Machine Unlearning for Large Language Models - S Liu, Y Yao, J Jia, S Casper, N Baracaldo, P Hase, X Xu, Y Yao, H Li, ... - arXiv preprint arXiv:2402.08787, 2024	10	2024
VisFIS: Visual feature importance supervision with right-for-the-right-reason objectives - Z Ying, P Hase, M Bansal - Advances in Neural Information Processing Systems 35, 17057-17072, 2022	9	2022
Are hard examples also harder to explain? A study with human and model-generated explanations - S Saha, P Hase, N Rajani, M Bansal - arXiv preprint arXiv:2211.07517, 2022	7	2022
Shall I compare thee to a machine-written sonnet? An approach to algorithmic sonnet generation - J Benhardt, P Hase, L Zhu, C Rudin - arXiv preprint arXiv:1811.05067, 2018	5	2018
The unreasonable effectiveness of easy training data for hard tasks - P Hase, M Bansal, P Clark, S Wiegreffe - arXiv preprint arXiv:2401.06751, 2024	1	2024
Foundational Challenges in Assuring Alignment and Safety of Large Language Models - U Anwar, A Saparov, J Rando, D Paleka, M Turpin, P Hase, ES Lubana, ... - arXiv preprint arXiv:2404.09932, 2024		2024
