An overview of catastrophic AI risks | D Hendrycks, M Mazeika, T Woodside | arXiv preprint arXiv:2306.12001, 2023 | 79* | 2023 |
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark | A Pan, CJ Shern, A Zou, N Li, S Basart, T Woodside, J Ng, H Zhang, ... | International Conference on Machine Learning, 2023 | 66 | 2023 |
Artificial influence: An analysis of AI-driven persuasion | M Burtell, T Woodside | arXiv preprint arXiv:2303.08721, 2023 | 17 | 2023 |
MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding | SH Wang, A Scardigli, L Tang, W Chen, D Levkin, A Chen, S Ball, ... | Empirical Methods in Natural Language Processing, 2023 | 11 | 2023 |
Examples of AI improving AI | T Woodside | Retrieved September, 2023 | 4 | 2023 |
Responsible Reporting for Frontier AI Development | N Kolt, M Anderljung, J Barnhart, A Brass, K Esvelt, GK Hadfield, L Heim, ... | arXiv preprint arXiv:2404.02675, 2024 | | 2024 |
Multimodality, Tool Use, and Autonomous Agents: Large Language Models Explained, Part 3 | T Woodside, H Toner | Center for Security and Emerging Technology, 2024 | | 2024 |
How Developers Steer Language Model Outputs: Large Language Models Explained, Part 2 | T Woodside, H Toner | Center for Security and Emerging Technology, 2024 | | 2024 |
Keeping Up with the Frontier: Why Congress Should Codify Reporting Requirements For Advanced AI Systems | T Woodside | Center for Security and Emerging Technology, 2024 | | 2024 |