SelfCheckGPT: Zero-resource black-box hallucination detection for generative large language models P Manakul, A Liusie, MJF Gales EMNLP 2023, 2023 | 474 | 2023 |
Zero-shot NLG evaluation through Pairwise Comparisons with LLMs A Liusie, P Manakul, MJF Gales EACL 2024, 2023 | 48 | 2023 |
MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization P Manakul, A Liusie, MJF Gales IJCNLP-AACL 2023, 2023 | 28 | 2023 |
Rewarding Chatbots for Real-World Engagement with Millions of Users R Irvine, D Boubert, V Raina, A Liusie, V Mudupalli, A Korshuk, Z Liu, ... arXiv preprint arXiv:2303.06135, 2023 | 16 | 2023 |
Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment V Raina, A Liusie, M Gales arXiv preprint arXiv:2402.14016, 2024 | 14 | 2024 |
CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models P Manakul, Y Fathullah, A Liusie, V Raina, V Raina, M Gales BioNLP Workshop @ ACL 2023, 2023 | 13 | 2023 |
Analyzing Biases to Spurious Correlations in Text Classification Tasks A Liusie, V Raina, V Raina, M Gales IJCNLP-AACL 2022, 2022 | 10 | 2022 |
Mitigating Word Bias in Zero-shot Prompt-based Classifiers A Liusie, P Manakul, MJF Gales IJCNLP-AACL 2023, 2023 | 7 | 2023 |
The Cambridge Multiple-Choice Questions Reading Dataset A Mullooly, Ø Andersen, L Benedetto, P Buttery, A Caines, MJF Gales, ... Cambridge University Press and Assessment, 2023 | 7 | 2023 |
Blending is all you need: Cheaper, better alternative to trillion-parameters LLM X Lu, Z Liu, A Liusie, V Raina, V Mudupalli, Y Zhang, W Beauchamp arXiv preprint arXiv:2401.02994, 2024 | 6 | 2024 |
Investigating the Emergent Audio Classification Ability of ASR Foundation Models R Ma, A Liusie, MJF Gales, KM Knill NAACL 2024, 2023 | 5 | 2023 |
Analysis of the Cambridge multiple-choice questions reading dataset with a focus on candidate response distribution A Liusie, V Raina, A Mullooly, K Knill, MJF Gales arXiv e-prints, arXiv:2306.13047, 2023 | 5 | 2023 |
WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models P Molenda, A Liusie, MJF Gales NAACL 2024 (findings), 2024 | 4 | 2024 |
Teacher-Student Training for Debiasing: General Permutation Debiasing for Large Language Models A Liusie, Y Fathullah, MJF Gales arXiv preprint arXiv:2403.13590, 2024 | 4 | 2024 |
"World Knowledge" in Multiple Choice Reading Comprehension A Liusie, V Raina, M Gales Proceedings of the Sixth Fact Extraction and VERification Workshop (FEVER …, 2022 | 4 | 2022 |
Automatic Assessment of Conversational Speaking Tests SW McKnight, A Civelekoglu, MJF Gales, S Bannò, A Liusie, KM Knill Proceedings of 9th Workshop on Speech and Language Technology in Education …, 2023 | 3 | 2023 |
Efficient LLM Comparative Assessment: a Product of Experts Framework for Pairwise Comparisons A Liusie, V Raina, Y Fathullah, M Gales arXiv preprint arXiv:2405.05894, 2024 | 1 | 2024 |
CamChoice: A corpus of multiple choice questions and candidate response distributions A Liusie, V Raina, A Mullooly, K Knill, MJF Gales arXiv preprint arXiv:2306.13047, 2023 | 1 | 2023 |
Who Needs Decoders? Efficient Estimation of Sequence-level Attributes Y Fathullah, P Radmard, A Liusie, MJF Gales arXiv preprint arXiv:2305.05098, 2023 | 1 | 2023 |
University of Cambridge at TREC CAsT 2022 A Liusie, M Qian, X Li, M Gales The 30th Text Retrieval Conference (TREC), 2022 | 1 | 2022 |