Hao Sun
Verified email at mails.tsinghua.edu.cn
Title
Cited by
Year
Safety assessment of Chinese large language models
H Sun, Z Zhang, J Deng, J Cheng, M Huang
arXiv preprint arXiv:2304.10436, 2023
Cited by: 50; Year: 2023
On the safety of conversational models: Taxonomy, dataset, and benchmark
H Sun, G Xu, J Deng, J Cheng, C Zheng, H Zhou, N Peng, X Zhu, ...
arXiv preprint arXiv:2110.08466, 2021
Cited by: 49; Year: 2021
PsyQA: A Chinese dataset for generating long counseling text for mental health support
H Sun, Z Lin, C Zheng, S Liu, M Huang
arXiv preprint arXiv:2106.01702, 2021
Cited by: 45; Year: 2021
EVA: An open-domain Chinese dialogue system with large-scale generative pre-training
H Zhou, P Ke, Z Zhang, Y Gu, Y Zheng, C Zheng, Y Wang, CH Wu, H Sun, ...
arXiv preprint arXiv:2108.01547, 2021
Cited by: 44; Year: 2021
COLD: A benchmark for Chinese offensive language detection
J Deng, J Zhou, H Sun, C Zheng, F Mi, H Meng, M Huang
arXiv preprint arXiv:2201.06025, 2022
Cited by: 42; Year: 2022
EVA2.0: Investigating open-domain Chinese dialogue systems with large-scale pre-training
Y Gu, J Wen, H Sun, Y Song, P Ke, C Zheng, Z Zhang, J Yao, L Liu, X Zhu, ...
Machine Intelligence Research 20 (2), 207-219, 2023
Cited by: 34; Year: 2023
Recent advances towards safe, responsible, and moral dialogue systems: A survey
J Deng, H Sun, Z Zhang, J Cheng, M Huang
arXiv preprint arXiv:2302.09270, 2023
Cited by: 23; Year: 2023
Unveiling the implicit toxicity in large language models
J Wen, P Ke, H Sun, Z Zhang, C Li, J Bai, M Huang
arXiv preprint arXiv:2311.17391, 2023
Cited by: 12; Year: 2023
PAL: Persona-augmented emotional support conversation generation
J Cheng, S Sabour, H Sun, Z Chen, M Huang
arXiv preprint arXiv:2212.09235, 2022
Cited by: 9; Year: 2022
Constructing highly inductive contexts for dialogue safety through controllable reverse generation
Z Zhang, J Cheng, H Sun, J Deng, F Mi, Y Wang, L Shang, M Huang
arXiv preprint arXiv:2212.01810, 2022
Cited by: 6; Year: 2022
InstructSafety: A Unified Framework for Building Multidimensional and Explainable Safety Detector through Instruction Tuning
Z Zhang, J Cheng, H Sun, J Deng, M Huang
Findings of the Association for Computational Linguistics: EMNLP 2023, 10421 …, 2023
Cited by: 1; Year: 2023
Enhancing Offensive Language Detection with Data Augmentation and Knowledge Distillation
J Deng, Z Chen, H Sun, Z Zhang, J Wu, S Nakagawa, F Ren, M Huang
Research 6, 0189, 2023
Cited by: 1; Year: 2023
MoralDial: A framework to train and evaluate moral dialogue systems via constructing moral discussions
H Sun, Z Zhang, F Mi, Y Wang, W Liu, J Cui, B Wang, Q Liu, M Huang
arXiv preprint arXiv:2212.10720, 2022
Cited by: 1; Year: 2022
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors
Z Zhang, Y Lu, J Ma, D Zhang, R Li, P Ke, H Sun, L Sha, Z Sui, H Wang, ...
arXiv preprint arXiv:2402.16444, 2024
Year: 2024