Zhengfu He
Title
Cited by
Year
DiffusionBERT: Improving generative masked language models with diffusion models
Z He, T Sun, K Wang, X Huang, X Qiu
arXiv preprint arXiv:2211.15029, 2022
Cited by: 90; Year: 2022
BBTv2: Towards a Gradient-Free Future with Large Language Models
T Sun, Z He, H Qian, Y Zhou, XJ Huang, X Qiu
Proceedings of the 2022 Conference on Empirical Methods in Natural Language …, 2022
Cited by: 83*; Year: 2022
MOSS: An Open Conversational Large Language Model
T Sun, X Zhang, Z He, P Li, Q Cheng, X Liu, H Yan, Y Shao, Q Tang, ...
Machine Intelligence Research, 1-18, 2024
Cited by: 74*; Year: 2024
Multitask pre-training of modular prompt for Chinese few-shot learning
T Sun, Z He, Q Zhu, X Qiu, X Huang
arXiv preprint arXiv:2210.07565, 2022
Cited by: 23; Year: 2022
Can AI Assistants Know What They Don't Know?
Q Cheng, T Sun, X Liu, W Zhang, Z Yin, S Li, L Li, Z He, K Chen, X Qiu
arXiv preprint arXiv:2401.13275, 2024
Cited by: 22; Year: 2024
Dictionary learning improves patch-free circuit discovery in mechanistic interpretability: A case study on Othello-GPT
Z He, X Ge, Q Tang, T Sun, Q Cheng, X Qiu
arXiv preprint arXiv:2402.12201, 2024
Cited by: 14*; Year: 2024
Competition for gradient-free tuning of large language models: approaches, results, current challenges and future directions
T Cao, L Chen, D Zhang, T Sun, Z He, X Qiu, X Xu, H Zhang
National Science Review 10 (6), nwad124, 2023
Cited by: 4; Year: 2023
Automatically Identifying Local and Global Circuits with Linear Computation Graphs
X Ge, F Zhu, W Shu, J Wang, Z He, X Qiu
arXiv preprint arXiv:2405.13868, 2024
Cited by: 3; Year: 2024
Towards Universality: Studying mechanistic similarity across language model architectures
J Wang, X Ge, W Shu, Q Tang, Y Zhou, Z He, X Qiu
arXiv preprint arXiv:2410.06672, 2024
Cited by: 2; Year: 2024
Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders
Z He, W Shu, X Ge, L Chen, J Wang, Y Zhou, F Liu, Q Guo, X Huang, ...
arXiv preprint arXiv:2410.20526, 2024
Year: 2024
Generate Point Clouds with Multiscale Details from Graph-Represented Structures
X Yang, Z He, C Jin
arXiv preprint arXiv:2112.06433, 2021
Year: 2021