Ziniu Li

Cited by

	All	Since 2019
Citations	266	266
h-index	9	9
i10-index	7	7

Citations per year
Year	2019	2020	2021	2022	2023	2024
Citations	1	3	14	40	75	132

Public access (based on funding mandates)

4 articles available
0 articles not available

Co-authors

Tian Xu, Nanjing University. Verified email at lamda.nju.edu.cn
Yang Yu, Professor, Nanjing University. Verified email at nju.edu.cn
Zhi-Quan Luo, Professor, The Chinese University of Hong Kong, Shenzhen, China. Verified email at cuhk.edu.cn
Yushun Zhang, The Chinese University of Hong Kong, Shenzhen, China. Verified email at link.cuhk.edu.cn
Yingru Li, The Chinese University of Hong Kong, Shenzhen, China. Verified email at link.cuhk.edu.cn
Tong Zhang, UIUC. Verified email at tongzhang-ml.org
Congliang Chen, Ph.D. Student, The Chinese University of Hong Kong (Shenzhen). Verified email at link.cuhk.edu.cn
Tian Ding, Shenzhen Research Institute of Big Data. Verified email at sribd.cn
Jiancong Xiao, University of Pennsylvania. Verified email at upenn.edu
Zeyu Qin, Hong Kong University of Science and Technology. Verified email at connect.ust.hk
Weijie Su, Associate Professor, University of Pennsylvania. Verified email at wharton.upenn.edu
Ruoyu Sun, Chinese University of Hong Kong (Shenzhen), Shenzhen Institute of Big Data

Ziniu Li

Other names: Zi-Niu Li

The Chinese University of Hong Kong, Shenzhen

Verified email at link.cuhk.edu.cn

Machine Learning, Reinforcement Learning, Large Language Models


Title	Cited by	Year
Error bounds of imitating policies and environments. T Xu, Z Li, Y Yu. Advances in Neural Information Processing Systems 33, 15737-15749, 2020	101	2020
Error bounds of imitating policies and environments for reinforcement learning. T Xu, Z Li, Y Yu. IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (10), 6968 …, 2021	36	2021
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models. Z Li, T Xu, Y Zhang, Z Lin, Y Yu, R Sun, ZQ Luo. Forty-first International Conference on Machine Learning, 2024	24*	2024
Self-Guided Evolution Strategies with Historical Estimated Gradients. FY Liu, ZN Li, C Qian. IJCAI, 1474-1480, 2020	21	2020
HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning. Z Li, Y Li, Y Zhang, T Zhang, ZQ Luo. International Conference on Learning Representations, 2022	16	2022
Rethinking ValueDice - Does It Really Improve Performance? Z Li, T Xu, Y Yu, ZQ Luo. ICLR Blog, 2022	14	2022
Understanding adversarial imitation learning in small sample regime: A stage-coupled analysis. T Xu, Z Li, Y Yu, ZQ Luo. arXiv preprint arXiv:2208.01899, 2022	11*	2022
When is RL better than DPO in RLHF? A Representation and Optimization Perspective. Z Li, T Xu, Y Yu. ICLR Tiny Paper, 2024	9*	2024
Why transformers need adam: A hessian perspective. Y Zhang, C Chen, T Ding, Z Li, R Sun, ZQ Luo. arXiv preprint arXiv:2402.16788, 2024	9	2024
Imitation learning from imperfection: Theoretical justifications and algorithms. Z Li, T Xu, Z Qin, Y Yu, ZQ Luo. Advances in Neural Information Processing Systems 36, 2024	7*	2024
Provably Efficient Adversarial Imitation Learning with Unknown Transitions. T Xu, Z Li, Y Yu, ZQ Luo. UAI, 2367-2378, 2023	7	2023
On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization. J Xiao, Z Li, X Xie, E Getzen, C Fang, Q Long, WJ Su. arXiv preprint arXiv:2405.16455, 2024	6	2024
Adam-mini: Use fewer learning rates to gain more. Y Zhang, C Chen, Z Li, T Ding, C Wu, Y Ye, ZQ Luo, R Sun. arXiv preprint arXiv:2406.16793, 2024	3	2024
A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle. Z Li, T Xu, Y Yu. arXiv preprint arXiv:2203.11489, 2022	1	2022
Efficient Exploration by Novelty-Pursuit. Z Li, XH Chen. Distributed Artificial Intelligence: Second International Conference, DAI …, 2020	1	2020
Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity. Z Li, C Chen, T Xu, Z Qin, J Xiao, R Sun, ZQ Luo. arXiv preprint arXiv:2408.16673, 2024		2024
Sensing Jamming Strategy from Limited Observations: An Imitation Learning Perspective. Y Fan, B Jiu, W Pu, Z Li, K Li, H Liu. IEEE Transactions on Signal Processing, 2024		2024
BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation. C Jia, P Wang, Z Li, YC Li, Z Zhang, N Tang, Y Yu. arXiv preprint arXiv:2405.17039, 2024		2024
