Han Zhong

Cited by

	All	Since 2019
Citations	472	472
h-index	13	13
i10-index	14	14

280

140

210

20212022202320245 40 147 278

Public access

View all

5 articles

0 articles

available

not available

Based on funding mandates

Co-authors

Liwei WangProfessor, Peking UniversityVerified email at cis.pku.edu.cn
Tong ZhangUIUCVerified email at tongzhang-ml.org
Wei XiongComputer Science, University of Illinois Urbana-ChampaignVerified email at illinois.edu
Zhaoran WangAssistant Professor at Northwestern UniversityVerified email at northwestern.edu
Zhuoran YangYale UniversityVerified email at yale.edu
Simon Shaolei DuAssistant Professor, School of Computer Science and Engineering, University of WashingtonVerified email at cs.washington.edu
Yunchang YangPeking UniversityVerified email at pku.edu.cn
Tianhao WuUniversity of California, BerkeleyVerified email at berkeley.edu
Chengshuai ShiElectrical and Computer Engineering, University of VirginiaVerified email at virginia.edu
Cong ShenAssociate Professor, University of VirginiaVerified email at virginia.edu
Hanze DongSalesforce ResearchVerified email at salesforce.com
Chenlu YeHong Kong University of Science and TechnologyVerified email at connect.ust.hk
Michael I. JordanProfessor of Electrical Engineering and Computer Sciences and Professor of Statistics, UC BerkeleyVerified email at cs.berkeley.edu
Shenao ZhangNorthwestern UniversityVerified email at gatech.edu
Xiaoyu ChenPeking UniversityVerified email at pku.edu.cn
Jose BlanchetStanford UniversityVerified email at stanford.edu
Rui YangHong Kong University of Science and TechnologyVerified email at connect.ust.hk
Jiyuan TanStanford UniversityVerified email at stanford.edu
Lin F. Yang (杨林)Assistant Professor, Department of Electrical and Computer Engineering @ UCLAVerified email at ee.ucla.edu
Jiayi HuangPeking UniversityVerified email at stu.pku.edu.cn

Han Zhong

Peking University

Verified email at stu.pku.edu.cn - Homepage

Machine Learning


Title Sort by citations Sort by year Sort by title	Cited by Cited by	Year
GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond H Zhong, W Xiong, S Zheng, L Wang, Z Wang, Z Yang, T Zhang arXiv preprint arXiv:2211.01962, 2022	52*	2022
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation X Chen, H Zhong, Z Yang, Z Wang, L Wang International Conference on Machine Learning, 3773-3793, 2022	47	2022
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint W Xiong, H Dong, C Ye, Z Wang, H Zhong, H Ji, N Jiang, T Zhang ICLR 2024 Workshop on Mathematical and Empirical Understanding of Foundation …, 2023	46*	2023
Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game W Xiong, H Zhong, C Shi, C Shen, L Wang, T Zhang arXiv preprint arXiv:2205.15512, 2022	42	2022
Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopically Rational Followers? H Zhong, Z Yang, Z Wang, MI Jordan Journal of Machine Learning Research 24 (35), 1-52, 2023	41*	2023
Pessimistic minimax value iteration: Provably efficient equilibrium learning from offline datasets H Zhong, W Xiong, J Tan, L Wang, T Zhang, Z Wang, Z Yang International Conference on Machine Learning, 27117-27142, 2022	40	2022
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration Z Liu, M Lu, W Xiong, H Zhong, H Hu, S Zhang, S Zheng, Z Yang, Z Wang Thirty-seventh Conference on Neural Information Processing Systems, 2023	26*	2023
A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games W Xiong, H Zhong, C Shi, C Shen, T Zhang International Conference on Machine Learning, 24496-24523, 2022	25	2022
A theoretical analysis of optimistic proximal policy optimization in linear markov decision processes H Zhong, T Zhang Advances in Neural Information Processing Systems 36, 2024	22	2024
Why robust generalization in deep learning is difficult: Perspective of expressive power B Li, J Jin, H Zhong, J Hopcroft, L Wang Advances in Neural Information Processing Systems 35, 4370-4384, 2022	21	2022
Double pessimism is provably efficient for distributionally robust offline reinforcement learning: Generic algorithm and robust partial coverage J Blanchet, M Lu, T Zhang, H Zhong Advances in Neural Information Processing Systems 36, 2024	18	2024
Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs H Zhong, Z Yang, Z Wang, C Szepesvári arXiv preprint arXiv:2110.08984, 2021	18	2021
Nearly optimal policy optimization with stable at any time guarantee T Wu, Y Yang, H Zhong, L Wang, S Du, J Jiao International Conference on Machine Learning, 24243-24265, 2022	14	2022
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment R Yang, X Pan, F Luo, S Qiu, H Zhong, D Yu, J Chen arXiv preprint arXiv:2402.10207, 2024	10	2024
DPO Meets PPO: Reinforced Token Optimization for RLHF H Zhong, G Feng, W Xiong, L Zhao, D He, J Bian, L Wang arXiv preprint arXiv:2404.18922, 2024	9	2024
Towards Robust Offline Reinforcement Learning under Diverse Data Corruption R Yang, H Zhong, J Xu, A Zhang, C Zhang, L Han, T Zhang arXiv preprint arXiv:2310.12955, 2023	8	2023
Tackling heavy-tailed rewards in reinforcement learning with function approximation: Minimax optimal and instance-dependent regret bounds J Huang, H Zhong, L Wang, L Yang Advances in Neural Information Processing Systems 36, 2024	6	2024
Provable Sim-to-real Transfer in Continuous Domain with Partial Observations J Hu, H Zhong, C Jin, L Wang arXiv preprint arXiv:2210.15598, 2022	6	2022
Breaking the Moments Condition Barrier: No-Regret Algorithm for Bandits with Super Heavy-Tailed Payoffs H Zhong, J Huang, L Yang, L Wang Advances in Neural Information Processing Systems 34, 2021	6	2021
A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning Y Yang, T Wu, H Zhong, E Garcelon, M Pirotta, A Lazaric, L Wang, SS Du International Conference on Learning Representations, 2021/9/29, 2021	6*	2021

The system can't perform the operation now. Try again later.

Articles 1–20

Citations per year

Duplicate citations

Merged citations

Add co-authorsCo-authors

Follow

Cited by

Co-authors