MetaCURE: Meta reinforcement learning with empowerment-driven exploration J Zhang, J Wang, H Hu, T Chen, Y Chen, C Fan, C Zhang Thirty-eighth International Conference on Machine Learning (ICML 2021 …, 2021 | 48* | 2021 |
Offline Reinforcement Learning with Value-based Episodic Memory X Ma*, Y Yang*, H Hu*, Q Liu, J Yang, C Zhang, Q Zhao, B Liang Tenth International Conference on Learning Representations (ICLR 2022), 2021 | 45 | 2021 |
Generalizable episodic memory for deep reinforcement learning H Hu, J Ye, G Zhu, Z Ren, C Zhang Thirty-eighth International Conference on Machine Learning (ICML 2021), 2021 | 42 | 2021 |
Reason for future, act for now: A principled framework for autonomous LLM agents with provable sample efficiency Z Liu, H Hu, S Zhang, H Guo, S Ke, B Liu, Z Wang arXiv preprint arXiv:2309.17382, 2023 | 35* | 2023 |
Maximize to explore: One objective function fusing estimation, planning, and exploration Z Liu, M Lu, W Xiong, H Zhong, H Hu, S Zhang, S Zheng, Z Yang, Z Wang Advances in Neural Information Processing Systems 36, 2024 | 32* | 2024 |
On the Estimation Bias in Double Q-Learning Z Ren, G Zhu, H Hu, B Han, J Chen, C Zhang Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS …, 2021 | 22 | 2021 |
On the Role of Discount Factor in Offline Reinforcement Learning H Hu, Y Yang, Q Zhao, C Zhang Thirty-ninth International Conference on Machine Learning (ICML 2022), 2022 | 19 | 2022 |
What is essential for unseen goal generalization of offline goal-conditioned RL? R Yang, L Yong, X Ma, H Hu, C Zhang, T Zhang International Conference on Machine Learning, 39543-39571, 2023 | 18 | 2023 |
Flow to Control: Offline Reinforcement Learning with Lossless Primitive Discovery Y Yang*, H Hu*, W Li*, S Li, J Yang, Q Zhao, C Zhang Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2023, 2022 | 14 | 2022 |
The provable benefits of unsupervised data sharing for offline reinforcement learning H Hu, Y Yang, Q Zhao, C Zhang arXiv preprint arXiv:2302.13493, 2023 | 12 | 2023 |
Unsupervised behavior extraction via random intent priors H Hu, Y Yang, J Ye, Z Mai, C Zhang Advances in Neural Information Processing Systems 36, 51491-51514, 2023 | 5 | 2023 |
Stylized offline reinforcement learning: Extracting diverse high-quality behaviors from heterogeneous datasets Y Mao, C Wu, X Chen, H Hu, J Jiang, T Zhou, T Lv, C Fan, Z Hu, Y Wu, ... The Twelfth International Conference on Learning Representations, 2024 | 4 | 2024 |
Bayesian Design Principles for Offline-to-Online Reinforcement Learning H Hu, Y Yang, J Ye, C Wu, Z Mai, Y Hu, T Lv, C Fan, Q Zhao, C Zhang arXiv preprint arXiv:2405.20984, 2024 | 1 | 2024 |
Planning, Fast and Slow: Online Reinforcement Learning with Action-Free Offline Data via Multiscale Planners C Wu, H Hu, Y Yang, N Zhang, C Zhang Forty-first International Conference on Machine Learning, 2024 | | 2024 |
Query-Efficient Offline Preference-Based Reinforcement Learning via In-Dataset Exploration H Hu, Y Yang, J Zhang, S Wang, B Liu, Y Gao, C Zhang | | |