GDI: Rethinking What Makes Reinforcement Learning Different from Supervised Learning J Fan, C Xiao, Y Huang arXiv preprint arXiv:2106.06232, 2021 | 9 | 2021 |
A review for deep reinforcement learning in Atari: Benchmarks, challenges, and solutions J Fan arXiv preprint arXiv:2112.04145, 2021 | 8 | 2021 |
Generalized Data Distribution Iteration J Fan, C Xiao International Conference on Machine Learning, ICML 2022, 17-23 July 2022 …, 2022 | 7 | 2022 |
Learnable behavior control: Breaking Atari human world records via sample-efficient behavior selection J Fan, Y Zhuang, Y Liu, J Hao, B Wang, J Zhu, H Wang, ST Xia The Eleventh International Conference on Learning Representations, 2023 | 6 | 2023 |
An Entropy Regularization Free Mechanism for Policy-based Reinforcement Learning C Xiao, H Shi, J Fan, S Deng arXiv preprint arXiv:2106.00707, 2021 | 4 | 2021 |
CASA: A bridge between gradient of policy improvement and policy evaluation C Xiao, H Shi, J Fan, S Deng arXiv preprint arXiv:2105.03923, 2021 | 4 | 2021 |
Critic PI2: Master continuous planning via policy improvement with path integrals and deep actor-critic reinforcement learning J Fan, H Ba, X Guo, J Hao arXiv preprint arXiv:2011.06752, 2020 | 4 | 2020 |
Entire Space Counterfactual Learning: Tuning, Analytical Properties and Industrial Applications H Wang, Z Chen, J Fan, Y Huang, W Liu, X Liu arXiv preprint arXiv:2210.11039, 2022 | 2 | 2022 |
Optimal Transport for Treatment Effect Estimation H Wang, Z Chen, J Fan, H Li, T Liu, W Liu, Q Dai, Y Wang, Z Dong, ... arXiv preprint arXiv:2310.18286, 2023 | 1 | 2023 |
Convformer: Revisiting transformer for sequential user modeling H Wang, J Lian, M Wu, H Li, J Fan, W Xu, C Li, X Xie arXiv preprint arXiv:2308.02925, 2023 | 1 | 2023 |
Sinkhorn Discrepancy for Counterfactual Generalization H Wang, Q Dai, J Fan, W Liu, Z Chen, T Liu, Y Wang, Z Dong, R Tang | | 2022 |