Jakub Grudzien Kuba
Verified email at berkeley.edu
Title
Cited by
Year
Trust region policy optimisation in multi-agent reinforcement learning
JG Kuba, R Chen, M Wen, Y Wen, F Sun, J Wang, Y Yang
International Conference on Learning Representations 2022, 2021
Cited by: 226; Year: 2021
Multi-agent reinforcement learning is a sequence modeling problem
M Wen, J Kuba, R Lin, W Zhang, Y Wen, J Wang, Y Yang
Advances in Neural Information Processing Systems 35, 16509-16521, 2022
Cited by: 167; Year: 2022
Safe multi-agent reinforcement learning for multi-robot control
S Gu, JG Kuba, Y Chen, Y Du, L Yang, A Knoll, Y Yang
Artificial Intelligence 319, 103905, 2023
Cited by: 95*; Year: 2023
IDQL: Implicit Q-learning as an actor-critic method with diffusion policies
P Hansen-Estruch, I Kostrikov, M Janner, JG Kuba, S Levine
arXiv preprint arXiv:2304.10573, 2023
Cited by: 91; Year: 2023
Discovered policy optimisation
C Lu, J Kuba, A Letcher, L Metz, C Schroeder de Witt, J Foerster
Advances in Neural Information Processing Systems 35, 16455-16468, 2022
Cited by: 66; Year: 2022
Settling the variance of multi-agent policy gradients
JG Kuba, M Wen, L Meng, H Zhang, D Mguni, J Wang, Y Yang
Advances in Neural Information Processing Systems 34, 13458-13470, 2021
Cited by: 58; Year: 2021
Heterogeneous-agent mirror learning: A continuum of solutions to cooperative MARL
JG Kuba, X Feng, S Ding, H Dong, J Wang, Y Yang
arXiv preprint arXiv:2208.01682, 2022
Cited by: 41*; Year: 2022
Mirror learning: A unifying framework of policy optimisation
J Grudzien, CAS De Witt, J Foerster
International Conference on Machine Learning, 7825-7844, 2022
Cited by: 23*; Year: 2022
Understanding value decomposition algorithms in deep cooperative multi-agent reinforcement learning
Z Dou, JG Kuba, Y Yang
arXiv preprint arXiv:2202.04868, 2022
Cited by: 8; Year: 2022
Functional Graphical Models: Structure Enables Offline Data-Driven Optimization
K Grudzien, M Uehara, S Levine, P Abbeel
International Conference on Artificial Intelligence and Statistics, 2449-2457, 2024
Cited by: 3; Year: 2024
Cliqueformer: Model-Based Optimization with Structured Transformers
JG Kuba, P Abbeel, S Levine
arXiv preprint arXiv:2410.13106, 2024
Year: 2024
Advantage-Conditioned Diffusion: Offline RL via Generalization
JG Kuba, P Abbeel, S Levine
Articles 1–12