Multi-agent actor-critic for mixed cooperative-competitive environments R Lowe, Y Wu, A Tamar, J Harb, P Abbeel, I Mordatch Advances in neural information processing systems 30, 2017 | 3325 | 2017 |
The option-critic architecture PL Bacon, J Harb, D Precup Proceedings of the AAAI conference on artificial intelligence 31 (1), 2017 | 981 | 2017 |
When waiting is not an option: Learning options with a deliberation cost J Harb, PL Bacon, M Klissarov, D Precup Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018 | 119 | 2018 |
Investigating recurrence and eligibility traces in deep Q-networks J Harb, D Precup arXiv preprint arXiv:1704.05495, 2017 | 114 | 2017 |
Learnings options end-to-end for continuous action tasks M Klissarov, PL Bacon, J Harb, D Precup arXiv preprint arXiv:1712.00004, 2017 | 45 | 2017 |
Policy evaluation networks J Harb, T Schaul, D Precup, PL Bacon arXiv preprint arXiv:2002.11833, 2020 | 27 | 2020 |
The Barbados 2018 list of open issues in continual learning T Schaul, H van Hasselt, J Modayil, M White, A White, PL Bacon, J Harb, ... arXiv preprint arXiv:1811.07004, 2018 | 11 | 2018 |
General policy evaluation and improvement by learning to identify few but crucial states F Faccio, A Ramesh, V Herrmann, J Harb, J Schmidhuber arXiv preprint arXiv:2207.01566, 2022 | 5 | 2022 |
Learning options in deep reinforcement learning J Merheb-Harb McGill University (Canada), 2017 | 1 | 2017 |
Asynchronous Advantage Option-Critic with Deliberation Cost J Harb, PL Bacon, D Precup RLDM, 2017 | | 2017 |