Stochastic variance-reduced policy gradient M Papini, D Binaghi, G Canonaco, M Pirotta, M Restelli International Conference on Machine Learning, 4026-4035, 2018 | 196 | 2018 |
Exploration-exploitation in constrained MDPs Y Efroni, S Mannor, M Pirotta arXiv preprint arXiv:2003.02189, 2020 | 169 | 2020 |
Frequentist regret bounds for randomized least-squares value iteration A Zanette, D Brandfonbrener, E Brunskill, M Pirotta, A Lazaric International Conference on Artificial Intelligence and Statistics, 1954-1964, 2020 | 148 | 2020 |
Safe policy iteration M Pirotta, M Restelli, A Pecorino, D Calandriello International Conference on Machine Learning, 307-315, 2013 | 131 | 2013 |
Efficient bias-span-constrained exploration-exploitation in reinforcement learning R Fruit, M Pirotta, A Lazaric, R Ortner International Conference on Machine Learning, 1578-1586, 2018 | 117 | 2018 |
Policy gradient in Lipschitz Markov decision processes M Pirotta, M Restelli, L Bascetta Machine Learning 100, 255-283, 2015 | 104 | 2015 |
Adaptive step-size for policy gradient methods M Pirotta, M Restelli, L Bascetta Advances in Neural Information Processing Systems 26, 2013 | 91 | 2013 |
Policy gradient approaches for multi-objective sequential decision making S Parisi, M Pirotta, N Smacchia, L Bascetta, M Restelli 2014 International Joint Conference on Neural Networks (IJCNN), 2323-2330, 2014 | 80 | 2014 |
Multi-objective reinforcement learning with continuous Pareto frontier approximation M Pirotta, S Parisi, M Restelli Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015 | 79 | 2015 |
Adversarial attacks on linear contextual bandits E Garcelon, B Roziere, L Meunier, J Tarbouriech, O Teytaud, A Lazaric, ... Advances in Neural Information Processing Systems 33, 14362-14373, 2020 | 61 | 2020 |
Importance weighted transfer of samples in reinforcement learning A Tirinzoni, A Sessa, M Pirotta, M Restelli International Conference on Machine Learning, 4936-4945, 2018 | 61 | 2018 |
Multi-objective reinforcement learning through continuous Pareto manifold approximation S Parisi, M Pirotta, M Restelli Journal of Artificial Intelligence Research 57, 187-227, 2016 | 61 | 2016 |
Inverse reinforcement learning through policy gradient minimization M Pirotta, M Restelli Proceedings of the AAAI Conference on Artificial Intelligence 30 (1), 2016 | 59 | 2016 |
Manifold-based multi-objective policy search with sample reuse S Parisi, M Pirotta, J Peters Neurocomputing 263, 3-14, 2017 | 55 | 2017 |
Near optimal exploration-exploitation in non-communicating Markov decision processes R Fruit, M Pirotta, A Lazaric Advances in Neural Information Processing Systems 31, 2018 | 49 | 2018 |
Regret bounds for kernel-based reinforcement learning OD Domingues, P Ménard, M Pirotta, E Kaufmann, M Valko International Conference on Machine Learning, 2020 | 48* | 2020 |
Boosted fitted Q-iteration S Tosatto, M Pirotta, C D’Eramo, M Restelli International Conference on Machine Learning, 3434-3443, 2017 | 48 | 2017 |
An asymptotically optimal primal-dual incremental algorithm for contextual linear bandits A Tirinzoni, M Pirotta, M Restelli, A Lazaric Advances in Neural Information Processing Systems 33, 1417-1427, 2020 | 47 | 2020 |
Adaptive batch size for safe policy gradients M Papini, M Pirotta, M Restelli Advances in Neural Information Processing Systems 30, 2017 | 47 | 2017 |
Exploration bonus for regret minimization in discrete and continuous average reward MDPs J Qian, R Fruit, M Pirotta, A Lazaric Advances in Neural Information Processing Systems 32, 2019 | 44* | 2019 |