Csaba Szepesvari

Citations

	All	Since 2019
Citations	35860	24189
h-index	81	72
i10-index	247	194

Citations per year

Year	Citations
2003	114
2004	97
2005	131
2006	96
2007	216
2008	325
2009	381
2010	532
2011	770
2012	838
2013	928
2014	1115
2015	1149
2016	1365
2017	1305
2018	1746
2019	2434
2020	3391
2021	4276
2022	4677
2023	4884
2024	4396

Public access

75 articles available
0 articles not available
Based on funding mandates

Co-authors

Name	Affiliation	Verified email domain
Tor Lattimore	DeepMind	google.com
Yasin Abbasi Yadkori	Google DeepMind	google.com
Rémi Munos	Google DeepMind	inria.fr
Branislav Kveton	Adobe Research	adobe.com
Dale Schuurmans	University of Alberta, Google DeepMind	cs.ualberta.ca
Kocsis Levente	MTA SZTAKI	sztaki.hu
Richard S. Sutton	Keen, Amii, and University of Alberta	richsutton.com
Dávid Pál	Staff Machine Learning Engineer, Instacart	instacart.com
Mohammad Ghavamzadeh	Amazon	amazon.com
András Antos	Budapest University of Technology and Economics	cs.bme.hu
Amir-massoud Farahmand	University of Toronto	cs.toronto.edu
Zheng Wen	Google DeepMind	google.com
Shalabh Bhatnagar	Professor in the Department of Computer Science and Automation, Indian Institute of Science	iisc.ac.in
Lorincz, Andras	Eotvos Lorand University	inf.elte.hu
Hamid Maei	Netflix	netflix.com
Mengdi Wang	Center for Statistics & Machine Learning, ECE, Princeton University	princeton.edu
Nevena Lazic	DeepMind	google.com
Michael Littman	Brown University	brown.edu
Jincheng Mei	Research Scientist, Google DeepMind	google.com
Doina Precup	DeepMind and McGill University	cs.mcgill.ca


Csaba Szepesvari

DeepMind & University of Alberta

Verified email at cs.ualberta.ca

Research interests: machine learning, learning theory, online learning, reinforcement learning, Markov Decision Processes


Title	Authors	Venue	Cited by	Year
Bandit based Monte-Carlo planning	L Kocsis, C Szepesvári	European conference on machine learning, 282-293, 2006	4438	2006
Bandit algorithms	T Lattimore, C Szepesvári	Cambridge University Press, 2020	3066	2020
Algorithms for Reinforcement Learning	C Szepesvari	Morgan and Claypool, 2010	2177*	2010
Improved algorithms for linear stochastic bandits	Y Abbasi-Yadkori, C Szepesvári, D Pál	Advances in Neural Information Processing Systems, 2312-2320, 2011	2068	2011
Convergence results for single-step on-policy reinforcement-learning algorithms	S Singh, T Jaakkola, ML Littman, C Szepesvári	Machine learning 38, 287-308, 2000	1032	2000
Exploration–exploitation tradeoff using variance estimates in multi-armed bandits	JY Audibert, R Munos, C Szepesvári	Theoretical Computer Science 410 (19), 1876-1902, 2009	792	2009
Fast gradient-descent methods for temporal-difference learning with linear function approximation	RS Sutton, HR Maei, D Precup, S Bhatnagar, D Silver, C Szepesvári, ...	Proceedings of the 26th annual international conference on machine learning …, 2009	728	2009
Finite-Time Bounds for Fitted Value Iteration.	R Munos, C Szepesvári	Journal of Machine Learning Research 9 (5), 2008	648	2008
Parametric bandits: The generalized linear case	S Filippi, O Cappe, A Garivier, C Szepesvári	Advances in neural information processing systems 23, 2010	556	2010
X-Armed Bandits.	S Bubeck, R Munos, G Stoltz, C Szepesvári	Journal of Machine Learning Research 12 (5), 2011	511	2011
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path	A Antos, C Szepesvári, R Munos	Machine Learning 71, 89-129, 2008	506	2008
Learning with a strong adversary	R Huang, B Xu, D Schuurmans, C Szepesvári	arXiv preprint arXiv:1511.03034, 2015	454	2015
Regret bounds for the adaptive control of linear quadratic systems	Y Abbasi-Yadkori, C Szepesvári	Proceedings of the 24th Annual Conference on Learning Theory, 1-26, 2011	443	2011
Convergent temporal-difference learning with arbitrary smooth function approximation	H Maei, C Szepesvari, S Bhatnagar, D Precup, D Silver, RS Sutton	Advances in neural information processing systems 22, 2009	353	2009
A generalized reinforcement-learning model: Convergence and applications	ML Littman, C Szepesvári	ICML 96, 310-318, 1996	353	1996
Toward off-policy learning control with function approximation.	HR Maei, C Szepesvári, S Bhatnagar, RS Sutton	ICML 10, 719-726, 2010	343	2010
Tight regret bounds for stochastic combinatorial semi-bandits	B Kveton, Z Wen, A Ashkan, C Szepesvari	Artificial Intelligence and Statistics, 535-543, 2015	334	2015
Model-based reinforcement learning with value-targeted regression	A Ayoub, Z Jia, C Szepesvari, M Wang, L Yang	International Conference on Machine Learning, 463-474, 2020	331	2020
Online learning under delayed feedback	P Joulani, A Gyorgy, C Szepesvári	International conference on machine learning, 1453-1461, 2013	328	2013
The grand challenge of computer Go: Monte Carlo tree search and extensions	S Gelly, L Kocsis, M Schoenauer, M Sebag, D Silver, C Szepesvári, ...	Communications of the ACM 55 (3), 106-113, 2012	324	2012
