Canzhe Zhao (赵灿哲)
Verified email at sjtu.edu.cn - Homepage
Title
Cited by
Year
Comparison-based conversational recommender system with relative bandit feedback
Z Xie, T Yu, C Zhao, S Li
Proceedings of the 44th International ACM SIGIR Conference on Research and …, 2021
Cited by 31, 2021
Knowledge-aware conversational preference elicitation with bandit feedback
C Zhao, T Yu, Z Xie, S Li
Proceedings of the ACM Web Conference 2022, 483-492, 2022
Cited by 15, 2022
Clustering of conversational bandits for user preference learning and elicitation
J Wu, C Zhao, T Yu, J Li, S Li
Proceedings of the 30th ACM International Conference on Information …, 2021
Cited by 15, 2021
Best-of-three-worlds analysis for linear bandits with follow-the-regularized-leader algorithm
F Kong, C Zhao, S Li
The Thirty Sixth Annual Conference on Learning Theory, 657-673, 2023
Cited by 8, 2023
Learning adversarial linear mixture markov decision processes with bandit feedback and unknown transition
C Zhao, R Yang, B Wang, S Li
The Eleventh International Conference on Learning Representations, 2022
Cited by 6, 2022
Simultaneously learning stochastic and adversarial bandits under the position-based model
C Chen, C Zhao, S Li
Proceedings of the AAAI Conference on Artificial Intelligence 36 (6), 6202-6210, 2022
Cited by 4, 2022
Conservative contextual combinatorial cascading bandit
K Wang
IEEE Access 9, 151434-151443, 2021
Cited by 4, 2021
Learning adversarial low-rank markov decision processes with unknown transition and full-information feedback
C Zhao, R Yang, B Wang, X Zhang, S Li
Advances in Neural Information Processing Systems 36, 2024
Cited by 2, 2024
Clustering of conversational bandits with posterior sampling for user preference learning and elicitation
Q Li, C Zhao, T Yu, J Wu, S Li
User Modeling and User-Adapted Interaction 33 (5), 1065-1112, 2023
Cited by 2, 2023
Differentially Private Temporal Difference Learning with Stochastic Nonconvex-Strongly-Concave Optimization
C Zhao, Y Ze, J Dong, B Wang, S Li
Proceedings of the Sixteenth ACM International Conference on Web Search and …, 2023
Cited by 1, 2023
Toward joint utilization of absolute and relative bandit feedback for conversational recommendation
Y Xia, Z Xie, T Yu, C Zhao, S Li
User Modeling and User-Adapted Interaction, 1-38, 2024
2024
Towards Provably Efficient Learning of Extensive-Form Games with Imperfect Information and Linear Function Approximation
C Zhao, S Chen, W Liu, H Fu, Q Fu, S Li
2023
DPMAC: differentially private communication for cooperative multi-agent reinforcement learning
C Zhao, Y Ze, J Dong, B Wang, S Li
arXiv preprint arXiv:2308.09902, 2023
2023
Learning Adversarial Low-rank MDPs with Unknown Transition and Full-information Feedback
C Zhao, R Yang, B Wang, X Zhang, S Li