Sauté RL: Almost surely safe reinforcement learning using state augmentation A Sootla, AI Cowen-Rivers, T Jafferjee, Z Wang, DH Mguni, J Wang, ... ICML 2022, 2022 | 68 | 2022 |
Multi-agent constrained policy optimisation S Gu, JG Kuba, M Wen, R Chen, Z Wang, Z Tian, J Wang, A Knoll, Y Yang arXiv preprint arXiv:2110.02793, 2021 | 54 | 2021 |
ChessGPT: Bridging Policy Learning and Language Modeling X Feng, Y Luo, Z Wang, H Tang, M Yang, K Shao, D Mguni, Y Du, J Wang NeurIPS 2023, 2023 | 32 | 2023 |
Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach Y Zhang, Y Du, B Huang, Z Wang, J Wang, M Fang, M Pechenizkiy NeurIPS 2023, 2023 | 12 | 2023 |
DESTA: A framework for safe reinforcement learning with Markov games of intervention D Mguni, U Islam, Y Sun, X Zhang, J Jennings, A Sootla, C Yu, Z Wang, ... arXiv preprint arXiv:2110.14468, 2021 | 5 | 2021 |
Safe Reinforcement Learning with Free-form Natural Language Constraints and Pre-Trained Language Models X Lou, J Zhang, Z Wang, K Huang, Y Du AAMAS 2024, 2024 | 3 | 2024 |
Natural language reinforcement learning X Feng, Z Wan, M Yang, Z Wang, GA Koushik, Y Du, Y Wen, J Wang arXiv preprint arXiv:2411.14251, 2024 | 2 | 2024 |
Safe Multi-agent Reinforcement Learning with Natural Language Constraints Z Wang, M Fang, T Tomilin, F Fang, Y Du ICLR 2024 GenAI4DM Workshop, 2024 | 2 | 2024 |
Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf X Jin*, Z Wang*, Y Du, M Fang, H Zhang, J Wang NeurIPS 2024, 2024 | 2 | 2024 |
MACCA: Offline Multi-agent Reinforcement Learning with Causal Credit Assignment Z Wang, Y Du, Y Zhang, M Fang, B Huang NeurIPS 2024 CRL Workshop, 2023 | 2 | 2023 |
Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting XH Chen*, Z Wang*, Y Du, S Jiang, M Fang, Y Yu, J Wang NeurIPS 2024 (Oral), 2024 | 1 | 2024 |