COIN: A large-scale dataset for comprehensive instructional video analysis Y Tang, D Ding, Y Rao, Y Zheng, D Zhang, L Zhao, J Lu, J Zhou Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019 | 362 | 2019 |
Uncertainty-aware score distribution learning for action quality assessment Y Tang, Z Ni, J Zhou, D Zhang, J Lu, Y Wu, J Zhou Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020 | 168 | 2020 |
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments T Xie, D Zhang, J Chen, X Li, S Zhao, R Cao, TJ Hua, Z Cheng, D Shin, ... arXiv preprint arXiv:2404.07972, 2024 | 112 | 2024 |
WebSRC: A dataset for web-based structural reading comprehension X Chen, Z Zhao, L Chen, D Zhang, J Ji, A Luo, Y Xiong, K Yu arXiv preprint arXiv:2101.09465, 2021 | 84 | 2021 |
Large language models are semi-parametric reinforcement learning agents D Zhang, L Chen, S Zhang, H Xu, Z Zhao, K Yu Advances in Neural Information Processing Systems 36, 78227-78239, 2023 | 45 | 2023 |
Rotation-robust intersection over union for 3d object detection Y Zheng, D Zhang, S Xie, J Lu, J Zhou European Conference on Computer Vision, 464-480, 2020 | 41 | 2020 |
Benchmarking multimodal agents for open-ended tasks in real computer environments T Xie, D Zhang, J Chen, X Li, S Zhao, R Cao, TJ Hua, Z Cheng, D Shin, ... URL http://arxiv.org/abs/2404.07972, 2024 | 33 | 2024 |
Learning from temporal spatial cubism for cross-dataset skeleton-based action recognition Y Tang, X Liu, X Yu, D Zhang, J Lu, J Zhou ACM Transactions on Multimedia Computing, Communications, and Applications …, 2022 | 22 | 2022 |
Spider2-v: How far are multimodal agents from automating data science and engineering workflows? R Cao, F Lei, H Wu, J Chen, Y Fu, H Gao, X Xiong, H Zhang, W Hu, Y Mao, ... Advances in Neural Information Processing Systems 37, 107703-107744, 2024 | 16 | 2024 |
Mobile-Env: A Universal Platform for Training and Evaluation of Mobile Interaction D Zhang, L Chen, K Yu arXiv preprint arXiv:2305.08144, 2023 | 13 | 2023 |
Mobile-env: an evaluation platform and benchmark for LLM-GUI interaction D Zhang, H Xu, Z Zhao, L Chen, R Cao, K Yu arXiv e-prints, arXiv:2305.08144, 2023 | 3 | 2023 |
Mobile-Env: Building Qualified Evaluation Benchmarks for LLM-GUI Interaction D Zhang, Z Shen, R Xie, S Zhang, T Xie, Z Zhao, S Chen, L Chen, H Xu, ... arXiv preprint arXiv:2305.08144, 2023 | 1 | 2023 |