Shizhe Chen

Cited by

	All	Since 2019
Citations	3164	2944
h-index	26	26
i10-index	48	48

Citations per year
Year	Citations
2016	22
2017	59
2018	129
2019	190
2020	221
2021	344
2022	549
2023	817
2024	799

Public access

40 articles available
10 articles not available

Based on funding mandates

Co-authors

Qin Jin, School of Information, Renmin University of China (verified email at ruc.edu.cn)
Cordelia Schmid, Research director INRIA (verified email at inria.fr)
Ivan Laptev, Professor at MBZUAI, on leave from INRIA (verified email at inria.fr)
Alex Hauptmann, Carnegie Mellon University (verified email at cs.cmu.edu)
Ruihua Song, Renmin University of China (verified email at ruc.edu.cn)

Shizhe Chen

INRIA Paris

Verified email at inria.fr

Computer Vision, Vision-and-Language


Title	Cited by	Year
Fine-grained video-text retrieval with hierarchical graph reasoning. S Chen, Y Zhao, Q Jin, Q Wu. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020	352	2020
Say as you wish: Fine-grained control of image caption generation with abstract scene graphs. S Chen, Q Jin, P Wang, Q Wu. Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020	257	2020
Speech emotion recognition with acoustic and lexical features. Q Jin, C Li, S Chen, H Wu. 2015 IEEE international conference on acoustics, speech and signal …, 2015	217	2015
History aware multimodal transformer for vision-and-language navigation. S Chen, PL Guhur, C Schmid, I Laptev. Advances in neural information processing systems 34, 5834-5847, 2021	205	2021
Multimodal multi-task learning for dimensional and continuous emotion recognition. S Chen, Q Jin, J Zhao, S Wang. Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge, 19-26, 2017	168	2017
Multi-modal dimensional emotion recognition using recurrent neural networks. S Chen, Q Jin. Proceedings of the 5th International Workshop on Audio/Visual Emotion …, 2015	145	2015
Airbert: In-domain pretraining for vision-and-language navigation. PL Guhur, M Tapaswi, S Chen, I Laptev, C Schmid. Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021	139	2021
Think global, act local: Dual-scale graph transformer for vision-and-language navigation. S Chen, PL Guhur, M Tapaswi, C Schmid, I Laptev. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022	132	2022
WenLan: Bridging vision and language by large-scale multi-modal pre-training. Y Huo, M Zhang, G Liu, H Lu, Y Gao, G Yang, J Wen, H Zhang, B Xu, ... arXiv preprint arXiv:2103.06561, 2021	132	2021
Describing videos using multi-modal fusion. Q Jin, J Chen, S Chen, Y Xiong, A Hauptmann. Proceedings of the 24th ACM international conference on Multimedia, 1087-1091, 2016	119	2016
Elaborative rehearsal for zero-shot action recognition. S Chen, D Huang. Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021	106	2021
Instruction-driven history-aware policies for robotic manipulations. PL Guhur, S Chen, RG Pinel, M Tapaswi, I Laptev, C Schmid. Conference on Robot Learning, 175-187, 2023	91	2023
Multi-modal conditional attention fusion for dimensional emotion prediction. S Chen, Q Jin. Proceedings of the 24th ACM international conference on Multimedia, 571-575, 2016	81	2016
Video captioning with guidance of multimodal latent topics. S Chen, J Chen, Q Jin, A Hauptmann. Proceedings of the 25th ACM international conference on Multimedia, 1838-1846, 2017	74	2017
Sketch, ground, and refine: Top-down dense video captioning. C Deng, S Chen, D Chen, Y He, Q Wu. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021	73	2021
Multi-modal multi-cultural dimensional continues emotion recognition in dyadic interactions. J Zhao, R Li, S Chen, Q Jin. Proceedings of the 2018 on audio/visual emotion challenge and workshop, 65-72, 2018	56	2018
Few-shot action recognition with hierarchical matching and contrastive learning. S Zheng, S Chen, Q Jin. European Conference on Computer Vision, 297-313, 2022	49	2022
Unpaired cross-lingual image caption generation with self-supervised rewards. Y Song, S Chen, Y Zhao, Q Jin. Proceedings of the 27th ACM international conference on multimedia, 784-792, 2019	45	2019
Language conditioned spatial relation reasoning for 3d object grounding. S Chen, PL Guhur, M Tapaswi, C Schmid, I Laptev. Advances in neural information processing systems 35, 20522-20535, 2022	44	2022
Generating Video Descriptions With Latent Topic Guidance. S Chen, Q Jin, J Chen, A Hauptmann. IEEE Transactions on Multimedia 21 (9), 2407-2418, 2019	43	2019

Articles 1–20
