RIFormer: Keep your vision backbone effective but removing token mixer J Wang, S Zhang, Y Liu, T Wu, Y Yang, X Liu, K Chen, P Luo, D Lin Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 19 | 2023 |
TencentPretrain: A scalable and flexible toolkit for pre-training models of different modalities Z Zhao, Y Li, C Hou, J Zhao, R Tian, W Liu, Y Chen, N Sun, H Liu, W Mao, ... arXiv preprint arXiv:2212.06385, 2022 | 18 | 2022 |
SynGen: A syntactic plug-and-play module for generative aspect-based sentiment analysis C Yu, T Wu, J Li, X Bai, Y Yang ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 12 | 2023 |
Modeling fine-grained information via knowledge-aware hierarchical graph for zero-shot entity retrieval T Wu, X Bai, W Guo, W Liu, S Li, Y Yang Proceedings of the Sixteenth ACM International Conference on Web Search and …, 2023 | 11 | 2023 |
Edge-free but structure-aware: Prototype-guided knowledge distillation from GNNs to MLPs T Wu, Z Zhao, J Wang, X Bai, L Wang, N Wong, Y Yang arXiv preprint arXiv:2303.13763, 2023 | 7 | 2023 |
Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models T Wu, C Tao, J Wang, Z Zhao, N Wong arXiv preprint arXiv:2404.02657, 2024 | 6 | 2024 |
Weight-inherited distillation for task-agnostic BERT compression T Wu, C Hou, S Lao, J Li, N Wong, Z Zhao, Y Yang arXiv preprint arXiv:2305.09098, 2023 | 5 | 2023 |
Prompt-based Model for Acronym Disambiguation via Negative Sampling T Wu, X Bai, Y Yang AAAI 2022 workshop SDU@2022, 2022 | 3 | 2022 |
RIFormer: Keep your vision backbone effective while removing token mixer J Wang, S Zhang, Y Liu, T Wu, Y Yang, X Liu, K Chen, P Luo, D Lin arXiv preprint arXiv:2304.05659, 2023 | 2 | 2023 |
Mixture-of-Subspaces in Low-Rank Adaptation T Wu, J Wang, Z Zhao, N Wong arXiv preprint arXiv:2406.11909, 2024 | 1 | 2024 |
Multi-stage Distillation Framework for Cross-Lingual Semantic Similarity Matching K Ding, W Liu, Y Fang, Z Zhao, Q Ju, X Yang, R Tian, Z Tao, H Liu, H Guo, ... NAACL 2022 Findings, 2022 | 1 | 2022 |
LoCa: Logit Calibration for Knowledge Distillation R Yang, T Wu, Y Yang arXiv preprint arXiv:2409.04778, 2024 | | 2024 |
Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast C Shi, C Yang, X Zhu, J Wang, T Wu, S Li, D Cai, Y Yang, Y Meng arXiv preprint arXiv:2405.14507, 2024 | | 2024 |
Adapting LLaMA Decoder to Vision Transformer J Wang, W Shao, M Chen, C Wu, Y Liu, T Wu, K Zhang, S Zhang, K Chen, ... arXiv preprint arXiv:2404.06773, 2024 | | 2024 |
Recouple Event Field via Probabilistic Bias for Event Extraction X Bai, T Wu, H Guo, Z Zhao, X Yang, J Li, W Liu, Q Ju, W Guo, Y Yang ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | | 2023 |
Overview of the NLPCC 2021 Shared Task: AutoIE2 W Guo, X Yang, X Bai, T Wu, W Liu, Z Zhao, Q Ju, Y Yang Natural Language Processing and Chinese Computing: 10th CCF International …, 2021 | | 2021 |