Title | Authors | Venue | Cited by | Year |
C-Pack: Packaged resources to advance general Chinese embedding | S Xiao, Z Liu, P Zhang, N Muennighoff | arXiv preprint arXiv:2309.07597, 2023 | 334 | 2023 |
BGE M3-Embedding: Multi-lingual, multi-functionality, multi-granularity text embeddings through self-knowledge distillation | J Chen, S Xiao, P Zhang, K Luo, D Lian, Z Liu | arXiv preprint arXiv:2402.03216, 2024 | 176 | 2024 |
GraphFormers: GNN-nested transformers for representation learning on textual graph | J Yang, Z Liu, S Xiao, C Li, D Lian, S Agrawal, A Singh, G Sun, X Xie | Advances in Neural Information Processing Systems 34, 28798-28810, 2021 | 146 | 2021 |
RetroMAE: Pre-training Retrieval-oriented Transformers via Masked Auto-Encoder | S Xiao, Z Liu, Y Shao, Z Cao | arXiv preprint arXiv:2205.12035, 2022 | 139* | 2022 |
Retrieve anything to augment large language models | P Zhang, S Xiao, Z Liu, Z Dou, JY Nie | arXiv preprint arXiv:2310.07554, 2023 | 76 | 2023 |
LECF: recommendation via learnable edge collaborative filtering | S Xiao, Y Shao, Y Li, H Yin, Y Shen, B Cui | Science China Information Sciences 65 (1), 112101, 2022 | 42 | 2022 |
Training large-scale news recommenders with pretrained language models in the loop | S Xiao, Z Liu, Y Shao, T Di, B Middha, F Wu, X Xie | Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and …, 2022 | 39 | 2022 |
Long Context Compression with Activation Beacon | P Zhang, Z Liu, S Xiao, N Shao, Q Ye, Z Dou | arXiv preprint arXiv:2401.03462, 2024 | 37 | 2024 |
Making large language models a better foundation for dense retrieval | C Li, Z Liu, S Xiao, Y Shao | arXiv preprint arXiv:2312.15503, 2023 | 26* | 2023 |
Uni-Retriever: Towards learning the unified embedding based retriever in Bing sponsored search | J Zhang, Z Liu, W Han, S Xiao, R Zheng, Y Shao, H Sun, H Zhu, ... | Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and …, 2022 | 25 | 2022 |
MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding | J Zhou, Y Shu, B Zhao, B Wu, S Xiao, X Yang, Y Xiong, B Zhang, T Huang, ... | arXiv preprint arXiv:2406.04264, 2024 | 23 | 2024 |
Distill-VQ: Learning retrieval oriented vector quantization by distilling knowledge from dense embeddings | S Xiao, Z Liu, W Han, J Zhang, D Lian, Y Gong, Q Chen, F Yang, H Sun, ... | Proceedings of the 45th International ACM SIGIR Conference on Research and …, 2022 | 23 | 2022 |
RetroMAE-2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models | Z Liu, S Xiao, Y Shao, Z Cao | Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023 | 22 | 2023 |
Matching-oriented Product Quantization For Ad-hoc Retrieval | S Xiao, Z Liu, Y Shao, D Lian, X Xie | EMNLP, 2021 | 19 | 2021 |
Progressively optimized bi-granular document representation for scalable embedding based retrieval | S Xiao, Z Liu, W Han, J Zhang, Y Shao, D Lian, C Li, H Sun, D Deng, ... | Proceedings of the ACM Web Conference 2022, 286-296, 2022 | 14 | 2022 |
LM-Cocktail: Resilient tuning of language models via model merging | S Xiao, Z Liu, P Zhang, X Xing | arXiv preprint arXiv:2311.13534, 2023 | 13 | 2023 |
OmniGen: Unified image generation | S Xiao, Y Wang, J Zhou, H Yuan, X Xing, R Yan, S Wang, T Huang, Z Liu | arXiv preprint arXiv:2409.11340, 2024 | 11 | 2024 |
MindSim: user simulator for news recommenders | X Luo, Z Liu, S Xiao, X Xie, D Li | Proceedings of the ACM Web Conference 2022, 2067-2077, 2022 | 11 | 2022 |
BGE Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models | K Luo, Z Liu, S Xiao, K Liu | arXiv preprint arXiv:2402.11573, 2024 | 10 | 2024 |
Extending Llama-3's Context Ten-Fold Overnight | P Zhang, N Shao, Z Liu, S Xiao, H Qian, Q Ye, Z Dou | arXiv preprint arXiv:2404.19553, 2024 | 5 | 2024 |