Shang Yang
Title
Cited by
Year
Awq: Activation-aware weight quantization for llm compression and acceleration
J Lin, J Tang, H Tang, S Yang, X Dang, S Han
arXiv preprint arXiv:2306.00978, 2023
Cited by 154, 2023
Flatformer: Flattened window attention for efficient point cloud transformer
Z Liu, X Yang, H Tang, S Yang, S Han
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by 35, 2023
Torchsparse++: Efficient training and inference framework for sparse convolution on gpus
H Tang, S Yang, Z Liu, K Hong, Z Yu, X Li, G Dai, Y Wang, S Han
Proceedings of the 56th Annual IEEE/ACM International Symposium on …, 2023
Cited by 11*, 2023
Heuristic adaptability to input dynamics for spmm on gpus
G Dai, G Huang, S Yang, Z Yu, H Zhang, Y Ding, Y Xie, H Yang, Y Wang
Proceedings of the 59th ACM/IEEE Design Automation Conference, 595-600, 2022
Cited by 10, 2022
Hypergef: A framework enabling efficient fusion for hypergraph neural network on gpus
Z Yu, G Dai, S Yang, G Zhang, H Zhang, F Zhu, J Yang, J Zhao, Y Wang
Proceedings of Machine Learning and Systems 5, 2023
Cited by 3, 2023
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Y Lin, H Tang, S Yang, Z Zhang, G Xiao, C Gan, S Han
arXiv preprint arXiv:2405.04532, 2024
2024
Sparse Refinement for Efficient High-Resolution Semantic Segmentation
Z Liu, Z Zhang, S Yang, H Tang, C Xu, K Keutzer, S Han
2023
CLAP: Locality Aware and Parallel Triangle Counting with Content Addressable Memory
T Fu, C Wei, Z Zhu, S Yang, Z Yu, G Dai, H Yang, Y Wang
2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), 1-6, 2023
2023
Articles 1–8