Xuechao Wei
Verified email at pku.edu.cn
Title
Cited by
Year
Automated systolic array architecture synthesis for high throughput CNN inference on FPGAs
X Wei, CH Yu, P Zhang, Y Chen, Y Wang, H Hu, Y Liang, J Cong
Proceedings of the 54th Annual Design Automation Conference 2017, 1-6, 2017
445 2017
Overcoming data transfer bottlenecks in FPGA-based DNN accelerators via layer conscious memory management
X Wei, Y Liang, J Cong
Proceedings of the 56th Annual Design Automation Conference 2019, 1-6, 2019
74 2019
TGPA: Tile-grained pipeline architecture for low latency CNN inference
X Wei, Y Liang, X Li, CH Yu, P Zhang, J Cong
2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 1-8, 2018
72 2018
Frequency improvement of systolic array-based CNNs on FPGAs
J Zhang, W Zhang, G Luo, X Wei, Y Liang, J Cong
2019 IEEE International Symposium on Circuits and Systems (ISCAS), 1-4, 2019
38 2019
Throughput optimization for streaming applications on CPU-FPGA heterogeneous systems
X Wei, Y Liang, T Wang, S Lu, J Cong
2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC), 488-493, 2017
31 2017
Generating systolic array accelerators with reusable blocks
L Jia, L Lu, X Wei, Y Liang
IEEE Micro 40 (4), 85-92, 2020
20 2020
PetS: A Unified Framework for Parameter-Efficient Transformers Serving
Z Zhou, X Wei, J Zhang, G Sun
2022 USENIX Annual Technical Conference (USENIX ATC 22), 489-504, 2022
19 2022
FlexBFS: a parallelism-aware implementation of breadth-first search on GPU
G Liu, H An, W Han, X Li, T Sun, W Zhou, X Wei, X Tang
Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of …, 2012
17 2012
FTDL: a tailored FPGA-overlay for deep learning with high scalability
R Shi, Y Ding, X Wei, H Li, H Liu, HKH So, C Ding
2020 57th ACM/IEEE Design Automation Conference (DAC), 1-6, 2020
11 2020
GCNear: A hybrid architecture for efficient GCN training with near-memory processing
Z Zhou, C Li, X Wei, G Sun
arXiv preprint arXiv:2111.00680, 1-15, 2021
10 2021
GNNear: Accelerating full-batch training of graph neural networks with near-memory processing
Z Zhou, C Li, X Wei, X Wang, G Sun
Proceedings of the International Conference on Parallel Architectures and …, 2022
8 2022
FTDL: An FPGA-tailored Architecture for Deep Learning Systems.
R Shi, Y Ding, X Wei, H Liu, HKH So, C Ding
FPGA, 320, 2020
5 2020
Distributed Control Independence for Composable Multi-processors
M Mao, H An, T Sun, Q Li, B Deng, X Wei, J Zhou
2012 IEEE/ACIS 11th International Conference on Computer and Information …, 2012
3 2012
ArchExplorer: Microarchitecture exploration via bottleneck analysis
C Bai, J Huang, X Wei, Y Ma, S Li, H Zheng, B Yu, Y Xie
Proceedings of the 56th Annual IEEE/ACM International Symposium on …, 2023
2 2023
Efficient Super-Resolution System with Block-wise Hybridization and Quantized Winograd on FPGA
B Shi, J Zhang, Z He, X Wei, S Li, G Luo, H Zheng, Y Xie
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2023
2 2023
2022 ICCAD CAD Contest Problem C: Microarchitecture Design Space Exploration
S Li, C Bai, X Wei, B Shi, YK Chen, Y Xie
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided …, 2022
2 2022
An Intermediate-Centric Dataflow for Transposed Convolution Acceleration on FPGA
Z Ma, T Dai, X Wei, G Luo
ACM Transactions on Embedded Computing Systems 22 (6), 1-22, 2023
1 2023
ICCAD CAD Contest 2022
S Li, C Bai, X Wei, B Shi, YK Chen, Y Xie
1 2022
Batch computing system and associated method
X Wei, Z Zhou, J Zhang, S Li, YK Chen, B Shi
US Patent App. 18/323,086, 2024
2024
Klotski: DNN Model Orchestration Framework for Dataflow Architecture Accelerators
C Bai, X Wei, Y Zhuo, Y Cai, H Zheng, B Yu, Y Xie
2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 1-9, 2023
2023