Yi Yang
Verified email at nec-labs.com
Title | Cited by | Year
A GPGPU compiler for memory optimization and parallelism management
Y Yang, P Xiang, J Kong, H Zhou
ACM SIGPLAN Notices 45 (6), 86-97, 2010
Cited by: 331 | Year: 2010
CPU-Assisted GPGPU on Fused CPU-GPU Architectures
Y Yang, P Xiang, M Mantor, H Zhou
Cited by: 102 | Year: 2012
CUDA-NP: realizing nested thread-level parallelism in GPGPU applications
Y Yang, H Zhou
ACM SIGPLAN Notices 49 (8), 93-106, 2014
Cited by: 62 | Year: 2014
Shared Memory Multiplexing: A Novel Way to Improve GPGPU Throughput
Y Yang, P Xiang, M Mantor, N Rubin, H Zhou
Proceedings of the 21st international conference on Parallel architectures …, 2012
Cited by: 59* | Year: 2012
Warp-level divergence in GPUs: Characterization, impact, and mitigation
P Xiang, Y Yang, H Zhou
2014 IEEE 20th International Symposium on High Performance Computer …, 2014
Cited by: 55 | Year: 2014
Accelerating MATLAB image processing toolbox functions on GPUs
J Kong, M Dimitrov, Y Yang, J Liyanage, L Cao, J Staples, M Mantor, ...
Proceedings of the 3rd Workshop on General-Purpose Computation on Graphics …, 2010
Cited by: 49 | Year: 2010
Locality principle revisited: A probability-based quantitative approach
S Gupta, P Xiang, Y Yang, H Zhou
Journal of Parallel and Distributed Computing, 2013
Cited by: 46 | Year: 2013
Locality Principle Revisited: A Probability-Based Quantitative Approach
S Gupta, P Xiang, Y Yang, H Zhou
IEEE International Parallel & Distributed Processing Symposium, 995 - 1009, 2012
Cited by: 46 | Year: 2012
Optimizing memory efficiency for deep convolutional neural networks on GPUs
C Li, Y Yang, M Feng, S Chakradhar, H Zhou
SC'16: Proceedings of the International Conference for High Performance …, 2016
Cited by: 44 | Year: 2016
A unified optimizing compiler framework for different GPGPU architectures
Y Yang, P Xiang, J Kong, M Mantor, H Zhou
ACM Transactions on Architecture and Code Optimization (TACO) 9 (2), 9, 2012
Cited by: 37 | Year: 2012
Accelerating deep neural network training with inconsistent stochastic gradient descent
L Wang, Y Yang, R Min, S Chakradhar
Neural Networks 93, 219-229, 2017
Cited by: 31 | Year: 2017
Understanding the tradeoffs between software-managed vs. hardware-managed caches in GPUs
C Li, Y Yang, H Dai, S Yan, F Mueller, H Zhou
2014 IEEE International Symposium on Performance Analysis of Systems and …, 2014
Cited by: 30 | Year: 2014
Automatic data placement into GPU on-chip memory resources
C Li, Y Yang, Z Lin, H Zhou
Proceedings of the 13th Annual IEEE/ACM International Symposium on Code …, 2015
Cited by: 29 | Year: 2015
Exploiting uniform vector instructions for GPGPU performance, energy efficiency, and opportunistic reliability enhancement
P Xiang, Y Yang, M Mantor, N Rubin, LR Hsu, H Zhou
ICS, 433-442, 2013
Cited by: 27 | Year: 2013
Fixing Performance Bugs: An Empirical Study of Open-Source GPGPU Programs
Y Yang, P Xiang, M Mantor, H Zhou
International Conference on Parallel Processing, 2012
Cited by: 18 | Year: 2012
Apricot: an optimizing compiler and productivity tool for x86-compatible many-core coprocessors
N Ravi, Y Yang, T Bao, S Chakradhar
Proceedings of the 26th ACM international conference on Supercomputing, 47-58, 2012
Cited by: 17 | Year: 2012
An optimizing compiler for GPGPU programs with input-data sharing
Y Yang, P Xiang, J Kong, H Zhou
ACM SIGPLAN Notices 45 (5), 343-344, 2010
Cited by: 17 | Year: 2010
A case for a flexible scalar unit in SIMT architecture
Y Yang, P Xiang, M Mantor, N Rubin, L Hsu, Q Dong, H Zhou
2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014
Cited by: 13 | Year: 2014
CUDA-NP: Realizing nested thread-level parallelism in GPGPU applications
Y Yang, C Li, H Zhou
Journal of Computer Science and Technology 30 (1), 3-19, 2015
Cited by: 10 | Year: 2015
COMP: Compiler optimizations for manycore processors
L Song, M Feng, N Ravi, Y Yang, S Chakradhar
Proceedings of the 47th Annual IEEE/ACM International Symposium on …, 2014
Cited by: 10 | Year: 2014