Presentation

Specialized Kernels for Optimizing GPU Offload in OpenMP
Description
Programming models for general purpose GPU (GPGPU) computing include grid and non-grid languages. Grid languages like CUDA and HIP map directly to the GPU hardware and can extract high performance from applications. However, this low-level approach makes them more difficult to use than non-grid languages such as C, C++, and Fortran with OpenMP target offload. Furthermore, grid languages often have more portability issues than non-grid languages. On the other hand, code generated from non-grid languages using automatic compiler and runtime techniques often incurs higher overhead while generating GPU kernels.

This presentation discusses compiler and runtime techniques to generate specialized, high-performance kernels for OpenMP target regions in certain common situations. We outline conditions under which specialized kernels are generated for OpenMP target regions, both with and without reduction clauses. Experimental results on AMD GPUs indicate that a large percentage of OpenMP target regions are amenable to specialization and consequent performance improvement.
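
To make the two cases mentioned above concrete, here is a minimal sketch in C of an OpenMP target region without a reduction clause and one with a reduction clause. The function names `axpy_offload` and `dot_offload` are hypothetical and chosen only for illustration. The listing does not state the exact conditions under which a kernel is specialized, so treat these loops as examples of simple target regions, not as the presenters' criteria.

```c
#include <stddef.h>

/* Illustrative only: the function names and loop bodies are assumptions,
 * not code from the presentation. */

/* Target region WITHOUT a reduction clause: y[i] = a * x[i] + y[i]. */
void axpy_offload(size_t n, double a, const double *x, double *y)
{
    /* Combined construct: the loop iterations are distributed across
     * teams and across the threads within each team. */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Target region WITH a reduction clause: returns the sum of x[i] * y[i]. */
double dot_offload(size_t n, const double *x, const double *y)
{
    double sum = 0.0;

    /* The reduction clause asks the compiler and runtime to combine the
     * per-thread partial sums into the single variable `sum`. */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n], y[0:n]) map(tofrom: sum) reduction(+: sum)
    for (size_t i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

When built without OpenMP support, the pragmas are ignored and both loops run serially on the host, so the functions can be checked for correctness before trying an offloading compiler.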
Event Type
Workshop
Time
Monday, 13 November 2023
9:40am - 10am MST
Location
507
Tags
Accelerators
Compilers
Heterogeneous Computing
Performance Optimization
Programming Frameworks and System Software
Runtime Systems
Registration Categories
W