Portable GPU Acceleration of HPC Applications with Standard C++
Description
This hands-on tutorial teaches how to parallelize and optimize HPC applications for multi-core CPUs and GPUs using the portable parallelism and concurrency features of the ISO C++23 standard, without any language or vendor extensions. We further show how to integrate this approach with MPI to target large multi-node homogeneous and heterogeneous HPC systems. Attendees learn problem-solving strategies for parallelizing classic HPC patterns (multi-dimensional loops, map-reduce, scans) and for solving concurrency problems, e.g., hiding the latency of MPI communication behind computation. The tutorial gives attendees zero-setup web access to Jupyter Lab running on modern multi-GPU accelerated systems, so they can solve the hands-on exercises directly in their web browser. These exercises apply the techniques above to produce a portable, multi-node, heterogeneous, and asynchronous 2D unsteady heat-equation mini-application. Finally, we synthesize practical techniques acquired from our professional experience applying the portable ISO C++23 parallel and asynchronous programming models to port large real-world HPC applications to heterogeneous supercomputers, and we refer attendees to further learning resources.
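As a rough illustration of the map-reduce and scan patterns named in the description, the following minimal sketch uses only standard C++ parallel algorithms. It is not taken from the tutorial materials; the function names `sum_of_squares` and `prefix_sum` are hypothetical and chosen purely for this example.

```cpp
#include <execution>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: map-reduce pattern.
// Maps each element to its square, then reduces the results with addition.
double sum_of_squares(const std::vector<double>& x) {
    return std::transform_reduce(std::execution::par_unseq,
                                 x.begin(), x.end(),
                                 0.0,                            // initial value
                                 std::plus<>{},                  // reduction
                                 [](double v) { return v * v; }  // map
    );
}

// Hypothetical helper: scan pattern.
// Computes the inclusive prefix sum of the input into a new vector.
std::vector<double> prefix_sum(const std::vector<double>& x) {
    std::vector<double> out(x.size());
    std::inclusive_scan(std::execution::par_unseq,
                        x.begin(), x.end(), out.begin());
    return out;
}
```

Calling `sum_of_squares` on a vector holding 1, 2, 3 yields 14, and `prefix_sum` on 1, 2, 3, 4 yields 1, 3, 6, 10. The execution policy only permits parallel execution; whether the work actually runs on multiple cores or a GPU depends on the compiler and its flags. For instance, GCC's standard library typically relies on Intel TBB for its parallel backend, and the NVIDIA HPC SDK compiler can offload such algorithms to GPUs with `-stdpar=gpu`.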
Event Type
Time
Sunday, 12 November 2023, 8:30am - 12pm MST
Software Engineering
Registration Categories