BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240116T191657Z
LOCATION:402
DTSTART;TZID=America/Denver:20231113T083000
DTEND;TZID=America/Denver:20231113T170000
UID:submissions.supercomputing.org_SC23_sess246_tut117@linklings.com
SUMMARY:Hands-On Practical Hybrid Parallel Application Performance Enginee
 ring
DESCRIPTION:Tutorial\n\nMarkus Geimer (Forschungszentrum Jülich, Jülich Su
 percomputing Centre (JSC)); Sameer Shende (University of Oregon; ParaTools
 , Inc.); Bert Wesarg (GWT-TUD GmbH; Department for Information Services an
 d High Performance Computing (ZIH), Center for Interdisciplinary Digital S
 ciences (CIDS), Technische Universität Dresden); and Brian Wylie (Forschun
 gszentrum Jülich, Jülich Supercomputing Centre (JSC))\n\nThis tutorial pre
 sents state-of-the-art performance tools for leading-edge HPC systems foun
 ded on the community-developed Score-P instrumentation and measurement inf
 rastructure, demonstrating how they can be used for performance engineerin
 g of effective scientific applications based on standard MPI, OpenMP, a hy
 brid combination of both, and the increasingly common use of accelerators.
  Par
 allel performance tools from the Virtual Institute – High Productivity Sup
 ercomputing (VI-HPS) are introduced and featured in hands-on exercises wit
 h Score-P, Scalasca, Vampir, and TAU. We present the complete workflow of 
 performance engineering, including instrumentation, measurement (profiling
  and tracing, timing and PAPI hardware counters), data storage, analysis, 
 tuning, and visualization. Emphasis is placed on how tools are used in com
 bination for identifying performance problems and investigating optimizati
 on alternatives. Using their own notebook computers, participants will con
 duct exercises on a contemporary HPC system where remote access will be pr
 ovided for the hands-on sessions through AWS running an E4S [http://e4s.io
 ] image containing all of the necessary tools. This image supports NVIDIA 
 GPUs using CUDA 12 and Python. This will help to prepare participants to l
 ocate and diagnose performance bottlenecks in their own parallel programs.
 \n\nTag: Accelerators, Applications, Heterogeneous Computing, Performance 
 Optimization\n\nRegistration Category: Tutorial Reg Pass
END:VEVENT
END:VCALENDAR
