BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240116T191658Z
LOCATION:710
DTSTART;TZID=America/Denver:20231112T111800
DTEND;TZID=America/Denver:20231112T114200
UID:submissions.supercomputing.org_SC23_sess419_ws_prot106@linklings.com
SUMMARY:GPUscout: Locating Data Movement-Related Bottlenecks on GPUs
DESCRIPTION:Workshop\n\nSoumya Sen, Stepan Vanecek, and Martin Schulz (Tec
 hnical University of Munich)\n\nGPUs pose an attractive opportunity for de
 livering high-performance applications. However, GPU codes are often limit
 ed due to memory contention, resulting in overall performance degradation.
  Since GPU scheduling is transparent to the user, and GPU memory architect
 ures are very complex compared to ones on CPUs, finding such bottlenecks i
 s a very cumbersome process.\n\nIn this paper, we present GPUscout, a no
 vel method of systematically detecting the root cause of frequent memo
 ry performance bottlenecks on NVIDIA GPUs. It connects three approach
 es to analyzing performance: static CUDA SASS code analysis, sampling wa
 rp stalls, and kernel performance metrics. Connecting these approaches, GP
 Uscout can identify the problem, locate the code segment where it originat
 es, and assess its importance.\n\nThis paper illustrates the capabilities 
 and the design of our implementation of GPUscout. We show its applicabilit
 y based on three commonly used kernels, yielding promising results in term
 s of accuracy, efficiency, and usability.\n\nTag: Performance Measurement,
  Modeling, and Tools, Programming Frameworks and System Software\n\nRegist
 ration Category: Workshop Reg Pass\n\nSession Chairs: David Boehme (Lawren
 ce Livermore National Laboratory (LLNL)); Anthony Danalis (University of T
 ennessee); and Josef Weidendorfer (Leibniz Supercomputing Centre, Technica
 l University of Munich)
END:VEVENT
END:VCALENDAR
