BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240116T191658Z
LOCATION:710
DTSTART;TZID=America/Denver:20231113T111000
DTEND;TZID=America/Denver:20231113T113000
UID:submissions.supercomputing.org_SC23_sess445_ws_scalah101@linklings.com
SUMMARY:GPU-Based LU Factorization and Solve on Batches of Matrices with B
 and Structure
DESCRIPTION:Workshop\n\nAhmad Abdelfattah, Stanimire Tomov, Piotr Luszczek
 , Hartwig Anzt, and Jack Dongarra (University of Tennessee)\n\nThis paper 
 presents a portable and performance-efficient approach to solve a batch of
  linear systems of equations using Graphics Processing Units (GPUs). Each 
 system is represented using a special type of matrices with a band structu
 re above and/or below the diagonal. Each matrix is factorized using an LU 
 factorization with partial pivoting for numerical stability. Subsequently,
  the factors are used to find the solution for as many right-hand sides as
  needed. The width of the band is often small enough that performing a ful
 ly dense LU factorization results in poor performance. We follow the stand
 ard LAPACK specifications for addressing this type of problem and develop
  a dedicated solver that runs efficiently on GPUs. No similar solver is cu
 rrently available in the vendor's software stack, so performance results a
 re shown on both NVIDIA and AMD GPUs relative to a parallel CPU solution u
 tilizing OpenMP for thread-level parallelization.\n\nTag: Algorithms, Hete
 rogeneous Computing, Large Scale Systems\n\nRegistration Category: Worksho
 p Reg Pass\n\nSession Chairs: Vassil Alexandrov (Hartree Centre); Jack Don
 garra (University of Tennessee, Oak Ridge National Laboratory (ORNL)); Chr
 istian Engelmann (Oak Ridge National Laboratory (ORNL)); Al Geist (Oak Rid
 ge National Laboratory (ORNL)); and Dieter A. Kranzlmueller (Ludwig-Maximi
 lians-Universität München, Leibniz Supercomputing Centre)
END:VEVENT
END:VCALENDAR
