BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240116T191657Z
LOCATION:505
DTSTART;TZID=America/Denver:20231112T162000
DTEND;TZID=America/Denver:20231112T165000
UID:submissions.supercomputing.org_SC23_sess428_misc262@linklings.com
SUMMARY:Enabling Large Dynamic Neural Network Training with Learning-Based
  Runtime Memory Management
DESCRIPTION:Workshop\n\nDong Li (University of California, Merced)\n\n
 Dynamic neural networks (DyNNs) offer high computational efficiency and
  strong representation capability. However, training a DyNN can run into
  memory capacity problems because of increasing model sizes or limited
  GPU memory. Managing tensors to save GPU memory is challenging because
  of the dynamic structure of a DyNN. We introduce DyNN-Offload, a
  memory-management runtime system for training DyNNs. DyNN-Offload uses
  a learned approach (a neural network called the pilot model) to increase
  the predictability of tensor accesses and facilitate memory management.
  The key to DyNN-Offload is enabling fast inference with the pilot model
  in order to reduce its performance overhead while providing high
  inference (prediction) accuracy. DyNN-Offload reduces the input feature
  space and model complexity of the pilot model based on a new
  representation of the DyNN. DyNN-Offload enables 8× larger DyNN training
  on a single GPU compared with using PyTorch alone, which no existing
  solution achieves. Evaluating with AlphaFold (a production-level,
  large-scale DyNN), we show that DyNN-Offload outperforms unified virtual
  memory (UVM) and dynamic tensor rematerialization (DTR), the most
  advanced solutions for saving GPU memory for DyNNs, by 3× and 2.1×
  respectively in terms of maximum batch size.\n\nTag: Distributed
  Computing, Middleware and System Software, Runtime Systems\n\n
 Registration Category: Workshop Reg Pass\n\nSession Chairs: Barbara
  Chapman (Hewlett Packard Enterprise (HPE), Stony Brook University);
  Joseph Manzano (Pacific Northwest National Laboratory (PNNL)); Shirley
  Moore (University of Texas, El Paso); EunJung (EJ) Park (Qualcomm Inc);
  and Joshua Suetterlein (Pacific Northwest National Laboratory (PNNL))
END:VEVENT
END:VCALENDAR
