BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240116T191702Z
LOCATION:501-502
DTSTART;TZID=America/Denver:20231112T110000
DTEND;TZID=America/Denver:20231112T111500
UID:submissions.supercomputing.org_SC23_sess416_ws_hppss101@linklings.com
SUMMARY:Maximizing Data Utility for HPC Python Workflow Execution
DESCRIPTION:Workshop\n\nThanh Son Phung (University of Notre Dame), Ben Cl
 ifford (CQX Limited), Kyle Chard (University of Chicago), and Douglas Thai
 n (University of Notre Dame)\n\nLarge-scale HPC workflows are increasingly
  implemented in dynamic languages such as Python, which allow for more rap
 id development than traditional techniques. However, the cost of executing
  Python applications at scale is often dominated by the distribution of co
 mmon datasets and complex software dependencies. As the application scales
  up, data distribution becomes a limiting factor that prevents scaling bey
 ond a few hundred nodes. To address this problem, we present the integrati
 on of Parsl (a Python-native parallel programming library) with TaskVine (
 a data-intensive workflow execution engine). Instead of relying on a share
 d filesystem to provide data to tasks on demand, Parsl is able to express 
 advance data needs to TaskVine, which then performs efficient data distrib
 ution at runtime. This combination provides a performance speedup of 1.48x
  over the typical method of on-demand paging from the shared filesystem, w
 hile also providing an average task speedup of 1.79x with 2048 tasks and 2
 56 nodes.\n\nTag: Applications, Distributed Computing, Large Scale Systems
 , Programming Frameworks and System Software, Runtime Systems\n\nRegistrat
 ion Category: Workshop Reg Pass\n\nSession Chairs: Sam Foreman (Argonne Na
 tional Laboratory (ANL)); Daniel Margala (Lawrence Berkeley National Labor
 atory (LBNL)); Pete Mendygral (Hewlett Packard Enterprise (HPE)); Laurie A
 . Stephey (Lawrence Berkeley National Laboratory (LBNL), National Energy R
 esearch Scientific Computing Center (NERSC)); and Rollin Thomas (Lawrence 
 Berkeley National Laboratory (LBNL), National Energy Research Scientific C
 omputing Center (NERSC))
END:VEVENT
END:VCALENDAR
