BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240116T191659Z
LOCATION:704-706
DTSTART;TZID=America/Denver:20231113T102500
DTEND;TZID=America/Denver:20231113T104300
UID:submissions.supercomputing.org_SC23_sess450_ws_worksp105@linklings.com
SUMMARY:TaskVine: Managing In-Cluster Storage for High-Throughput Data Int
 ensive Workflows
DESCRIPTION:Workshop\n\nBarry Sly-Delgado\, Thanh Son Phung\, Colin Tho
 mas\, David Simonetti\, Andrew Hennessee\, Ben Tovar\, and Douglas Thai
 n (University of Notre Dame)\n\nMany scientific applications are expre
 ssed as high-throughput workflows that consist of large graphs of data as
 sets and tasks to be executed on large parallel and distributed syste
 ms. A challenge in executing these workflows is managing data: both da
 tasets and software must be efficiently distributed to cluster nodes\; in
 termediate data must be conveyed between tasks\; output data must be d
 elivered to its destination. Scaling problems result when these action
 s are performed in an uncoordinated manner on a shared filesystem. To ad
 dress this problem\, we introduce TaskVine: a system for exploiting th
 e aggregate local storage and network capacity of a large cluster. Tas
 kVine tracks the lifetime of data in a workflow --from archival source
 s to final outputs-- making use of local storage to distribute and re-
 use data. We describe the architecture and novel capabilities of TaskV
 ine\, and demonstrate its use with applications in genomics\, high ene
 rgy physics\, molecular dynamics\, and machine learning.\n\nTag: Data A
 nalysis\, Visualization\, and Storage\, Large Scale Systems\, Programmi
 ng Frameworks and System Software\, Reproducibility\, Resource Managem
 ent\, Runtime Systems\n\nRegistration Category: Workshop Reg Pass\n\nSe
 ssion Chairs: Silvina Caino-Lores (French Institute for Research in Co
 mputer Science and Automation (INRIA)) and Anirban Mandal (Renaissance Co
 mputing Institute (RENCI)\, University of North Carolina)
END:VEVENT
END:VCALENDAR
