BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Denver
X-LIC-LOCATION:America/Denver
BEGIN:DAYLIGHT
TZOFFSETFROM:-0700
TZOFFSETTO:-0600
TZNAME:MDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0600
TZOFFSETTO:-0700
TZNAME:MST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20240116T191701Z
LOCATION:605
DTSTART;TZID=America/Denver:20231112T162600
DTEND;TZID=America/Denver:20231112T163300
UID:submissions.supercomputing.org_SC23_sess434_ws_ftxs109@linklings.com
SUMMARY:Using Benford's Law to Identify Unusual Failure Regions
DESCRIPTION:Workshop\n\nKurt Ferreira (Sandia National Laboratories\, Unive
 rsity of New Mexico) and Scott Levy (Sandia National Laboratories)\n\nFaul
 t tolerance remains a key challenge for current high performance computing
  systems. Effective and efficient scheduling of mitigation methods continu
 es to be a critical issue in the face of dynamic and difficult-to-predict 
 error rates found on many systems. Using failure data from the Astra super
 computer\, we examine the efficacy of a simple method to determine if a sli
 ding window of recent failures contains an unusual pattern of errors. Spec
 ifically\, we investigate using Benford’s Law to predict the likelihood tha
 t the system is currently in a period of unusual failure occurrences. Whil
 e still in its initial stages\, this work provides critical analysis of fai
 lure status for extreme-scale systems and a simple form of prediction for 
 determining when the scheduling of failure mitigation may be suboptimal an
 d needs to be reevaluated due to the unusual pattern of errors that are oc
 curring.\n\nTag: Fault Handling and Tolerance\, Large Scale Systems\n\nRegi
 stration Category: Workshop Reg Pass\n\nSession Chairs: John Daly (US Depa
 rtment of Defense)\, Scott Levy (Sandia National Laboratories)\, and Keita T
 eranishi (Oak Ridge National Laboratory (ORNL))
END:VEVENT
END:VCALENDAR
