Presentation

An End-to-End HPC Framework for Dynamic Power Objectives
Description
High-Performance Computing (HPC) centers demand a great deal of power and continue to grow through the exascale era. This work establishes the need for a multi-tiered, feedback-driven power management framework that follows dynamic power objectives while maximizing job performance, and highlights the need to respond to both external factors (e.g., power constraints) and internal factors (e.g., performance variation). We present a practical implementation of this framework on a real-world cluster and conduct simulations for larger data centers. We accurately track a moving power target for demand response while reacting to incomplete or inaccurate prior knowledge about job power and performance properties. We demonstrate that online performance feedback from a job runtime enables a cluster power management policy to recover most of the performance degradation introduced by job-type misclassification.
Event Type
Workshop
Time
Sunday, 12 November 2023, 3:20pm - 3:45pm MST
Location
603
Tags
Artificial Intelligence/Machine Learning
Energy Efficiency
Green Computing
Performance Measurement, Modeling, and Tools
Sustainability
Registration Categories
W