Presentation
Adaptive Stopping Rule for Performance Measurements
Description
Performance variability in complex computer systems is a major challenge for accurate benchmarking and performance characterization, especially for tightly coupled, large-scale high-performance computing systems. Point summaries of performance may be both uninformative, if they do not capture the full richness of the system's behavior, and inaccurate, if they are derived from an inadequate sample of measurements. Determining the correct sample size requires balancing tradeoffs among computational cost, methodology, and statistical power.
We treat the performance distribution as the primary target of the performance evaluation, from which all other metrics can be derived. We propose and evaluate a meta-heuristic that dynamically characterizes the performance distribution, determining when enough samples have been collected to approximate the true distribution. Compared to fixed stopping criteria, this adaptive method can be more efficient in resource use and more accurate. Importantly, it requires no advance assumptions about the system under test or its performance characteristics.
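The abstract does not specify the stopping criterion itself, so the following is a minimal sketch of one plausible realization, assuming batch sampling with a two-sample Kolmogorov-Smirnov distance between successive empirical distributions as the convergence test. The names and parameters here (run_benchmark, batch, epsilon, patience) are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov distance between empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return np.max(np.abs(cdf_a - cdf_b))

def adaptive_measure(run_benchmark, batch=30, epsilon=0.02,
                     patience=3, max_samples=10_000):
    """Keep sampling until the empirical distribution stabilizes.

    Stops once `patience` consecutive batches each shift the empirical
    distribution by less than `epsilon` in KS distance, or when
    `max_samples` is reached. Hypothetical parameters, not the paper's.
    """
    samples = [run_benchmark() for _ in range(batch)]
    stable = 0
    while len(samples) < max_samples:
        prev = np.array(samples)
        samples.extend(run_benchmark() for _ in range(batch))
        if ks_distance(prev, np.array(samples)) < epsilon:
            stable += 1
            if stable >= patience:
                break  # distribution has converged under this criterion
        else:
            stable = 0  # distribution still drifting; reset the counter
    return np.array(samples)
```

Under these assumed defaults, a call such as adaptive_measure(lambda: time_one_run()) keeps collecting batches of 30 measurements until three consecutive batches each change the empirical distribution by less than 0.02 in KS distance, matching the abstract's goal of stopping only once enough samples exist to approximate the true distribution, without assuming its shape in advance.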