Calculon: a Methodology and Tool for High-Level Codesign of Systems and Large Language Models
Description
This paper presents a parameterized analytical performance model of transformer-based Large Language Models (LLMs) for guiding high-level algorithm-architecture codesign studies. This model derives from an extensive survey of performance optimizations that have been proposed for the training and inference of LLMs; the model's parameters capture application characteristics, the hardware system, and the space of implementation strategies. With such a model, we can systematically explore a joint space of hardware and software configurations to identify optimal system designs under given constraints, like the total amount of system memory. We implemented this model and methodology in a Python-based open-source tool called Calculon. Using it, we identified novel system designs that look significantly different from current inference and training systems, showing quantitatively the estimated potential to achieve higher efficiency, lower cost, and better scalability.
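To make the idea of a constrained search over a joint hardware and software space concrete, here is a minimal sketch in Python. It is not Calculon's model or API: the function names `estimate` and `search`, the 16 bytes per parameter memory figure, the 6 FLOPs per parameter per token rule of thumb, and the tensor-parallel penalty are all illustrative assumptions, and the toy ignores pipeline bubbles, activations, and network bandwidth.

```python
from itertools import product


def estimate(params, tp, pp, dp, batch_tokens, flops_per_dev, tp_penalty=0.05):
    """Toy analytical model: (bytes per device, seconds per training step)."""
    # Assumed 16 bytes/parameter (weights, gradients, optimizer state),
    # sharded across tensor (tp) and pipeline (pp) parallelism only.
    mem = 16.0 * params / (tp * pp)
    # Assumed ~6 FLOPs per parameter per token for forward plus backward.
    flops = 6.0 * params * batch_tokens
    # Placeholder communication cost (hypothetical): each extra
    # tensor-parallel rank lowers per-device efficiency a little.
    efficiency = 1.0 / (1.0 + tp_penalty * (tp - 1))
    time = flops / (flops_per_dev * efficiency * tp * pp * dp)
    return mem, time


def search(params, n_devices, mem_cap, batch_tokens, flops_per_dev):
    """Return (time, (tp, pp, dp)) for the fastest split that fits, or None."""
    best = None
    # Only parallelism degrees that divide the device count are considered.
    degrees = [d for d in range(1, n_devices + 1) if n_devices % d == 0]
    for tp, pp in product(degrees, repeat=2):
        if n_devices % (tp * pp) != 0:
            continue
        dp = n_devices // (tp * pp)
        mem, time = estimate(params, tp, pp, dp, batch_tokens, flops_per_dev)
        # Keep the configuration only if it satisfies the memory constraint.
        if mem <= mem_cap and (best is None or time < best[0]):
            best = (time, (tp, pp, dp))
    return best
```

Calling `search(1e9, 8, 8e9, 1e6, 1e14)` would enumerate every split of 8 devices for a hypothetical 1-billion-parameter model with an 8 GB per-device memory cap and return the fastest one that fits. A real codesign study would replace `estimate` with a far more detailed model and sweep hardware parameters as well.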
Time
Thursday, 16 November 2023, 11am - 11:30am MST
Artificial Intelligence/Machine Learning
Performance Optimization
Programming Frameworks and System Software