Towards understanding HPC users and systems: A NERSC case study

Citation data:

Journal of Parallel and Distributed Computing, ISSN: 0743-7315, Vol: 111, Page: 206-221

Publication Year:
2018
Metrics:
Usage: 12 (Abstract Views: 12)
Captures: 10 (Readers: 10)
Social Media: 6 (Tweets: 6)
Citations: 4 (Citation Indexes: 4)
DOI:
10.1016/j.jpdc.2017.09.002
Author(s):
Gonzalo P. Rodrigo; P.-O. Östberg; Erik Elmroth; Katie Antypas; Richard Gerber; Lavanya Ramakrishnan
Publisher(s):
Elsevier BV
Tags:
Computer Science; Mathematics
Article description:
The high performance computing (HPC) scheduling landscape currently faces new challenges due to changes in the workload. Previously, HPC centers were dominated by tightly coupled MPI jobs, but HPC workloads increasingly include high-throughput, data-intensive, and stream-processing applications. As a consequence, workloads are becoming more diverse at both the application and job levels, posing new challenges to classical HPC schedulers. There is a need to understand current HPC workloads and their evolution in order to inform future scheduling research and enable efficient scheduling in future HPC systems. In this paper, we present a methodology to characterize workloads and assess their heterogeneity, both at a particular point in time and in their evolution over time. We apply this methodology to the workloads of three systems (Hopper, Edison, and Carver) at the National Energy Research Scientific Computing Center (NERSC). We present the resulting characterization of jobs, queues, heterogeneity, and performance, including detailed information for one year of workload (2014) and its evolution over the systems' lifetimes (2010–2014).
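As an illustration of what "assessing workload heterogeneity" from a job trace can look like, the sketch below computes a simple normalized-entropy measure over binned job sizes. This is not the paper's methodology; the metric choice, the `cores`/`runtime_hours` column names, and the toy trace are all assumptions made for the example.

```python
# Hypothetical sketch (not the authors' method): one simple heterogeneity
# measure is the normalized Shannon entropy of the job mix, here over
# log-spaced core-hour bins. Assumes a job trace with per-job 'cores' and
# 'runtime_hours' columns; both names are illustrative.
import numpy as np
import pandas as pd


def heterogeneity(jobs: pd.DataFrame, bins: int = 10) -> float:
    """Normalized entropy of the core-hour distribution:
    0 = all jobs fall in one size bin, 1 = jobs spread evenly across bins."""
    core_hours = jobs["cores"] * jobs["runtime_hours"]
    # Log-spaced bins, since HPC job sizes span several orders of magnitude.
    edges = np.logspace(np.log10(max(core_hours.min(), 1e-3)),
                        np.log10(core_hours.max()), bins + 1)
    counts, _ = np.histogram(core_hours, bins=edges)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum() / np.log(bins))


# Toy trace mixing large MPI jobs with small high-throughput jobs.
trace = pd.DataFrame({
    "cores": [4096, 2048, 1, 1, 1, 8, 8, 512],
    "runtime_hours": [12.0, 6.0, 0.5, 0.4, 0.6, 2.0, 1.5, 8.0],
})
print(f"heterogeneity: {heterogeneity(trace):.2f}")
```

A value near 1 would indicate a workload spread across many job sizes (more heterogeneous), while a value near 0 would indicate a workload concentrated in a narrow size range.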