Traleika Glacier: A hardware-software co-designed approach to exascale computing

Citation data:

Parallel Computing, ISSN: 0167-8191, Vol: 64, Page: 33-49

Publication Year:
2017
DOI:
10.1016/j.parco.2017.02.003
Author(s):
Vincent Cavé; Romain Clédat; Paul Griffin; Ankit More; Bala Seshasayee; Shekhar Borkar; Sanjay Chatterjee; Dave Dunning; Joshua Fryman
Publisher(s):
Elsevier BV
Tags:
Computer Science; Mathematics
Article description:
The move from current petascale machines to future exascale machines will require both hardware improvements and software changes. Hardware will need to evolve to focus primarily on features that lower energy consumption: near-threshold voltage operation, fine-grained power and clock management, and heterogeneity. Software will also need to evolve: it must express more parallelism and become more dynamic and adaptable in order to operate on much more variable hardware. In this paper, we present Traleika Glacier, an effort that seeks to evaluate radical design changes to meet the power and cost constraints of exascale computing. The salient features of the hardware design presented in this work include a) the use of heterogeneous cores, b) a redesign of the memory system centered on hierarchical scratchpads and a global address space, c) hardware acceleration of certain memory and network operations through specialized engines, and d) very fine-grained control and monitoring capabilities. On the software side, we describe a task-based runtime system, the Open Community Runtime (OCR), which aims to express a wide range of higher-level programming models with a very limited set of core concepts: event-driven tasks for computation, events for synchronization, and relocatable data-blocks for data management.
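The three core concepts named in the description (event-driven tasks, events, and relocatable data-blocks) can be illustrated with a toy model. The sketch below is not the OCR API. Every name in it (`DataBlock`, `Event`, `Task`, `Runtime`, `add_dependence`, `satisfy`, `demo`) is hypothetical and chosen only for illustration, and it assumes a single-threaded scheduler with once-only events, which is far simpler than a real exascale runtime.

```python
from collections import deque


class DataBlock:
    """Toy data-block: tasks receive it as a handle, never as a raw address,
    which is what would let a runtime relocate the underlying storage."""

    def __init__(self, payload):
        self.payload = payload


class Event:
    """Toy synchronization object: satisfied once, then feeds its waiters."""

    def __init__(self):
        self.satisfied = False
        self.value = None
        self.waiters = []  # (task, slot) pairs registered before satisfaction


class Task:
    """Toy event-driven task: runnable only once every input slot is filled."""

    def __init__(self, func, num_deps):
        self.func = func
        self.slots = [None] * num_deps
        self.pending = num_deps
        self.done = Event()  # satisfied with the task's result


class Runtime:
    """Hypothetical single-threaded scheduler (an assumption of this sketch)."""

    def __init__(self):
        self.ready = deque()

    def add_dependence(self, event, task, slot):
        # Connect an event to one input slot of a task.
        if event.satisfied:
            self._fill(task, slot, event.value)
        else:
            event.waiters.append((task, slot))

    def satisfy(self, event, value):
        # Mark the event satisfied and forward its value to all waiters.
        event.satisfied = True
        event.value = value
        for task, slot in event.waiters:
            self._fill(task, slot, value)
        event.waiters.clear()

    def _fill(self, task, slot, value):
        task.slots[slot] = value
        task.pending -= 1
        if task.pending == 0:
            self.ready.append(task)

    def run(self):
        # Execute ready tasks; each completion may make further tasks ready.
        while self.ready:
            task = self.ready.popleft()
            self.satisfy(task.done, task.func(*task.slots))


def demo():
    rt = Runtime()
    a, b = Event(), Event()
    add = Task(lambda x, y: DataBlock(x.payload + y.payload), 2)
    double = Task(lambda x: DataBlock(x.payload * 2), 1)
    rt.add_dependence(a, add, 0)
    rt.add_dependence(b, add, 1)
    rt.add_dependence(add.done, double, 0)  # chain tasks through an event
    rt.satisfy(a, DataBlock(3))
    rt.satisfy(b, DataBlock(4))
    rt.run()
    return double.done.value.payload
```

Calling `demo()` builds a two-task graph in which `double` waits on the output event of `add`. No task calls another directly: all ordering comes from events, and all data moves as data-block handles, which is the property that would allow a runtime to schedule and place work dynamically.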