Energy efficient algorithms
The energy use of HPC systems is an important consideration at the exascale. To meet the 20 MW power target, the entire HPC environment (from the datacentre to the hardware and the software) needs to become more energy efficient; this includes the scientific applications.
There is an important distinction between energy efficient and energy aware algorithms. An energy efficient algorithm targets minimal power consumption but is effectively static. An energy aware algorithm, on the other hand, has some knowledge of power consumption (either at runtime or from a database that is populated post-execution) and can adapt runtime parameters to reduce energy use. In addition, the tension between the energy dissipated by fast numerical calculations and the time to solution has to be considered: a numerical task might be solved with less total energy by a slow-running procedure than by a fast-running one that draws power at a high rate. A metric or cost model that takes both time to solution and energy to solution into account has to be developed. An exascale system is designed to run large problems in as short a time as possible; however, energy considerations may not always make this possible, and acceptable trade-offs must be found.
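One possible shape for such a metric is the energy-delay product, E * T^w, where the exponent w sets how strongly time to solution is weighted against energy to solution. The sketch below is only an illustration of that idea, not a proposed standard: the `Run` class, the `cost` function and all measurement values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One measured execution of a solver (all values are illustrative)."""
    name: str
    time_s: float        # time to solution in seconds
    avg_power_w: float   # average power draw in watts

    @property
    def energy_j(self) -> float:
        # Energy to solution = average power x time to solution.
        return self.avg_power_w * self.time_s


def cost(run: Run, weight: int = 1) -> float:
    """Energy-delay style metric E * T**weight.

    weight = 0 ranks by energy alone, weight = 1 is the energy-delay
    product, and larger weights favour a short time to solution.
    """
    return run.energy_j * run.time_s ** weight


# Hypothetical measurements: a fast, power-hungry run and a slow, frugal one.
fast = Run("fast", time_s=100.0, avg_power_w=400.0)   # 40 kJ
slow = Run("slow", time_s=160.0, avg_power_w=200.0)   # 32 kJ

best_by_energy = min((fast, slow), key=lambda r: cost(r, weight=0))
best_by_edp = min((fast, slow), key=lambda r: cost(r, weight=1))
```

With these made-up numbers the slow run wins on energy alone, while the fast run wins once time is weighted in, which is exactly the kind of trade-off a cost model has to make explicit.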
Many algorithms that are used in numerical modelling today have a long legacy and are known to work well on systems with limited parallelism. To reach the exascale, however, there needs to be a break with the status quo: many of these algorithms need to be redesigned from the ground up to expose further, massive parallelism and exploit the strengths of an exascale system. This presents an ideal opportunity to also incorporate energy considerations into algorithm development.
The limitations of an exascale system, as projected today, will primarily be in the data movement aspects of an application: reading from and writing to memory and disk, and moving data across the network. Floating-point operations, on the other hand, will be 'cheap' in terms of time and energy, and keeping the processing cores busy will be a major challenge. Exascale algorithms should minimise data movement and increase computation, thus increasing the 'Instructions per Cycle' (IPC) rates. These computational challenges align with the power challenges: accessing data from cache, memory or the network has a higher energy cost than integer or floating-point operations.
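The consequence for algorithm design can be illustrated with a toy energy model that charges each operation type a fixed cost. This is a sketch under strong assumptions: the per-operation energies in `ENERGY_PJ` are placeholder values chosen only to reflect the ordering described above (arithmetic cheaper than cache, cache cheaper than memory, memory cheaper than network), not measurements of any real system.

```python
# Hypothetical per-operation energy costs in picojoules (placeholders only).
ENERGY_PJ = {
    "flop": 10.0,       # one floating-point operation
    "cache": 50.0,      # one access served from cache
    "dram": 1000.0,     # one access served from main memory
    "network": 5000.0,  # one value fetched from a remote node
}


def kernel_energy_pj(counts: dict) -> float:
    """Total energy of a kernel given its operation counts per type."""
    return sum(ENERGY_PJ[kind] * n for kind, n in counts.items())


# Variant A: load a precomputed value from main memory and use it once.
load_variant = kernel_energy_pj({"dram": 1, "flop": 1})

# Variant B: recompute the value with 20 extra flops from cached inputs.
recompute_variant = kernel_energy_pj({"cache": 1, "flop": 21})
```

Under these assumed costs, recomputing is cheaper than reloading even though it executes twenty times more arithmetic, which is the sense in which trading data movement for computation can save energy.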
At the same time, energy consumption can also be reduced by lowering the frequency at which computing elements run. This may even be possible without reducing the computational performance to a large extent. For many cycles a core does no useful work because it is waiting for data from memory or remote nodes, or waiting at synchronisation barriers. Such 'active' waiting can be done at a lower frequency and lower power without limiting the computational performance.
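A simple analytical model shows why this can work for codes dominated by waiting. The sketch assumes that only the compute part of the run time scales with core frequency, that stall time is independent of it, and that dynamic power grows roughly with the cube of frequency (voltage scaled together with frequency). All function names and parameter values are hypothetical.

```python
F_MAX_GHZ = 2.5  # assumed nominal core frequency


def run_time_s(f_ghz: float, compute_s: float, stall_s: float) -> float:
    """Run time: compute part scales with 1/f, stall part does not."""
    return compute_s * F_MAX_GHZ / f_ghz + stall_s


def power_w(f_ghz: float, static_w: float = 40.0,
            dynamic_w_at_max: float = 80.0) -> float:
    """Static power plus dynamic power that scales with f**3 (assumption)."""
    return static_w + dynamic_w_at_max * (f_ghz / F_MAX_GHZ) ** 3


def energy_j(f_ghz: float, compute_s: float, stall_s: float) -> float:
    """Energy to solution = power x run time at the given frequency."""
    return power_w(f_ghz) * run_time_s(f_ghz, compute_s, stall_s)


# A memory-bound run: 10 s of computation and 90 s of waiting at 2.5 GHz.
t_nominal = run_time_s(2.5, compute_s=10.0, stall_s=90.0)
t_reduced = run_time_s(2.0, compute_s=10.0, stall_s=90.0)
e_nominal = energy_j(2.5, compute_s=10.0, stall_s=90.0)
e_reduced = energy_j(2.0, compute_s=10.0, stall_s=90.0)

slowdown = t_reduced / t_nominal         # relative increase in run time
energy_saving = 1.0 - e_reduced / e_nominal
```

In this model the reduced frequency costs only a few percent of run time while saving roughly a third of the energy. A compute-bound run (large `compute_s`, small `stall_s`) would show a much less favourable balance.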
An important prerequisite to developing energy efficient or energy aware algorithms is the ability to measure power. Several in-band and out-of-band solutions exist, with varying levels of accuracy and resolution: from node-level measurements on Cray systems (starting with the XC30 range), to Intelligent Platform Management Interface (IPMI) and Running Average Power Limit (RAPL) reports, to plug-in power measurement boards that can measure the power use of different system components (CPU, accelerators, memory, network, disks) separately.
HLRS, for instance, uses a measurement system with high-frequency sampling of power consumption on a small cluster. This cluster consists of two dual-socket nodes with an FDR InfiniBand interconnect and an independent measurement system that collects the power consumption of various components (CPUs, GPU, per-node power consumption). The measurement frequency ranges from 12 kHz up to 100 kHz. The computational performance is measured at the same time. Both measurements are aligned with the code progress so that they can be localised within the program. In this way it is possible to gain insight into the energy use behaviour of the program.
For a system whose power consumption is so large that it requires coordination with the local power distributor, it will be important not only to reduce the power consumption but also to request power at a constant, steady level. Significant changes in the power draw must be avoided.
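Aligning power samples with code progress, as described for the measurement system above, amounts to integrating the sampled power over the time window of each program region. The sketch below shows one way this could be done with trapezoidal integration. It is not the HLRS tooling: the function name, the region names and the sample values are hypothetical, and the sampling rate is far coarser than the kHz rates mentioned above so that the arithmetic stays readable.

```python
def region_energy_j(samples, start_s, end_s):
    """Trapezoidal integral of power over the samples inside [start_s, end_s].

    samples: list of (timestamp_s, power_w) pairs sorted by timestamp.
    Samples outside the window are ignored, so the result is only accurate
    when the sampling interval is short compared with the region.
    """
    inside = [(t, p) for t, p in samples if start_s <= t <= end_s]
    return sum(0.5 * (p0 + p1) * (t1 - t0)
               for (t0, p0), (t1, p1) in zip(inside, inside[1:]))


# Hypothetical power trace: (time in seconds, power in watts).
trace = [(0.0, 100.0), (0.5, 100.0), (1.0, 200.0), (1.5, 200.0), (2.0, 100.0)]

# Hypothetical code regions with start and end timestamps taken from
# instrumentation in the application.
regions = {"assemble": (0.0, 1.0), "solve": (1.0, 2.0)}

energy_per_region = {name: region_energy_j(trace, start, end)
                     for name, (start, end) in regions.items()}
```

The resulting dictionary attributes an energy in joules to each named region, which is the kind of per-phase view needed to see where a program spends its energy.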