NEXTGenIO: Using Non-Volatile Memory to Improve I/O

Memory and I/O characteristics strongly affect the scalability and performance of many HPC applications. In particular, as the wind energy applications in HPCWE move towards Exascale, I/O is likely to become one of the main challenges because of the unprecedented degree of parallelism involved. EPCC has recently contributed to the development and evaluation [1] of NEXTGenIO, a prototype system that aims to improve memory and I/O capabilities through the use of byte-addressable non-volatile memory (NVRAM). This persistent memory sits between main memory (RAM) and storage in the system’s memory hierarchy, offering a much larger capacity than RAM and substantially faster access than disk storage. The system ranked 4th in the IO500 10-node Challenge at the Supercomputing 2019 conference, demonstrating its suitability for I/O-intensive tasks. It is available to HPCWE for exploring novel approaches to scalable I/O.

NEXTGenIO Overview

The system comprises 34 nodes. Each node has 48 cores with a base frequency of 2.4 GHz, split across two Intel Xeon 8260M processors (Cascade Lake), together with 192 GB of DDR4 RAM and 3 TB of NVRAM (Intel Optane Data Center Persistent Memory Modules). This translates to around 66 GB of memory per core, in stark contrast with more conventional systems, which usually have less than 6 GB of RAM per core. The nodes are connected by an Omni-Path interconnect with up to 200 Gbps.

There are two main memory configurations: Memory mode and App Direct mode. In Memory mode, DRAM acts as a last-level cache and NVRAM is used transparently as main memory, so no application changes are required. By contrast, App Direct mode allows more fine-grained control over how data is allocated between DRAM and NVRAM, but in most cases requires some changes to the target application.
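As a minimal sketch of the kind of change App Direct mode can involve, the Python snippet below memory-maps a file and updates it with ordinary byte-level stores instead of read and write calls. It assumes the NVRAM is exposed as a DAX-mounted filesystem; the mount point /mnt/pmem and the function names are hypothetical, and the snippet falls back to a temporary directory so that it also runs on a machine without persistent memory. Production codes would more likely use a dedicated library such as PMDK from C or C++.

```python
import mmap
import os
import tempfile

# Hypothetical mount point of a DAX-enabled filesystem backed by NVRAM.
PMEM_DIR = "/mnt/pmem"


def open_region(path, size):
    """Create (or reuse) a file of `size` bytes and map it into memory."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        os.ftruncate(fd, size)
        return mmap.mmap(fd, size)  # shared, read-write mapping by default
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Use the NVRAM mount if it exists and is writable, else the temp dir.
        use_pmem = os.path.isdir(PMEM_DIR) and os.access(PMEM_DIR, os.W_OK)
        path = os.path.join(PMEM_DIR if use_pmem else tmp, "region.bin")

        region = open_region(path, 4096)
        region[0:5] = b"hello"  # byte-granular store into the mapping
        region.flush()          # ask the OS to make the update durable
        region.close()

        # Map the file again to show that the data outlived the first mapping.
        region = open_region(path, 4096)
        data = bytes(region[0:5])
        region.close()
        os.remove(path)
        return data
```

Calling demo() returns the five bytes that were stored through the first mapping. The same code runs on any filesystem; only on a DAX mount do the stores reach persistent memory without passing through the page cache.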

Figure 1: Memory Mode

Use Cases

In the Memory Use Case, applications with high memory-per-node requirements benefit from the large memory capacity of each NEXTGenIO node. This could enable the use of higher-resolution meshes and reduce the total number of nodes required to run a specific model, which in turn reduces the amount of inter-node communication.
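The effect on node count can be illustrated with a rough calculation. The sketch below uses the per-node figures quoted in the overview (192 GB of DRAM, 3 TB of NVRAM taken as 3000 GB) and a hypothetical 12 TB working set chosen only for illustration; it ignores memory used by the operating system and any performance difference between DRAM and NVRAM.

```python
import math

DRAM_PER_NODE_GB = 192    # DRAM per node, as stated in the overview
NVRAM_PER_NODE_GB = 3000  # 3 TB of NVRAM per node, main memory in Memory mode


def nodes_needed(footprint_gb, memory_per_node_gb):
    """Smallest number of nodes whose combined memory holds the footprint."""
    return math.ceil(footprint_gb / memory_per_node_gb)


footprint_gb = 12_000  # hypothetical working set, not a measured value

dram_only_nodes = nodes_needed(footprint_gb, DRAM_PER_NODE_GB)
memory_mode_nodes = nodes_needed(footprint_gb, NVRAM_PER_NODE_GB)
print(dram_only_nodes, memory_mode_nodes)  # prints "63 4"
```

Under these assumptions, a working set that would need 63 nodes with DRAM alone fits on 4 nodes in Memory mode.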

In the Storage Use Case, applications that employ checkpointing or couple simulations via files can benefit from both the byte-addressable view of NVRAM and its persistence, which together facilitate fine-grained I/O. Additionally, NVRAM does not require refresh power, which may reduce the power needed to hold read-only data.
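To make fine-grained checkpointing concrete, the following Python sketch keeps a checkpoint in a memory-mapped file and updates a single value in place rather than rewriting the whole file. The file layout (an 8-byte step counter followed by doubles) and all function names are hypothetical; on a system like NEXTGenIO the file would be placed on an NVRAM-backed filesystem, whereas here a temporary directory is used so that the example runs anywhere.

```python
import mmap
import os
import struct
import tempfile

HEADER = struct.Struct("<Q")  # step counter, 8 bytes
VALUE = struct.Struct("<d")   # one double-precision value, 8 bytes


def create_checkpoint(path, n_values):
    """Map a checkpoint file with room for a header and n_values doubles."""
    size = HEADER.size + n_values * VALUE.size
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        os.ftruncate(fd, size)
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)


def update_value(ckpt, index, value):
    """Overwrite one value in place; only 8 bytes of the file change."""
    VALUE.pack_into(ckpt, HEADER.size + index * VALUE.size, value)


def commit(ckpt, step):
    """Flush the data, then record the step so the header never runs ahead."""
    ckpt.flush()
    HEADER.pack_into(ckpt, 0, step)
    ckpt.flush()


def read_checkpoint(path, n_values):
    """Read the checkpoint back through ordinary file I/O."""
    with open(path, "rb") as f:
        raw = f.read()
    (step,) = HEADER.unpack_from(raw, 0)
    values = [VALUE.unpack_from(raw, HEADER.size + i * VALUE.size)[0]
              for i in range(n_values)]
    return step, values


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "state.ckpt")
        ckpt = create_checkpoint(path, 4)
        for i in range(4):
            update_value(ckpt, i, float(i))
        commit(ckpt, step=1)
        update_value(ckpt, 2, 42.0)  # a later step touches a single value
        commit(ckpt, step=2)
        ckpt.close()
        return read_checkpoint(path, 4)
```

Calling demo() returns the step counter 2 and the values 0.0, 1.0, 42.0 and 3.0. The second commit changes only one value and the header, which is the access pattern that byte-addressable persistent memory is meant to make cheap.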

Please refer to the SC’19 paper [1], which describes the technical details and the use cases in more depth. Visit nextgenio.eu for further references.

Reference

[1] Michele Weiland et al., “An early evaluation of Intel’s Optane DC persistent memory module and its impact on high-performance scientific applications”, SC’19, https://dl.acm.org/doi/10.1145/3295500.3356159