DEEP-SEA: Adapting all levels of the software stack

The 4th member of the DEEP project series focuses on the question of how future exascale systems can be used in an easy – or partly automated – manner, while at the same time being as energy efficient as possible.

Systems and applications are rapidly becoming more complex. To make the best use of the available resources, they must be assigned dynamically according to the needs of the application. Furthermore, programming models and tools must enable efficient sharing and exploitation of the heterogeneous computing capabilities. DEEP-SEA will adapt all levels of the software stack, including low-level drivers, compilers, computation and communication libraries, programming abstractions (such as MPI and OpenMP) and associated runtime systems, middleware and resource management systems, with an emphasis on supporting node-level memory heterogeneity and system-wide compute heterogeneity.

DEEP-EST: Resource management & job scheduling

In the DEEP-EST project, the focus of system software development lay on resource management and job scheduling. The developments in this area made it possible to determine optimal resource allocations for each combination of workloads, supported adaptive scheduling, and enabled dynamic reservation of resources. Combined with the modularity of the system itself, this maximised utilisation of the overall system, since no component is “blocked” for an application that does not use it.
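The allocation idea can be illustrated with a toy model. The module names, capacities, and the all-or-nothing reservation logic below are illustrative assumptions for this sketch, not the actual DEEP-EST scheduler (which extended production resource managers):

```python
# Toy illustration of reserving nodes on heterogeneous modules per job.
# Module names and free-node counts are invented for this sketch.
MODULES = {"cluster": 8, "booster": 16, "data-analytics": 4}  # free nodes

def allocate(job, free):
    """Reserve the requested nodes on each module, or nothing at all."""
    req = job["request"]  # e.g. {"cluster": 2, "booster": 8}
    if all(free.get(m, 0) >= n for m, n in req.items()):
        for m, n in req.items():
            free[m] -= n
        return True
    return False

def release(job, free):
    """Return a job's nodes to the free pool when it finishes."""
    for m, n in job["request"].items():
        free[m] += n

free = dict(MODULES)
job_a = {"name": "sim+ml", "request": {"cluster": 2, "booster": 8}}
job_b = {"name": "analytics", "request": {"data-analytics": 4}}
assert allocate(job_a, free) and allocate(job_b, free)
# Modules a job does not request stay available for other jobs:
print(free)  # {'cluster': 6, 'booster': 8, 'data-analytics': 0}
```

The point of the sketch is the last line: a job only "blocks" the modules it actually asked for, which is what makes modular systems efficient to share.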

Furthermore, the DEEP‑EST unified programming environment – based on standard components such as MPI and OmpSs, but also on emerging programming paradigms from the data-analytics field, such as map-reduce – provided a model that fully supports applications in using combinations of heterogeneous nodes, and enables developers to easily adapt and optimise their codes while keeping them fully portable.

With DEEP-EST, a prototype was established that leverages the benefits of a Modular Supercomputing Architecture (MSA).

DEEP-ER: Resiliency & highly scalable I/O

Within the DEEP-ER project, the system software stack was extended to feature a highly scalable, efficient, and user-friendly parallel I/O system based on the Fraunhofer parallel file system BeeGFS (formerly known as FhGFS). Additionally, it provided a low-overhead, unified user-level checkpointing system and exploited the multiple levels of non-volatile memory and storage added to the DEEP architecture.
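The multi-level idea behind such a checkpointing system can be sketched as follows. This is a minimal toy, not the project's actual checkpointing API: the directory paths stand in for node-local non-volatile memory and the parallel file system, and the restart policy is an illustrative assumption:

```python
# Sketch of two-level checkpointing: write fast to node-local storage
# first, then copy to the (slower, resilient) parallel file system.
import shutil
from pathlib import Path

LOCAL = Path("/tmp/ckpt-local")    # stands in for node-local NVM
GLOBAL = Path("/tmp/ckpt-global")  # stands in for the parallel FS

def checkpoint(step, state: bytes):
    LOCAL.mkdir(parents=True, exist_ok=True)
    GLOBAL.mkdir(parents=True, exist_ok=True)
    local_file = LOCAL / f"step-{step:06d}.ckpt"
    local_file.write_bytes(state)     # fast local write first
    shutil.copy2(local_file, GLOBAL)  # then flush to the global level

def restart():
    """Prefer the newest local checkpoint; fall back to the global copy."""
    for level in (LOCAL, GLOBAL):
        files = sorted(level.glob("*.ckpt")) if level.exists() else []
        if files:
            return files[-1].read_bytes()
    return None

shutil.rmtree(LOCAL, ignore_errors=True)   # start from a clean slate
shutil.rmtree(GLOBAL, ignore_errors=True)
checkpoint(1, b"state@1")
checkpoint(2, b"state@2")
shutil.rmtree(LOCAL)  # simulate losing the fast node-local level
assert restart() == b"state@2"
```

Restarting from the fast local level keeps the common case cheap; the global copy is only needed when a node (and its local storage) is lost.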

DEEP: Laying the groundwork for the DEEP programming environment

The DEEP project laid the groundwork for the DEEP programming environment. After all, programming heterogeneous systems is often considered challenging and cumbersome. The DEEP programming environment therefore provided a dedicated development and runtime system that makes porting applications to the DEEP project prototypes straightforward.


Driven by co-design, DEEP-SEA aims to shape the programming environment for the next generation of supercomputers, and specifically for the Modular Supercomputing Architecture (MSA). To achieve this goal, all levels of the programming environment are considered.


High-frequency monitoring and operational data analytics in the DEEP projects are performed by the Data Center Data Base (DCDB) and its data analytics framework Wintermute.

On a Modular Supercomputing Architecture (MSA) system, applications will either use resources within a single module only, or run across different modules – either at the same time or successively, in a workflow-like model. This requires scalable scheduling and co-allocation of resources for jobs within and across modules.
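The workflow-like usage mode can be sketched with a toy timeline: each stage reserves its module only while it runs, so a module is free for other jobs outside its stage. Stage names and durations below are invented for illustration and are not tied to any real scheduler:

```python
# Toy timeline for a workflow that moves through modules successively.
def schedule(stages):
    """Sequential workflow: stage i starts when stage i-1 ends, and each
    stage reserves its module only for its own [start, end) interval."""
    t, reservations = 0, []
    for module, duration in stages:
        reservations.append((module, t, t + duration))
        t += duration
    return reservations

workflow = [("cluster", 30), ("booster", 120), ("data-analytics", 15)]
for module, start, end in schedule(workflow):
    print(f"{module:>14}: reserved [{start:4d}, {end:4d})")
```

Running jobs across modules at the same time instead corresponds to co-allocating all the intervals with the same start time, which is where cross-module scheduling support is needed.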

The DEEP projects actively develop software in key areas such as programming environments, I/O, resiliency, benchmarking, tools, resource management, and job scheduling. Most of these developments are open source.

Benchmarking is an essential element in evaluating the success of a hardware prototyping project. In the DEEP projects we use the JUBE benchmarking environment to assess the performance of the DEEP system.
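One core task such a benchmarking environment automates is expanding a parameter space into concrete, reproducible runs. The toy below illustrates that cross-product expansion only; it is not JUBE's configuration format, and the parameter names are invented:

```python
# Toy version of the parameter-space expansion that a benchmarking
# environment such as JUBE automates. Names and values are illustrative.
from itertools import product

params = {
    "nodes": [1, 2, 4],
    "tasks_per_node": [12, 24],
}

def expand(params):
    """Expand a parameter dict into one dict per run configuration."""
    keys = list(params)
    return [dict(zip(keys, combo)) for combo in product(*params.values())]

runs = expand(params)
assert len(runs) == 6  # 3 node counts x 2 tasks-per-node values
print(runs[0])  # {'nodes': 1, 'tasks_per_node': 12}
```

On top of this expansion, a real benchmarking environment also substitutes the parameters into job scripts, submits the runs, and collects the results into comparable tables.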