Phase 4 - DEEP-SEA
The latest undertaking in the family of DEEP projects is DEEP-SEA. DEEP-SEA builds upon proven software packages to create an open-source environment that optimally supports heterogeneous and modular supercomputers, guided by co-design with applications from seven high-impact scientific fields. The objectives of the project are:
- Co-design the software and programming environment of the upcoming European exascale systems.
- Provide tools to map complex applications and non-uniform workflows onto heterogeneous and modular computer architectures.
- Enhance the system software, programming paradigms, tools, and runtimes in order to extract the maximum performance from heterogeneous computer platforms and improve performance portability.
- Improve the use and management of new memory technologies and the placement of data in compute devices with deep and heterogeneous memory hierarchies.
- Release the DEEP-SEA software stack in production-ready quality to enable its operation and exploitation in upcoming European exascale systems.
Phase 3 - DEEP-EST
Being the third in the family of DEEP projects, DEEP-EST builds on technologies and concepts developed in its two predecessors while striving to achieve these objectives:
- Develop an energy-efficient system architecture that fits High Performance Computing (HPC) and High Performance Data Analytics (HPDA) workloads, and satisfies the requirements of end-users as well as e-infrastructure operators.
- Build a fully working Modular Supercomputing Architecture (MSA) system prototype made up of three modules: the Cluster module, the Extreme Scale Booster module and the Data Analytics module.
- Foster European technologies.
- Build a resource management and scheduling system fully supporting the MSA, which will be able to schedule heterogeneous workloads onto matching combinations of module resources.
- Enhance and optimise programming models based on MPI and OmpSs to best leverage the DEEP-EST prototype.
- Validate the full hardware (HW) / software (SW) stack with relevant HPC and extreme data workloads and demonstrate the benefits of the MSA.
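The scheduling objective above, placing heterogeneous workloads onto matching combinations of module resources, can be illustrated with a small sketch. This is not the DEEP-EST resource manager: the `Job` class, the `schedule` function, the module names used as dictionary keys and all node counts are hypothetical, and the policy shown is a plain first-come-first-served greedy placement chosen only for brevity.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job description: nodes requested per module."""
    name: str
    request: dict  # module name -> number of nodes requested


def schedule(jobs, free_nodes):
    """Greedy first-come-first-served placement across modules.

    A job is placed only if every module it asks for has enough
    free nodes; otherwise it waits. Returns the placed job names,
    the waiting job names, and the remaining free nodes.
    """
    free = dict(free_nodes)  # copy so the caller's dict is untouched
    placed, waiting = [], []
    for job in jobs:
        fits = all(free.get(module, 0) >= n for module, n in job.request.items())
        if fits:
            for module, n in job.request.items():
                free[module] -= n
            placed.append(job.name)
        else:
            waiting.append(job.name)
    return placed, waiting, free


if __name__ == "__main__":
    # Made-up node counts for the three module types.
    free_nodes = {"cluster": 4, "booster": 8, "analytics": 2}
    jobs = [
        Job("simulation", {"cluster": 2, "booster": 6}),
        Job("post-processing", {"cluster": 1, "analytics": 2}),
        Job("large-booster-run", {"booster": 4}),
    ]
    print(schedule(jobs, free_nodes))
```

The point of the sketch is that a single job may span several modules at once, so the fit check has to consider the whole combination rather than each module in isolation.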
Phase 2 - DEEP-ER
Building on the concepts and results of the predecessor project DEEP, the DEEP-ER project focuses on the following objectives:
- DEEP-ER extends the Cluster-Booster architecture of the DEEP project by a highly scalable I/O system and implements an efficient mechanism to recover application tasks that fail due to hardware errors.
- The project leverages new memory technology to provide increased performance and power efficiency. As a result, I/O-intensive HPC codes will run faster and scale further. HPC applications will be able to profit from sophisticated checkpointing and task-restart techniques that reduce the overhead seen today, even on large-scale systems.
- DEEP-ER builds a prototype based on the second-generation Intel® Xeon Phi processor, a uniform high-speed interconnect across Cluster and Booster, non-volatile memory on the compute nodes, and network-attached memory providing high-speed shared storage.
- A highly scalable and efficient I/O system based on Fraunhofer’s BeeGFS file system supports I/O-intensive applications, using the optimised I/O middleware SIONlib and Exascale10. A multi-level checkpoint scheme exploits the powerful I/O subsystem and the fast, network-attached storage to reduce the overhead of saving state for long-running tasks.
- The OmpSs-based DEEP programming model governs the creation of checkpoints and seamlessly restarts failed tasks.
- Seven important HPC applications are optimised to demonstrate the usability, performance and resiliency of the DEEP-ER prototype.
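The multi-level checkpoint idea mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions, not the DEEP-ER implementation and not the SIONlib or BeeGFS API: the `MultiLevelCheckpointer` class, its method names, the JSON file format and the `shared_every` interval are all hypothetical. Every checkpoint goes to a fast node-local directory, and only every n-th one is also copied to slower shared storage that survives a node failure.

```python
import json
import tempfile
from pathlib import Path


class MultiLevelCheckpointer:
    """Hypothetical two-level checkpointer: fast local level, durable shared level."""

    def __init__(self, local_dir, shared_dir, shared_every=4):
        self.local = Path(local_dir)
        self.shared = Path(shared_dir)
        self.shared_every = shared_every
        self.local.mkdir(parents=True, exist_ok=True)
        self.shared.mkdir(parents=True, exist_ok=True)

    def save(self, step, state):
        payload = json.dumps({"step": step, "state": state})
        name = f"ckpt_{step:08d}.json"  # zero-padded so names sort by step
        # Level 1: cheap and frequent, but lost if the node fails.
        (self.local / name).write_text(payload)
        # Level 2: less frequent, survives a node failure.
        if step % self.shared_every == 0:
            (self.shared / name).write_text(payload)

    def latest(self, include_local=True):
        """Return the newest checkpoint, or None if there is none.

        Pass include_local=False to model a restart after the
        node-local storage has been lost.
        """
        dirs = [self.shared] + ([self.local] if include_local else [])
        files = [f for d in dirs for f in d.glob("ckpt_*.json")]
        if not files:
            return None
        newest = max(files, key=lambda f: f.name)
        return json.loads(newest.read_text())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as local, tempfile.TemporaryDirectory() as shared:
        ckpt = MultiLevelCheckpointer(local, shared, shared_every=4)
        for step in range(1, 7):
            ckpt.save(step, {"value": step * 10})
        print(ckpt.latest())                     # restart on the same node
        print(ckpt.latest(include_local=False))  # restart after node loss
```

The trade-off this models is that frequent local checkpoints keep the amount of recomputation small for common failures, while the less frequent shared checkpoints bound the loss when a whole node disappears.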
Phase 1 - DEEP
The DEEP project, the first in the series, focused on the following objectives:
- The DEEP project follows a true co-design approach that reaches from hardware to middleware/systemware to tools to applications.
- Within the project, a prototype of a heterogeneous hardware platform is developed, consisting of a Cluster element based on multi-core chips, a Booster element based on many-core technology, and commensurate connectivity.
- The team develops a reliable, open-source operating system, interconnect and runtime software stack with high resilience while exploiting millions of cores.
- To achieve high productivity and unprecedented scalability, programming models, scientific libraries and performance tools for standard x86-based many-core processors are developed.
- Current cluster energy efficiency is improved by an order of magnitude by exploiting novel many-core chip technologies and advanced software-aided cooling technologies.
- Six scientific applications have been carefully chosen to validate the Cluster-Booster concept. They are representative of future exascale computing and data handling requirements.