This page gives an overview of all academic publications related to the DEEP projects.

G. Katevenis, M. Ploumidis, M. Marazakis
August 10, 2023

Impact of Cache Coherence on the Performance of Shared-Memory based MPI Primitives: A Case Study for Broadcast on Intel Xeon Scalable Processors

M. Copik, R. Böhringer, A. Calotoiu, T. Hoefler
May 15, 2023

FMI: Fast and Cheap Message Passing for Serverless Functions

M. Pavlidakis, S. Mavridis, A. Chazapis, G. Vasiliadis, and A. Bilas
May 2, 2023

Arax: A Runtime Framework for Decoupling Applications from Heterogeneous Accelerators

D. Álvarez and V. Beltran
April 24, 2023

Optimizing Iterative Data-flow Scientific Applications using Directed Cyclic Graphs

M. Maroñas, A. Navarro, E. Ayguadé and V. Beltran
April 6, 2023

Mitigating the NUMA Effect on Task-Based Runtime Systems

J. Aguilar Mena, O. Shaaban, V. Beltran, P. Carpenter, E. Ayguadé, and J. Labarta
January 13, 2023

Transparent load balancing of MPI programs using OmpSs-2@cluster and DLB

F. Czappa, A. Geiß, F. Wolf
January 1, 2023

Simulating Structural Plasticity of the Brain more Scalable than Expected

J. Aguilar Mena
November 23, 2022

Methodology for malleable applications on distributed memory systems

C. Tassadit Ait Kaci, M. Sergent, E. Saillard, D. Barthou
November 15, 2022

Static Local Concurrency Errors Detection in MPI-RMA Programs

G. Katevenis-Bitzos, M. Ploumidis, and M. Marazakis
September 6, 2022

A framework for hierarchical single-copy MPI collectives on multicore nodes

M.I. Andersson, N. A. Murugan, A. Podobas and S. Markidis
August 29, 2022

Breaking Down the Parallel Performance of GROMACS, a High-Performance Molecular Dynamics Software

J. Aguilar Mena, O. Shaaban, V. Beltran, P. Carpenter, E. Ayguadé, and J. Labarta
August 22, 2022

OmpSs-2@Cluster: Distributed memory execution of nested OpenMP-style tasks

H. Nöttgen, F. Czappa, and F. Wolf
August 1, 2022

Accelerating Brain Simulations with the Fast Multipole Method

A. Navarro, A. F. Lorenzon, E. Ayguadé, V. Beltran

Combining Dynamic Concurrency Throttling with Voltage and Frequency Scaling on Task-based Programming Models

O. Rausch, T. Ben-Nun, N. Dryden, A. Ivanov, S. Li, T. Hoefler
June 28, 2022

A Data-Centric Optimization Framework for Machine Learning

A. Calotoiu, T. Ben-Nun, T. Schneider, J. de Fine Licht, T. Hoefler
June 22, 2022

Lifting C Semantics for Dataflow Optimization

P. Schaad, T. Ben-Nun, T. Hoefler
June 21, 2022

Boosting Performance Optimization with Interactive Data Movement Visualization

S. Di Girolamo, D. De Sensi, K. Taranov, M. Malesevic, M. Besta, T. Schneider, S. Kistler, T. Hoefler
June 20, 2022

Building Blocks for Network-Accelerated Distributed File Systems

A. Nikolaos Ziogas, G. Zygmunt Kwasniewski, T. Ben-Nun, T. Schneider, T. Hoefler
June 16, 2022

Deinsum: Practically I/O Optimal Multilinear Algebra

T. Ben-Nun, L. Groner, F. Deconinck, T. Wicky, E. Davis, J. Dahm, O. Elbert, R. George, J. McGibbon, L. Trümper, E. Wu, O. Fuhrer, T. Schulthess and T. Hoefler
May 9, 2022

Productive Performance Engineering for Weather and Climate Modeling with Python