Scalasca Trace Tools

The Scalasca Trace Tools are a collection of trace-based performance analysis tools built on top of the Score-P instrumentation and measurement infrastructure. They are available as open source under the 3-clause BSD license and have been specifically designed for use on large-scale HPC systems. A distinctive feature of the Scalasca Trace Tools is their scalable automatic trace-analysis component, which can identify wait states that occur, for example, as a result of unevenly distributed workloads. Besides identifying and quantifying wait states in communication and synchronisation operations, the trace analyser is also able to pinpoint their root causes and their impact. In addition, the analyser can identify the activities on the critical path of the target application, highlighting the routines that are the best candidates for optimisation.
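
To make the notion of a wait state caused by an unevenly distributed workload concrete, the following minimal Python sketch quantifies the waiting time at a barrier. It is an illustration only and not the analyser's actual implementation: the function name `barrier_wait_times`, the timestamps, and the simplifying assumption that all processes leave the barrier as soon as the last one arrives are introduced here for the example.

```python
def barrier_wait_times(enter_times):
    """Return the waiting time of each process at a barrier.

    enter_times: timestamps (in seconds) at which each process entered
    the barrier. Simplifying assumption: every process leaves as soon
    as the last process has arrived.
    """
    last_arrival = max(enter_times)
    return [last_arrival - t for t in enter_times]


# Hypothetical data: rank 2 has more work and therefore arrives last.
enter_times = [1.0, 1.5, 4.0, 2.0]
print(barrier_wait_times(enter_times))  # [3.0, 2.5, 0.0, 2.0]
```

In this example, the process with the largest workload waits for nobody, while all others accumulate waiting time that is caused by the imbalance.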


To enable the analysis of the huge amounts of OTF2 event trace data that can be produced at large scales, the Scalasca Trace Tools are designed as parallel programs requiring the same amount of resources (i.e., the number of processes and threads) as the target application. The analysis tools employ a parallel replay technique that re-enacts the communication and synchronisation operations performed by the target application using operations of similar type. This effectively exploits the memory and processing capabilities of the HPC system and is thus the key to achieving scalability. Finally, the result is written to disk as an enriched call-path profile in CUBE4 format that includes additional higher-level metrics. It can be examined using the same tools (i.e., Cube, TAU ParaProf/PerfExplorer, and Extra-P) as the run-time profiles produced by Score-P.
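
The idea behind the replay can be sketched in a few lines of Python. The sketch below is a serial simulation under stated assumptions, not the parallel implementation used by the tools: the event layout `(kind, peer, enter_time)`, the function name `late_sender_waits`, and the example timestamps are hypothetical. For every recorded send, the sender's enter timestamp is "transmitted" to the matching receive, where the receiver compares it with its own enter timestamp to determine how long it waited for a late sender. Messages between the same pair of ranks are assumed to be matched in order.

```python
from collections import defaultdict, deque


def late_sender_waits(events):
    """Compute the total 'late sender' waiting time per receiving rank.

    events: dict mapping rank -> list of (kind, peer, enter_time) tuples
    in time order, where kind is "send" or "recv" (hypothetical layout).
    """
    # Replay the sends: each send forwards its enter timestamp to the
    # receiver over a per-pair FIFO channel (src, dst).
    channels = defaultdict(deque)
    for rank, rank_events in events.items():
        for kind, peer, enter_time in rank_events:
            if kind == "send":
                channels[(rank, peer)].append(enter_time)

    # Replay the receives: compare the local enter timestamp with the
    # timestamp delivered by the matching send.
    waits = defaultdict(float)
    for rank, rank_events in events.items():
        for kind, peer, enter_time in rank_events:
            if kind == "recv":
                send_enter = channels[(peer, rank)].popleft()
                waits[rank] += max(0.0, send_enter - enter_time)
    return dict(waits)


# Hypothetical trace: rank 1 posts its receive at t=2.0, but rank 0
# only enters the send at t=5.0; rank 0 later waits 0.5 s for rank 1.
events = {
    0: [("send", 1, 5.0), ("recv", 1, 5.5)],
    1: [("recv", 0, 2.0), ("send", 0, 6.0)],
}
print(late_sender_waits(events))  # {0: 0.5, 1: 3.0}
```

In a parallel replay, each analysis process would handle only the events of its own rank and exchange the timestamps with an actual message, which is why the analysis needs the same number of processes as the target application.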

Goals within DEEP-SEA

For the Scalasca Trace Tools, the objective for DEEP-SEA is to increase the MSA awareness of their analysis capabilities. In particular, we will extend the analysis to distinguish between intra- and inter-module wait states for MSA experiments, which will allow for a more in-depth understanding of the application behaviour and can provide guidelines for an improved resource distribution. To enable this, the Score-P measurement system needs to interact with the MSA-aware MPI implementation ParaStation MPI and/or the resource manager to query the module each application process is running on, and write this information into the generated trace data. Finally, since MPI inter-communicators are used more frequently in MPMD and MSA settings, we plan to implement full support for inter-communicators throughout the whole toolchain, including measurement support in Score-P, extensions to the OTF2 trace file format, and extended analyses in the Scalasca Trace Tools.
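
The planned distinction between intra- and inter-module wait states can be illustrated with a small Python sketch. It assumes that the module of each process is already known (for example, from the information recorded in the trace) and that every wait state names the waiting rank and the rank that caused the delay. The function name `split_wait_by_module`, the data layout, and the module labels are hypothetical and only serve the example.

```python
def split_wait_by_module(wait_states, module_of):
    """Sum waiting time separately for intra- and inter-module wait states.

    wait_states: list of (waiting_rank, causing_rank, seconds) tuples.
    module_of:   dict mapping rank -> module label (hypothetical layout).
    """
    totals = {"intra": 0.0, "inter": 0.0}
    for waiting_rank, causing_rank, seconds in wait_states:
        same_module = module_of[waiting_rank] == module_of[causing_rank]
        totals["intra" if same_module else "inter"] += seconds
    return totals


# Hypothetical mapping of four ranks onto two modules.
module_of = {0: "cluster", 1: "cluster", 2: "booster", 3: "booster"}
wait_states = [(1, 0, 3.0), (2, 1, 1.5), (3, 2, 0.5)]
print(split_wait_by_module(wait_states, module_of))
# {'intra': 3.5, 'inter': 1.5}
```

A large inter-module share would hint at an imbalance between the parts of the application placed on different modules, whereas a large intra-module share points to an imbalance within a single module.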