REGISTRATION: Now closed - course full.
Current and future supercomputing architectures face a dramatic growth of parallelism and heterogeneity on multiple levels. As a result, it is close to impossible for researchers who develop code to predict which parts of their code perform well, which development decisions affect scalability, which data structures are reasonable for a specific architecture, and so forth. Most decisions are based on experience, on intuition and on a limited understanding of the code's performance.
To get a better understanding of code performance and to guide performance engineering, computational scientists and engineers need to measure their code and study its performance in detail. Performance analysis tools, a generalisation of the classic profiler, are the right tools to obtain this insight. However, using them efficiently itself requires a decent level of understanding, experience and expertise, which adds to the difficulty and complexity of the underlying problem. This workshop introduces several performance analysis tools and provides hands-on training.
The workshop will be given by Sameer Shende (University of Oregon), Marc-André Hermanns (Juelich Supercomputing Centre), Joachim Protze (RWTH Aachen) and Florent Lebeau (Allinea Software Ltd).
The workshop is aimed at computational scientists and researchers who already use high performance computing (HPC) facilities for computationally demanding challenges, who are actively involved in HPC code development, and who can directly benefit in their research from a better understanding of why their code performs in a certain way.
Three days (Wed 6 Jul - Fri 8 Jul)
Each day will be split into three sessions: lectures and hands-on exercises, followed by an afternoon session that gives attendees an opportunity to apply the tools discussed earlier to their own research codes.
TAU, Scalasca, Score-P, Cube, MUST, ARCHER, Allinea MAP + PerfReport
The lecture and hands-on exercise sessions will require a screen and a projector that can be connected to a laptop.
Access to a whiteboard will also be required.
The attendees will typically bring their own laptops to use for the exercises and for applying the performance tools to their own research codes; alternatively, they could use the provided workstations.
Internet access will also be required so that instructors/attendees can access their ARCHER/DiRAC HPC accounts.
Presentations and hands-on sessions are on the following topics:
- TAU performance system
- Score-P instrumentation and measurement
- Scalasca automated trace analysis
- MUST runtime error detection for MPI
- ARCHER runtime error detection for OpenMP
A brief overview of the capabilities of these and associated tools is provided in the VI-HPS Tools Guide.
The workshop will be held in English and will run from 09:00 to no later than 18:00 each day, with breaks for lunch and refreshments. There is no fee for participation; however, participants are responsible for their own travel and accommodation.
Classroom capacity is limited; priority will therefore be given to applicants with MPI, OpenMP and hybrid OpenMP+MPI parallel codes already running on the workshop computer systems, and to those bringing codes from similar systems to work on.
The workshop introduces the open-source community-developed Score-P instrumentation and measurement infrastructure, together with the Scalasca and TAU tools, in order to provide a practical basis for portable performance analysis of parallel applications. The workshop will be delivered as a series of presentations with associated hands-on practical exercises, starting with basic application instrumentation and measurement to generate execution profiles, then improving measurement quality via customisation capabilities, and progressing to interactive and automated analyses of execution traces.
While analysis of provided example codes will be used to help familiarise the class with the various measurement/diagnostic tools, coaching will also be available to assist participants in analysing their own parallel application codes and thereby reveal opportunities for improving execution performance and scalability.
For the (preliminary) programme: www.vi-hps.org/training/tws/tw22.html