6-8 July 2016
N.B. This course will take place at the University of Cambridge
Europe/London timezone

REGISTRATION: Now closed - course full.

Current and future supercomputing architectures face a dramatic growth of parallelism and heterogeneity on multiple levels. As a result, it is close to impossible for researchers who develop code to predict which parts of their code perform well, which development decisions affect scalability, which choices of data structure are reasonable for a specific architecture, and so forth. Most decisions are based on experience, intuition and a limited understanding of the code's performance.


To get a better understanding of code performance and to guide performance engineering, it is essential for computational scientists and engineers to conduct measurements in order to study the code performance in detail. Performance analysis tools, a generalisation of the classic profiler, are the right tools to obtain this insight. However, they themselves require a decent level of understanding, experience and expertise to be used economically, which adds to the difficulty and complexity of the underlying problem. This workshop introduces several performance analysis tools and provides hands-on training.


The workshop will be given by Sameer Shende (University of Oregon), Marc-André Hermanns (Juelich Supercomputing Centre), Joachim Protze (RWTH Aachen) and Florent Lebeau (Allinea Software Ltd).


Target Audience

Computational scientists and researchers who already use high performance computing (HPC) facilities for computationally demanding challenges, who are actively involved in HPC code development, and who can directly benefit in their research from a better understanding of why their code performs in a certain way.



Three days (Wed 6 Jul - Fri 8 Jul)



Each day will be split into three sessions: lectures and hands-on exercises, followed by an afternoon session that gives attendees an opportunity to apply the previously discussed tools to their own research codes.


Topics covered

TAU, Scalasca, Score-P, Cube, MUST, ARCHER, Allinea MAP + PerfReport


Taught using

The lecture and hands-on exercise sessions will require a screen and a projector that can be connected to a laptop.

Access to a whiteboard will also be required.

Attendees will typically bring their own laptops to use for the exercises and for applying the performance tools to their own research codes; alternatively, they can use the provided workstations.

Internet access will also be required so that instructors/attendees can access their ARCHER/DiRAC HPC accounts.


Programme Overview

Presentations and hands-on sessions are on the following topics:

  • TAU performance system
  • Score-P instrumentation and measurement
  • Scalasca automated trace analysis
  • MUST runtime error detection for MPI
  • ARCHER runtime error detection for OpenMP

A brief overview of the capabilities of these and associated tools is provided in the VI-HPS Tools Guide.

The workshop will be held in English and run from 09:00 to not later than 18:00 each day, with breaks for lunch and refreshments. There is no fee for participation; however, participants are responsible for their own travel and accommodation.

Classroom capacity is limited; priority will therefore be given to applicants with MPI, OpenMP and hybrid OpenMP+MPI parallel codes already running on the workshop computer systems, and to those bringing codes from similar systems to work on.


The workshop introduces the open-source community-developed Score-P instrumentation and measurement infrastructure, together with the Scalasca and TAU tools, in order to provide a practical basis for portable performance analysis of parallel applications. The workshop will be delivered as a series of presentations with associated hands-on practical exercises, starting with basic application instrumentation and measurement to generate execution profiles, then improving measurement quality via customization capabilities, and progressing to interactive and automated analyses of execution traces.

While analysis of provided example codes will be used to help familiarise the class with the various measurement/diagnostic tools, coaching will also be available to assist participants in analysing their own parallel application codes and thereby reveal opportunities for improving execution performance and scalability.

The (preliminary) programme is available at www.vi-hps.org/training/tws/tw22.html

Starts 6 Jul 2016 09:00
Ends 8 Jul 2016 17:30
N.B. This course will take place at the University of Cambridge
New Museums Site, CB2 3QH

This course is being run by EPCC, as part of ARCHER. The workshop is organised in cooperation with VI-HPS. The Virtual Institute - High Productivity Supercomputing (VI-HPS; http://www.vi-hps.org) is a consortium of world-leading developers of parallel program development tools.

The course will be organised by ICC and ECS at University of Cambridge in collaboration with DiRAC and PATC. It is taught by members of VI-HPS. Access to Hamilton and COSMA will be provided on a prioritised basis throughout the course. PATC participants will at the same time get access to ARCHER. The course focuses on Scalasca and associated tools, i.e. Score-P and Cube (large-scale, scaling performance analysis and guidance on optimisation; http://www.scalasca.org/), as well as MUST, with an emphasis on MPI+x (x typically being OpenMP or pthreads).

PATC courses are available free of charge to all attendees (academic and industrial).

Please register using the online form. If you have any questions, please consult the course forum page or contact epcc-support@epcc.ed.ac.uk.