18-22 October 2021
CET timezone

Registration for this course will open soon.

Please bring your own laptop. All PATC courses at BSC are free of charge.

Course Convener: Xavier Martorell

Course Lecturers:

Judit Giménez - Performance Tools - Group Manager
German Llort - Performance Tools - Senior Researcher
Marc Jordà - Accelerators and Communications for High Performance Computing - Research Engineer
Antonio Peña - Accelerators and Communications for High Performance Computing - Senior Researcher
Javier Teruel - Best Practices for Performance and Programmability - Group Coordinator
Xavier Martorell - Programming Models - Parallel programming model - Group Manager

Level:

Intermediate: For trainees with some theoretical and practical knowledge and some programming experience.

Advanced: For trainees able to work independently and requiring guidance for solving complex problems.

Attendees can bring their own applications and work on them during the course for parallelization and analysis.


Prerequisites: Fortran, C or C++ programming. All examples in the course will be done in C.

Software requirements: Zoom (recommended), SSH client (to connect to HPC systems), X server (to enable remote visual tools).

Objectives: This course aims to convey the fundamental concepts behind the message-passing and shared-memory programming models. It covers two widely used programming models: MPI for distributed-memory environments and OpenMP for shared-memory architectures. The course also presents Paraver and Extrae, the main tools developed at BSC to collect information about the execution of parallel applications and analyze it. It also presents the Parallware Assistant tool, which can automatically parallelize a large number of program structures and gives the programmer hints on how to change the code to improve parallelization. Finally, it covers debugging alternatives, including the use of GDB and Totalview.

The course also considers the use of OpenMP in conjunction with MPI to better exploit the shared-memory capabilities of current compute nodes in clustered architectures. Paraver will be used throughout the course as the tool to understand the behavior and performance of parallelized codes. The course is taught through formal lectures and practical programming sessions that reinforce the key concepts and set up the compilation/execution environment.

Learning Outcomes: Students who complete this course will be able to develop benchmarks and applications with the MPI, OpenMP and mixed MPI/OpenMP programming models, as well as analyze their execution and tune their behavior on parallel architectures.

Agenda: 

Sessions will run on October 18th-22nd, 2021, from 9:30 to 13:00 and from 14:30 to 17:30 CET, with a 20' break in between sessions and a 1h30' lunch break. The course is delivered online via Zoom.

Day 1 (Monday October 18th)

Session 1 / 9:30 – 13:00 (20' rest in between)
1. Introduction to parallel architectures, algorithms design and performance parameters
2. Introduction to the MPI programming model
3. Practical: How to compile and run MPI applications

13:00 - 14:30 Lunch Break

Session 2 / 14:30 – 17:30 (20' rest in between)
1. MPI: Point-to-point communication, collective communication
2. Practical: Simple matrix computations
3. MPI: Blocking and non-blocking communications

Day 2 (Tuesday October 19th)

Session 1 / 9:30 – 13:00 (20' rest in between)
1. MPI: Collectives, Communicators, Topologies
2. Practical: Heat equation example

13:00 - 14:30 Lunch Break

Session 2 / 14:30 – 17:30 (20' rest in between)
1. Introduction to Paraver: tool to analyze and understand performance
2. Practical: Trace generation and trace analysis

Day 3 (Wednesday October 20th)

Session 1 / 9:30 – 13:00 (20' rest in between)
1. Parallel debugging in MareNostrumIII, options from print to Totalview
2. Practical: GDB and IDB
3. Practical: Totalview
4. Practical: Valgrind for memory leaks

13:00 - 14:30 Lunch Break

Session 2 / 14:30 – 17:30 (20' rest in between)
1. Shared-memory programming models, OpenMP fundamentals
2. Parallel regions and work sharing constructs
3. Synchronization mechanisms in OpenMP
4. Practical: heat diffusion in OpenMP

Day 4 (Thursday October 21st)

Session 1 / 9:30 – 13:00 (20' rest in between)
1. Tasking in OpenMP 3.0/4.0/4.5
2. Programming using a hybrid MPI/OpenMP approach
3. Practical: multisort in OpenMP and hybrid MPI/OpenMP

13:00 - 14:30 Lunch Break

Session 2 / 14:30 – 17:30 (20' rest in between)
1. Parallware: guided parallelization
2. Practical session with Parallware examples

Day 5 (Friday October 22nd)

Session 1 / 9:30 – 13:00 (20' rest in between)
1. Introduction to the OmpSs programming model
2. Practical: heat equation example and divide-and-conquer

13:00 - 14:30 Lunch Break

Session 2 / 14:30 – 17:30 (20' rest in between)
1. Programming using a hybrid MPI/OmpSs approach
2. Practical: heat equation example and divide-and-conquer

END OF TRAINING COURSE

For further details and practical info such as local transport and venue please visit the local course pages for PATC@BSC: http://www.bsc.es/patc

Registration
Registration for this event is currently open.