Please note: This course takes place in Durham.
One of the greatest challenges to running parallel applications on
large numbers of processors is how to handle file IO. Standard IO
routines are not designed with parallelism in mind, and IO overheads
can grow to dominate the overall runtime. Parallel file systems are
optimised for large data transfers, but performance can be far from
optimal if every process opens its own file or if all IO is funnelled
through a single master process.
This hands-on course explores a range of issues related to parallel
IO. It uses ARCHER and its parallel Lustre file system as a platform
for the exercises; however, almost all the IO concepts and performance
considerations are applicable to any parallel system.
The IO part of the MPI standard gives programmers access to efficient
parallel IO in a portable fashion. However, there are a large number
of different routines available and some can be difficult to use in
practice. Despite its apparent complexity, MPI-IO adopts a very
straightforward high-level model. If used correctly, almost all the
complexities of aggregating data from multiple processes can be dealt
with automatically by the library.
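As a rough illustration of that high-level model (one shared file, with each process responsible for its own region), the sketch below simulates several processes in plain serial Python. It is not MPI-IO: the "ranks" are just loop iterations, and NPROCS, LOCAL_N and the function names are hypothetical choices made for this example. It uses os.pwrite, so it assumes a POSIX system.

```python
import os
import struct
import tempfile

NPROCS = 4                    # hypothetical number of processes
LOCAL_N = 3                   # doubles owned by each process (assumption)
ITEM = struct.calcsize("d")   # size of one double in bytes


def local_data(rank):
    """Values owned by one simulated rank: the global index as a float."""
    return [float(rank * LOCAL_N + i) for i in range(LOCAL_N)]


def write_shared(path):
    """Each simulated rank writes its block at its own byte offset."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY)
    try:
        # Reverse order, to show the file contents do not depend on
        # which rank happens to write first.
        for rank in reversed(range(NPROCS)):
            offset = rank * LOCAL_N * ITEM
            payload = struct.pack(f"{LOCAL_N}d", *local_data(rank))
            os.pwrite(fd, payload, offset)
    finally:
        os.close(fd)


def read_all(path):
    """Read the whole shared file back as a flat list of doubles."""
    with open(path, "rb") as f:
        raw = f.read()
    return list(struct.unpack(f"{NPROCS * LOCAL_N}d", raw))


def demo():
    """Write the shared file in a temporary directory and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shared.dat")
        write_shared(path)
        return read_all(path)


if __name__ == "__main__":
    print(demo())
```

In a real MPI program the offset arithmetic would normally be left to the library through a file view and a collective write, rather than coded by hand as it is here.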
The first day of the course will cover the MPI-IO standard, developing
IO routines for a regular domain decomposition example. It will also
briefly cover higher-level libraries such as HDF5 and NetCDF, whose
parallel versions are built on top of MPI-IO.
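To give a flavour of what a regular domain decomposition means for IO, the following serial sketch works out where each process's block of a 2D array lands in a single row-major file. The array size, process grid and function names are hypothetical, and the grid is assumed to divide the array exactly. In MPI-IO this pattern is typically described once with a derived datatype (for example one built with MPI_Type_create_subarray) rather than computed row by row as here.

```python
NX, NY = 4, 6                  # hypothetical global array size (rows, columns)
PX, PY = 2, 3                  # hypothetical process grid
LX, LY = NX // PX, NY // PY    # local block size; assumes exact division


def block_origin(rank):
    """Global (row, col) of the first element owned by rank.

    Ranks are laid out over the process grid in row-major order.
    """
    px, py = divmod(rank, PY)
    return px * LX, py * LY


def row_offsets(rank):
    """Element offsets in the row-major file where each local row starts."""
    i0, j0 = block_origin(rank)
    return [(i0 + i) * NY + j0 for i in range(LX)]


def assemble():
    """Place every rank's block into a flat in-memory 'file' image."""
    file_image = [None] * (NX * NY)
    for rank in range(PX * PY):
        for off in row_offsets(rank):
            # Each local row is contiguous in the file for LY elements.
            file_image[off:off + LY] = [rank] * LY
    return file_image


if __name__ == "__main__":
    image = assemble()
    for i in range(NX):
        print(image[i * NY:(i + 1) * NY])
```

Each process's data is contiguous only within a row, so a local block maps to LX separate pieces of the file. This is the kind of bookkeeping that a derived datatype lets the library handle.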
The second day will concentrate on performance, covering how to
configure the parallel file system and tune the MPI-IO library for
best performance. Case studies from real codes will be presented.
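One of the main configuration choices on a Lustre file system is striping: a file is split into fixed-size chunks that are distributed round-robin over a number of storage targets (OSTs). The sketch below is a simplified model of that mapping, not a Lustre API. The function names are hypothetical, and the index returned is a position within the file's own stripe layout rather than a physical OST number. On a real system the stripe count and size are set with the lfs setstripe command.

```python
def ost_for_offset(offset, stripe_size, stripe_count):
    """Index (0..stripe_count-1) of the stripe target holding this byte."""
    return (offset // stripe_size) % stripe_count


def bytes_per_ost(start, length, stripe_size, stripe_count):
    """Count how many bytes of [start, start+length) fall on each target."""
    totals = [0] * stripe_count
    pos = start
    end = start + length
    while pos < end:
        # Advance to the end of the current stripe chunk or of the request.
        chunk_end = min(end, (pos // stripe_size + 1) * stripe_size)
        totals[ost_for_offset(pos, stripe_size, stripe_count)] += chunk_end - pos
        pos = chunk_end
    return totals


if __name__ == "__main__":
    MIB = 1 << 20
    # An 8 MiB write with 1 MiB stripes: one target versus four.
    print(bytes_per_ost(0, 8 * MIB, MIB, 1))
    print(bytes_per_ost(0, 8 * MIB, MIB, 4))
```

With a stripe count of one, the whole transfer is served by a single target. With four, the same transfer is spread evenly, which is why striping matters for large shared files.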
Prerequisites: The course assumes a good understanding of basic MPI
programming in Fortran, C or C++. Knowledge of MPI derived datatypes
would be useful but not essential.
Wednesday 29 March
09:30 - 10:15 : Parallel IO
10:15 - 11:00 : Practical: Basic IO
11:00 - 11:30 : Break
11:30 - 12:15 : Derived Datatypes for MPI-IO
12:15 - 13:00 : Practical: Derived Datatypes
13:00 - 14:00 : Lunch
14:00 - 14:45 : Basic MPI-IO Routines
14:45 - 15:30 : Practical: Basic MPI-IO
15:30 - 16:00 : Break
16:00 - 16:45 : MPI-IO Features and alternative libraries
16:45 - 17:30 : Practical: Alternative Libraries
Thursday 30 March
09:30 - 10:15 : Lustre file system on ARCHER
10:15 - 11:00 : Practical: Lustre configuration
11:00 - 11:30 : Break
11:30 - 12:15 : Parallel IO libraries on ARCHER
12:15 - 13:00 : Practical: tuning parallel IO
13:00 - 14:00 : Lunch
14:00 - 14:45 : Case studies
14:45 - 15:30 : Individual consultancy session