Runtime systems, thread programming, accelerators, hardware abstraction, performance portability.
This course will present the state of the art in runtime system support for programming heterogeneous platforms. Heterogeneous computing platforms, such as multicore machines equipped with accelerators, are notoriously difficult to program, both because the available computing units differ strongly in their performance characteristics and because accelerator boards have discrete memory spaces.
The course will present the StarPU runtime system, developed at Inria by the STORM Team in Bordeaux. It will also present the hardware locality library hwloc for discovering hardware resources, the TreeMatch framework (Bordeaux / Tadaam) for placing distributed processes, and the Ezperf framework (SED Bordeaux) for solving performance issues.
Participants will understand how the task-based programming model, combined with performance modeling, automatic data management and dynamic scheduling, speeds up application development on heterogeneous computing platforms and provides long-term performance portability.
Basic knowledge of the C programming language and of accelerator programming languages (NVIDIA CUDA, OpenCL).