This short course will provide an introduction to GPU computing with CUDA, aimed at scientific application programmers. It will give background on the differences between CPU and GPU architectures as a prelude to introductory exercises in CUDA programming. The course will cover kernel execution, memory management, and shared memory operations. Common performance issues and their solutions will also be discussed. Finally, the course will cover some of the currently available alternatives to CUDA (OpenCL, OpenACC, and Kokkos).
Note: this course will not address machine learning or any machine learning frameworks.
Learning Outcomes
At the end of the course, attendees should be in a position to make an informed decision on how to approach GPU parallelisation in their applications in an efficient and portable manner.
Pre-requisites
Attendees must be familiar with programming in C or C++ (a number of the baseline CUDA exercises are also available in CUDA Fortran). Some knowledge of parallel or threaded programming models would be useful. Access to a GPU machine will be provided.
Requirements:
Participants must bring a laptop running macOS, Linux, or Windows (not a tablet, Chromebook, etc.) on which they have administrative privileges.
They are also required to abide by the ARCHER2 Code of Conduct.
Timetable:
Provisional
- 10:00 Introduction
- 10:20 GPU Concepts/Architectures
- 11:00 Break
- 11:20 CUDA Programming
- 12:00 A first CUDA exercise
- 13:00 Lunch
- 14:00 CUDA Optimisations
- 14:20 Optimisation Exercise
- 15:00 Break
- 15:20 Constant and Shared Memory
- 16:00 Exercise
- 17:00 Close
Location:
This course will take place face-to-face at The Open University, Milton Keynes.
This course will not be streamed online and a recording will not be made.