February 5, 2020
VŠB - Technical University Ostrava, IT4Innovations building
CET timezone
For academia only


The CUDA computing platform lets you accelerate CPU-only applications by running them on the world’s fastest massively parallel GPUs. In this workshop you will experience C/C++ application acceleration by:

  • Accelerating CPU-only applications to run their latent parallelism on GPUs
  • Utilizing essential CUDA memory management techniques to optimize accelerated applications
  • Exposing accelerated application potential for concurrency and exploiting it with CUDA streams
  • Leveraging command-line and visual profiling to guide and check your work

This training is a part of NVIDIA AI & HPC ACADEMY 2020.

The lectures are interleaved with many hands-on sessions using Jupyter Notebooks. The exercises will be done on a fully configured GPU-accelerated workstation in the cloud.

The workshop is co-organized by LRZ, IT4Innovations and NVIDIA Deep Learning Institute (DLI) for the Partnership for Advanced Computing in Europe (PRACE). Both IT4Innovations and LRZ, as part of GCS, are PRACE Training Centres, which serve as European hubs and key drivers of advanced high-quality training for researchers working in the computational sciences.

NVIDIA DLI offers hands-on training for developers, data scientists, and researchers looking to solve challenging problems with deep learning.

All instructors are NVIDIA-certified University Ambassadors.





Purpose of the course

Upon completion, you will be able to accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques. You will understand an iterative style of CUDA development that will allow you to ship accelerated applications fast.

About the tutor

Dr. Momme Allalen received his Ph.D. in theoretical physics from the University of Osnabrück in 2006. He worked in the field of molecular magnetics using modelling techniques such as the exact numerical diagonalisation of the Heisenberg model. He joined the Leibniz Supercomputing Centre (LRZ) in 2007, working in the High Performance Computing group. His tasks include user support, optimisation and parallelisation of scientific application codes, and benchmarking for characterising and evaluating the performance of high-end supercomputers. Momme is an NVIDIA DLI certified instructor for Fundamentals of Accelerated Computing with CUDA C/C++. His research interests are various aspects of parallel computing and new programming languages and paradigms on novel HPC architectures.

NVIDIA Deep Learning Institute

The NVIDIA Deep Learning Institute delivers hands-on training for developers, data scientists, and engineers. The program is designed to help you get started with training, optimizing, and deploying neural networks to solve real-world problems across diverse industries such as self-driving cars, healthcare, online services, and robotics.


This event was partially supported by The Ministry of Education, Youth and Sports from the Large Infrastructures for Research, Experimental Development and Innovations project "e-Infrastruktura CZ – LM2018140" and partially by the PRACE-6IP project - the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 823767. We would also like to thank Bayncore Labs for their contributions to this event.


VŠB - Technical University Ostrava, IT4Innovations building
Studentská 6231/1B 708 33 Ostrava–Poruba Czech Republic



After you are accepted, please create an account at courses.nvidia.com/join using the same email address as for event registration, since lab access is given based on the event registration list. Please be aware that, for administrative reasons, NVIDIA will use your email address after you register to contact you for the final feedback on the workshop.

You must bring your own laptop, configured for wireless access, to this workshop.

Ensure your laptop will run smoothly by going to http://websocketstest.com/. Make sure that WebSockets work for you: under Environment, WebSockets should be listed as supported, and under WebSockets (Port 80), Data Receive, Send and Echo Test should all show Yes. If there are issues with WebSockets, try updating your browser.

Capacity and Fees

Capacity 30 participants.

The event is provided free of charge, including coffee breaks and lunches (cold snack).

Note that this event is exclusively for verifiable students, staff, and researchers from any academic institution (industrial participants should contact NVIDIA for industry-specific training). Please bring your student or academic ID.

Accommodation and Transport recommendations

See the link above.

Application for this event is currently open.