This course will be delivered as an ONLINE EVENT for remote participation.
Learn how to accelerate your applications with OpenACC and CUDA, how to train and deploy a neural network to solve real-world problems, and how to effectively parallelize training of deep neural networks across multiple GPUs.
The online workshop combines lectures on accelerated computing with OpenACC and CUDA with lectures on the fundamentals of deep learning for single and multiple GPUs.
The lectures are interleaved with many hands-on sessions using Jupyter Notebooks. The exercises will be done on a fully configured GPU-accelerated workstation in the cloud.
The workshop is part of the PRACE Training Centres activity and is co-organized by LRZ – Leibniz Supercomputing Centre (Garching near Munich) as part of the Gauss Centre for Supercomputing (Germany), IT4I – National Supercomputing Center, VSB Technical University of Ostrava (Czech Republic), CSC – IT Center for Science Ltd (Finland) and the NVIDIA Deep Learning Institute (DLI) for the Partnership for Advanced Computing in Europe (PRACE).
The NVIDIA Deep Learning Institute delivers hands-on training for developers, data scientists, and engineers. The program is designed to help you get started with training, optimizing, and deploying neural networks to solve real-world problems across diverse industries such as self-driving cars, healthcare, online services, and robotics.
All instructors are NVIDIA-certified University Ambassadors.
Dr. Momme Allalen, Dr. Juan Durillo Barrionuevo, Dr. Volker Weinberg (LRZ and NVIDIA University Ambassadors), Georg Zitzlsberger (IT4Innovations and NVIDIA University Ambassador)
Price and Eligibility: This 4-day course is OPEN and FREE of charge for academic participants from the Member States (MS) of the European Union (EU) and Associated/Other Countries to the Horizon 2020 programme.
Agenda / Learning outcomes
1st day: Fundamentals of Accelerated Computing with OpenACC (10:00-16:00 EEST | 09:00-15:00 CEST)
On the first day you learn the basics of OpenACC, a high-level, directive-based programming model for GPUs. Discover how to accelerate the performance of your applications beyond the limits of CPU-only programming with simple pragmas. You’ll learn:
- How to profile and optimize your CPU-only applications to identify hot spots for acceleration
- How to use OpenACC directives to GPU accelerate your codebase
- How to optimize data movement between the CPU and GPU accelerator
Upon completion, you'll be ready to use OpenACC to GPU accelerate CPU-only applications.
2nd day: Fundamentals of Accelerated Computing with CUDA C/C++ (10:00-16:00 EEST | 09:00-15:00 CEST)
The CUDA computing platform enables the acceleration of CPU-only applications to run on the world’s fastest massively parallel GPUs. On the 2nd day you experience C/C++ application acceleration by:
- Accelerating CPU-only applications to run their latent parallelism on GPUs
- Utilizing essential CUDA memory management techniques to optimize accelerated applications
- Exposing accelerated application potential for concurrency and exploiting it with CUDA streams
- Leveraging command line and visual profiling to guide and check your work
Upon completion, you’ll be able to accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques. You’ll understand an iterative style of CUDA development that will allow you to ship accelerated applications fast.
3rd day: Fundamentals of Deep Learning (10:00-16:00 EEST | 09:00-15:00 CEST)
Explore the fundamentals of deep learning by training neural networks and using results to improve performance and capabilities.
On this day, you’ll learn the basics of deep learning by training and deploying neural networks. You’ll learn how to:
- Implement common deep learning workflows, such as image classification and object detection
- Experiment with data, training parameters, network structure, and other strategies to increase performance and capability
- Deploy your neural networks to start solving real-world problems
Upon completion, you’ll be able to start solving problems on your own with deep learning.
4th day: Fundamentals of Deep Learning for Multi-GPUs (10:00-16:00 EEST | 09:00-15:00 CEST)
The computational requirements of deep neural networks used to enable AI applications like self-driving cars are enormous. A single training cycle can take weeks on a single GPU or even years for larger datasets like those used in self-driving car research. Using multiple GPUs for deep learning can significantly shorten the time required to train on large amounts of data, making it feasible to solve complex problems with deep learning.
On the last day we will teach you how to use multiple GPUs to train neural networks. You'll learn:
- Approaches to multi-GPU training
- Algorithmic and engineering challenges of large-scale training
- Key techniques used to overcome these challenges
Upon completion, you'll be able to effectively parallelize training of deep neural networks using TensorFlow.
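As a rough illustration of one multi-GPU approach covered on the fourth day, the sketch below simulates data-parallel training in plain Python. Each simulated worker computes gradients on its own shard of the data, and the gradients are averaged before every update. This mirrors the gradient averaging that Horovod performs with an allreduce across GPUs, but it does not use the Horovod or TensorFlow APIs. The function names (gradient, allreduce_mean, split, train), the toy linear model, and the data are all hypothetical and chosen only for illustration.

```python
# Simulated data-parallel training step: plain Python, no GPUs or Horovod required.
# All names here are illustrative assumptions, not part of any course material.

def gradient(w, b, shard):
    """Mean-squared-error gradient of the linear model y = w*x + b on one shard."""
    n = len(shard)
    dw = sum(2.0 * (w * x + b - y) * x for x, y in shard) / n
    db = sum(2.0 * (w * x + b - y) for x, y in shard) / n
    return dw, db

def allreduce_mean(grads):
    """Average per-worker gradients, standing in for an allreduce across GPUs."""
    k = len(grads)
    return tuple(sum(g[i] for g in grads) / k for i in range(len(grads[0])))

def split(data, workers):
    """Give each simulated worker an equally sized, contiguous shard."""
    size = len(data) // workers
    return [data[i * size:(i + 1) * size] for i in range(workers)]

def train(data, workers, steps=2000, lr=0.01):
    """Gradient descent where every step averages the workers' local gradients."""
    w, b = 0.0, 0.0
    shards = split(data, workers)
    for _ in range(steps):
        local = [gradient(w, b, s) for s in shards]  # computed in parallel on real hardware
        dw, db = allreduce_mean(local)               # every worker ends up with the same update
        w -= lr * dw
        b -= lr * db
    return w, b

# Toy data generated from y = 3x + 1, split across 4 simulated workers.
data = [(float(x), 3.0 * x + 1.0) for x in range(8)]
w, b = train(data, workers=4)
print(round(w, 2), round(b, 2))
```

With equally sized shards, the averaged gradient equals the gradient over the full batch, so the simulated workers follow the same optimization path as a single worker processing all the data. On real hardware the benefit comes from computing the per-shard gradients at the same time.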
Prerequisites and content level
The content level of the course is broken down as: beginner - 20%, intermediate - 55%, advanced - 25%, community-targeted content - 0%.
Prerequisites: technical background, basic understanding of machine learning concepts, basic C/C++ or Fortran programming skills.
For the 3rd day: basic knowledge of Python will be helpful. Since Python 2.7 is used, the following tutorial can be used to learn the syntax: docs.python.org/2.7/tutorial/index.html
For the 4th day: familiarity with TensorFlow (1.x) and Keras will be a plus, as both are used in the hands-on sessions. If you have not used them before, you can find tutorials here: github.com/tensorflow/docs/tree/master/site/en/r1/tutorials/keras. Even though the course still uses TensorFlow 1.x, the version is not critical for the content. We will mainly use Horovod, which applies in the same way to TensorFlow 2.x.
Hands-on:
After you are accepted, please create an account at courses.nvidia.com/join.
Ensure your laptop / PC will run smoothly by going to http://websocketstest.com/
Make sure that WebSockets work for you: under Environment, WebSockets should be listed as supported, and under WebSockets (Port 80), Data Receive, Send and Echo Test should all show Yes. If there are issues with WebSockets, try updating your browser. If you have any questions, please contact Marjut Dieringer at mdieringer"at"nvidia.com.
REGISTRATION is OBLIGATORY since the details to access the online course will be provided to the registered and accepted attendees only. If you have registered for this course and you are not able to attend, please CANCEL your registration in advance by sending an email to firstname.lastname@example.org