[ONLINE] Fundamentals of Deep Learning for Multiple Data Types @ IT4Innovations





This course explores how convolutional and recurrent neural networks can be combined to generate effective descriptions of content within images and video clips. Learn how to train a network using TensorFlow and the Microsoft Common Objects in Context (COCO) dataset to generate captions from images and video by:

  • Implementing deep learning workflows like image segmentation and text generation

  • Comparing and contrasting data types, workflows, and frameworks

  • Combining computer vision and natural language processing

This course is only offered to academia (see details below in section Capacity and Fees).





Purpose of the course (benefits for the attendees)

Upon completion, you’ll be able to solve deep learning problems that require multiple types of data inputs.

About the tutor(s)

Georg Zitzlsberger is a research specialist for Machine and Deep Learning. He is certified by NVIDIA as a University Ambassador of the NVIDIA Deep Learning Institute (DLI) program. This certification allows him to offer NVIDIA DLI courses to academic users of IT4Innovations' HPC services.

NVIDIA Deep Learning Institute

The NVIDIA Deep Learning Institute delivers hands-on training for developers, data scientists, and engineers. The program is designed to help you get started with training, optimizing, and deploying neural networks to solve real-world problems across diverse industries such as self-driving cars, healthcare, online services, and robotics.


This course is sponsored by NVIDIA as part of the NVIDIA Deep Learning Institute (DLI) University Ambassador program.

This event was partially supported by the Ministry of Education, Youth and Sports from the Large Infrastructures for Research, Experimental Development and Innovations project "e-Infrastruktura CZ – LM2018140" and partially by the PRACE-6IP project, funded by the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 823767.

    • 09:45 - 10:00
      Presentation (15m)
    • 10:00 - 10:20
      Welcome and Intro
    • 10:20 - 12:00
      Introduction to CNNs and Object Segmentation
    • 12:00 - 13:00
      Lunch Break (1h)
    • 13:00 - 14:20
      Word Generation with RNNs
    • 14:20 - 14:30
      Coffee Break and Group Picture (10m)
    • 14:30 - 15:45
      Image Captioning by Combining RNNs and CNNs
    • 15:45 - 16:00
      Q & A