PRACE & E-CAM Tutorial on Machine Learning and Simulations @ICHEC

Timezone: Europe/Dublin

B003, Computer Science Building

University College Dublin

Belfield, Dublin 4, Ireland

Overview

The 4-day school will provide participants with a concise introduction to key machine and deep learning (ML & DL) concepts and their practical applications, with relevant examples from the domains of molecular dynamics (MD), rare-event sampling and electronic structure calculations (ESC). ML is increasingly being used to make sense of the enormous amount of data generated every day by MD and ESC simulations running on supercomputers. It can be used to obtain mechanistic understanding in terms of low-dimensional models that capture the crucial features of the processes under study, or to assist in the identification of relevant order parameters for rare-event sampling. ML is also being used to train neural-network-based potentials from ESC data, which can then be used in MD engines such as LAMMPS, allowing orders-of-magnitude increases in the system sizes and time scales that can be explored with ESC accuracy. While the first half of the school will cover the fundamentals of ML and DL, the second half will be dedicated to relevant examples of how these techniques are applied in the domains of MD and ESC.

Learning outcomes

By the end of the school, participants are expected to:

  • Gain an understanding of the fundamental concepts of ML and DL, including how neural networks function, different types of topologies, common pitfalls, etc.

  • Be able to implement basic deep learning workflows using Python.

  • Leverage an existing framework to discover molecular mechanisms from MD simulations.

  • Utilise the PANNA toolkit to create neural network models for atomistic systems and generate results that can be integrated with MD packages.

Prerequisites

Participants are expected to have a working knowledge of Python (i.e. familiar with the basic syntax and constructs, and having used Python for at least a few months) and a basic understanding of the fundamental physics behind molecular dynamics simulations and electronic structure calculations. All participants are expected to bring their own laptop to the school for the hands-on exercises.

Registration

There is no registration charge for accepted participants. However, all participants must register, and because space is limited, in the event of high demand participants will be selected according to the expressions of interest provided.

Non-academic participants are welcome to register for the school, but should notify the organisers in order to pre-empt issues with third-party copyright material that will be used in parts of the school.

Timetable

Day 1

    • 9:30 AM – 11:00 AM
      Introduction to Machine Learning and Neural Networks

      This session will cover the wider topic of machine learning, including quick overviews of fundamental techniques (e.g. regression and classification) and the distinction between supervised and unsupervised learning. It will then look at methods such as k-means clustering and principal component analysis (PCA). The session introduces basic concepts that carry over to neural networks, while also conveying to the target audience that machine learning encompasses more than neural networks. (A short illustrative code sketch follows this entry.)

      Convener: Bruno Voisin
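
      Not part of the official course material: a minimal sketch, assuming scikit-learn (the session may use different tools), of the kind of unsupervised workflow mentioned above, combining k-means clustering with PCA.

          # Minimal sketch: k-means clustering and PCA on a toy dataset
          # (scikit-learn is an assumption; the session may use other tools).
          from sklearn.cluster import KMeans
          from sklearn.datasets import make_blobs
          from sklearn.decomposition import PCA

          # Toy data: 300 points in 5 dimensions drawn from 3 clusters
          X, _ = make_blobs(n_samples=300, centers=3, n_features=5, random_state=0)

          # Unsupervised clustering into 3 groups
          labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

          # Reduce to 2 principal components, e.g. for visualisation
          X_2d = PCA(n_components=2).fit_transform(X)
          print(labels[:10], X_2d.shape)
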
    • 11:00 AM – 11:30 AM
      Break 30m
    • 11:30 AM – 1:00 PM
      Introduction to Deep Learning

      Learn deep learning techniques for a range of computer vision tasks, including training and deploying neural networks. In this workshop, you'll: (a) implement common deep learning workflows such as Image Classification and Object Detection; (b) experiment with data, training parameters, network structure, and other strategies to increase performance and capability; and (c) deploy your networks to start solving real-world problems. Upon completion, you'll be able to start solving your own problems with deep learning. (A short illustrative code sketch follows this entry.)

      Convener: Jony Castagna
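
      As a flavour of these sessions, a minimal sketch of an image-classification workflow using tf.keras and the MNIST dataset (an illustrative assumption; the workshop material may use other frameworks, datasets and tasks).

          # Minimal sketch: image classification on MNIST with tf.keras
          # (illustrative only; the workshop may use different tools and data).
          import tensorflow as tf

          (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
          x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixel values to [0, 1]

          model = tf.keras.Sequential([
              tf.keras.layers.Flatten(input_shape=(28, 28)),
              tf.keras.layers.Dense(128, activation="relu"),
              tf.keras.layers.Dense(10, activation="softmax"),   # one output per digit class
          ])
          model.compile(optimizer="adam",
                        loss="sparse_categorical_crossentropy",
                        metrics=["accuracy"])
          model.fit(x_train, y_train, epochs=2, validation_data=(x_test, y_test))
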
    • 1:00 PM – 2:00 PM
      Lunch 1h
    • 2:00 PM – 3:30 PM
      Introduction to Deep Learning

      Learn deep learning techniques for a range of computer vision tasks, including training and deploying neural networks. In this workshop, you'll: (a) implement common deep learning workflows such as Image Classification and Object Detection; (b) experiment with data, training parameters, network structure, and other strategies to increase performance and capability; and (c) deploy your networks to start solving real-world problems. Upon completion, you'll be able to start solving your own problems with deep learning.

      Convener: Jony Castagna
    • 3:30 PM – 4:00 PM
      Break 30m
    • 4:00 PM – 5:00 PM
      Introduction to Deep Learning

      Learn deep learning techniques for a range of computer vision tasks, including training and deploying neural networks. In this workshop, you'll: (a) implement common deep learning workflows such as Image Classification and Object Detection; (b) experiment with data, training parameters, network structure, and other strategies to increase performance and capability; and (c) deploy your networks to start solving real-world problems. Upon completion, you'll be able to start solving your own problems with deep learning.

      Convener: Jony Castagna

Day 2

    • 9:30 AM – 11:00 AM
      Introduction to TensorFlow

      This session will explore some of the fundamental concepts behind TensorFlow, an end-to-end open-source platform for machine learning with an ecosystem of tools, libraries and community resources that lets researchers build and deploy ML-powered applications. (A short illustrative code sketch follows this entry.)

      Convener: Bruno Voisin
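
      A minimal sketch of two of the fundamental TensorFlow concepts mentioned above, tensors/variables and automatic differentiation with tf.GradientTape (illustrative only; the session material may differ).

          # Minimal sketch: tensors, variables and automatic differentiation.
          import tensorflow as tf

          x = tf.constant([[1.0, 2.0], [3.0, 4.0]])   # immutable tensor
          w = tf.Variable([[0.5], [0.5]])             # trainable variable

          with tf.GradientTape() as tape:
              y = tf.matmul(x, w)                     # simple linear map
              loss = tf.reduce_mean(tf.square(y))     # scalar loss

          grad = tape.gradient(loss, w)               # d(loss)/d(w)
          print(loss.numpy(), grad.numpy())
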
    • 11:00 AM – 11:30 AM
      Break 30m
    • 11:30 AM – 1:00 PM
      Deep Learning at Scale

      These sessions cover how to use multiple GPUs to train neural networks. You'll learn: (a) approaches to multi-GPU training; (b) algorithmic and engineering challenges of large-scale training; and (c) key techniques used to overcome these challenges. Upon completion, you'll be able to effectively parallelise training of deep neural networks using TensorFlow and Horovod. (A short illustrative code sketch follows this entry.)

      Convener: Jony Castagna
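
      A minimal sketch of the standard data-parallel pattern with Horovod and tf.keras (initialise, pin one GPU per process, scale the learning rate, wrap the optimizer, broadcast initial weights); the hands-on material may differ in detail.

          # Minimal Horovod + tf.keras data-parallel sketch (illustrative only).
          # Typically launched with something like: horovodrun -np 4 python train.py
          import tensorflow as tf
          import horovod.tensorflow.keras as hvd

          hvd.init()                                   # one process per GPU

          # Pin each process to a single GPU
          gpus = tf.config.experimental.list_physical_devices("GPU")
          if gpus:
              tf.config.experimental.set_visible_devices(gpus[hvd.local_rank()], "GPU")

          model = tf.keras.Sequential([tf.keras.layers.Dense(10, activation="softmax")])

          # Scale the learning rate with the number of workers and wrap the optimizer
          opt = hvd.DistributedOptimizer(tf.keras.optimizers.Adam(0.001 * hvd.size()))
          model.compile(optimizer=opt, loss="sparse_categorical_crossentropy")

          # Keep workers consistent by broadcasting initial weights from rank 0
          callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]
          # model.fit(dataset, callbacks=callbacks, ...)
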
    • 1:00 PM – 2:00 PM
      Lunch 1h
    • 2:00 PM – 3:30 PM
      Deep Learning at Scale

      These sessions cover how to use multiple GPUs to train neural networks. You'll learn: (a) approaches to multi-GPU training; (b) algorithmic and engineering challenges of large-scale training; and (c) key techniques used to overcome these challenges. Upon completion, you'll be able to effectively parallelise training of deep neural networks using TensorFlow and Horovod.

      Convener: Jony Castagna
    • 3:30 PM – 4:00 PM
      Break 30m
    • 4:00 PM – 5:00 PM
      Tutorial on advanced sampling schemes to discover molecular mechanisms from MD simulations: Lecture on scientific applications

      Exascale computing holds great opportunities for molecular dynamics (MD) simulations. However, to take full advantage of the new possibilities, we must learn how to focus computational power on the discovery of complex molecular mechanisms, and how to extract them from enormous amounts of data. Both aspects still rely heavily on human experts, which becomes a serious bottleneck when a large number of parallel simulations have to be orchestrated to take full advantage of the available computing power. Here, we use artificial intelligence (AI) both to guide the sampling and to extract the relevant mechanistic information. We combine advanced sampling schemes with statistical inference, artificial neural networks, and deep learning to discover molecular mechanisms from MD simulations. Our framework adaptively and autonomously initialises simulations and learns the sampled mechanism, and is thus suitable for massively parallel computing architectures. We propose practical solutions to make the neural networks interpretable, as illustrated in applications to molecular systems. (A short illustrative code sketch follows this entry.)

      Conveners: Hendrik Jung, Roberto Covino
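
      A toy sketch of the core idea (not the presenters' actual framework or code): fit a small neural network that maps collective variables to a committor-like probability of reaching a target state, using binary outcomes of short trajectories as training labels. All names and data here are invented for illustration.

          # Toy sketch: learn p_B(x), the probability of committing to state B,
          # from synthetic "collective variables" and binary trajectory outcomes.
          # Entirely illustrative; not the software presented in this tutorial.
          import numpy as np
          import tensorflow as tf

          rng = np.random.default_rng(0)
          cvs = rng.normal(size=(1000, 3))                 # hypothetical collective variables
          outcomes = (cvs[:, 0] + 0.1 * rng.normal(size=1000) > 0.0).astype("float32")  # reached B?

          model = tf.keras.Sequential([
              tf.keras.layers.Dense(32, activation="tanh", input_shape=(3,)),
              tf.keras.layers.Dense(1, activation="sigmoid"),   # committor-like probability
          ])
          model.compile(optimizer="adam", loss="binary_crossentropy")
          model.fit(cvs, outcomes, epochs=5, verbose=0)
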
    • 5:00 PM – 5:15 PM
      Break 15m
    • 5:15 PM – 6:45 PM
      Poster session and reception

Day 3

    • 9:30 AM – 10:30 AM
      Deep Learning at Scale

      These sessions cover how to use multiple GPUs to train neural networks. You'll learn: (a) approaches to multi-GPU training; (b) algorithmic and engineering challenges of large-scale training; and (c) key techniques used to overcome these challenges. Upon completion, you'll be able to effectively parallelise training of deep neural networks using TensorFlow and Horovod.

      Convener: Jony Castagna
    • 10:30 AM – 11:00 AM
      Break 30m
    • 11:00 AM – 12:30 PM
      Machine learning for computational materials science: from reaction pathways to phase diagrams

      Neural networks and other machine learning approaches have been successfully used to accurately represent atomic interaction potentials derived from computationally demanding electronic structure calculations. Due to their low computational cost, such representations open up the possibility of large-scale reactive molecular dynamics simulations of processes with bonding situations that cannot be described accurately with traditional empirical force fields. (A brief sketch of the underlying idea follows this entry.)

      Convener: Christoph Dellago
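
      For orientation (not part of the lecture material): such neural-network potentials typically follow the Behler-Parrinello ansatz, writing the total energy as a sum of atomic contributions, each given by a network E_NN acting on a descriptor G_i of atom i's local environment, with forces obtained by differentiation:

          E_{\mathrm{tot}} = \sum_{i=1}^{N} E_{\mathrm{NN}}\left(\mathbf{G}_i\right),
          \qquad
          \mathbf{F}_k = -\nabla_{\mathbf{R}_k} E_{\mathrm{tot}}
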
    • 12:30 PM – 1:30 PM
      Lunch 1h
    • 1:30 PM – 3:00 PM
      Tutorial on advanced sampling schemes to discover molecular mechanisms from MD simulations: Hands-on session

      Exascale computing holds great opportunities for molecular dynamics (MD) simulations. However, to take full advantage of the new possibilities, we must learn how to focus computational power on the discovery of complex molecular mechanisms, and how to extract them from enormous amounts of data. Both aspects still rely heavily on human experts, which becomes a serious bottleneck when a large number of parallel simulations have to be orchestrated to take full advantage of the available computing power. Here, we use artificial intelligence (AI) both to guide the sampling and to extract the relevant mechanistic information. We combine advanced sampling schemes with statistical inference, artificial neural networks, and deep learning to discover molecular mechanisms from MD simulations. Our framework adaptively and autonomously initialises simulations and learns the sampled mechanism, and is thus suitable for massively parallel computing architectures. We propose practical solutions to make the neural networks interpretable, as illustrated in applications to molecular systems.

      Conveners: Hendrik Jung, Roberto Covino
    • 3:00 PM – 3:30 PM
      Break 30m
    • 3:30 PM – 5:00 PM
      Tutorial on advanced sampling schemes to discover molecular mechanisms from MD simulations: Hands-on session

      Exascale computing holds great opportunities for molecular dynamics (MD) simulations. However, to take full advantage of the new possibilities, we must learn how to focus computational power on the discovery of complex molecular mechanisms, and how to extract them from enormous amounts of data. Both aspects still rely heavily on human experts, which becomes a serious bottleneck when a large number of parallel simulations have to be orchestrated to take full advantage of the available computing power. Here, we use artificial intelligence (AI) both to guide the sampling and to extract the relevant mechanistic information. We combine advanced sampling schemes with statistical inference, artificial neural networks, and deep learning to discover molecular mechanisms from MD simulations. Our framework adaptively and autonomously initialises simulations and learns the sampled mechanism, and is thus suitable for massively parallel computing architectures. We propose practical solutions to make the neural networks interpretable, as illustrated in applications to molecular systems.

      Conveners: Hendrik Jung, Roberto Covino

Day 4

    • 9:30 AM – 11:00 AM
      Tutorial on the PANNA package for training and validating neural networks to represent atomic potentials: Lecture on theory & concepts

      This tutorial covers the concepts behind, and gives practical experience with, PANNA (Properties from Artificial Neural Network), a package for training and validating neural networks to represent atomic potentials. It implements configurable all-to-all connected deep neural network architectures which allow for the exploration of training dynamics. It currently includes tools to compute original and modified Behler-Parrinello input feature vectors, both for molecules and crystals, but the network can also be used in an input-agnostic fashion to enable further experimentation. PANNA is written in Python and relies on TensorFlow as the underlying engine. (A short illustrative code sketch follows this entry.)

      Convener: Stefano de Gironcoli
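
      To give a concrete flavour of the Behler-Parrinello input features mentioned above, here is a small NumPy sketch of a radial (G2) symmetry function; this is illustrative only and does not use PANNA's actual API.

          # Illustrative NumPy sketch of a Behler-Parrinello radial (G2)
          # symmetry function; not PANNA's actual interface.
          import numpy as np

          def cutoff(r, r_c):
              """Smooth cutoff: 0.5*(cos(pi*r/r_c) + 1) inside r_c, zero outside."""
              return np.where(r < r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

          def g2_radial(distances, eta, r_s, r_c):
              """G2 feature for one atom: a sum over its neighbour distances."""
              d = np.asarray(distances, dtype=float)
              return np.sum(np.exp(-eta * (d - r_s) ** 2) * cutoff(d, r_c))

          # Example: one atom with three neighbours at these distances (in angstrom)
          print(g2_radial([1.0, 1.5, 2.4], eta=4.0, r_s=1.2, r_c=3.0))
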
    • 11:00 AM – 11:30 AM
      Break 30m
    • 11:30 AM – 1:00 PM
      Tutorial on the PANNA package for training and validating neural networks to represent atomic potentials: Hands-on session

      This tutorial covers the concepts behind, and gives practical experience with, PANNA (Properties from Artificial Neural Network), a package for training and validating neural networks to represent atomic potentials. It implements configurable all-to-all connected deep neural network architectures which allow for the exploration of training dynamics. It currently includes tools to compute original and modified Behler-Parrinello input feature vectors, both for molecules and crystals, but the network can also be used in an input-agnostic fashion to enable further experimentation. PANNA is written in Python and relies on TensorFlow as the underlying engine.

      Convener: Stefano de Gironcoli
    • 1:00 PM – 2:00 PM
      Lunch 1h
    • 2:00 PM – 3:30 PM
      Tutorial on the PANNA package for training and validating neural networks to represent atomic potentials: Hands-on session

      This tutorial covers the concepts behind, and gives practical experience with, PANNA (Properties from Artificial Neural Network), a package for training and validating neural networks to represent atomic potentials. It implements configurable all-to-all connected deep neural network architectures which allow for the exploration of training dynamics. It currently includes tools to compute original and modified Behler-Parrinello input feature vectors, both for molecules and crystals, but the network can also be used in an input-agnostic fashion to enable further experimentation. PANNA is written in Python and relies on TensorFlow as the underlying engine.

      Convener: Stefano de Gironcoli
    • 3:30 PM – 4:00 PM
      Break 30m
    • 4:00 PM – 5:30 PM
      Problems proposed by participants