[ONLINE] Workshop on High-performance Data Analytics @ENCCS/HiDALGO

Europe/Stockholm
Online


Description

General description and learning outcomes

We would like to invite you to participate in our high-performance data analytics course, where we will introduce different tools and methods for handling big data. The tools will be demonstrated on two use cases but can be applied to any data.

HiDALGO (https://hidalgo-project.eu/) – HPC and Big Data Technologies for Global Systems – is a European project funded by the Horizon 2020 Framework Programme of the European Union. The project is carried out by 13 institutions from seven countries. 

This training event will start with an introductory talk that presents high-performance data analytics (HPDA) from the HiDALGO perspective. It will cover the main concepts, list the tools that have been used, and report on benchmarks carried out by the consortium, which indicate how well these tools scale. The introduction will also show how the tools are being applied in HiDALGO to solve different problems.

The next part of the training will focus on HPC and HPDA technologies applied to use cases such as Urban Air Pollution (UAP). The UAP application is a software framework for modeling air pollution emitted by vehicular traffic, and its dispersion, at very high resolution. It combines geometry inputs (OpenStreetMap), coupled weather data (ECMWF), traffic simulation (SUMO), computational fluid dynamics (CFD) tools running on HPC infrastructures (OpenFOAM), and evaluation with HPDA methods.

This HPC/HPDA/UAP part of the training will introduce the UAP concept, workflows, and implementations, the application of the CFD module in an HPC environment, and deployment to HPC, running, and evaluation. Participants will learn these techniques from a general perspective: HPC workflow modeling (TOSCA in YAML rendering), the basics of OpenFOAM for computing air pollutant dispersion on HPC, and the applied HPDA methods for fast evaluation.

The last part will provide an introduction to the data available at ECMWF and Copernicus and to the APIs for retrieving the data, followed by practical sessions on data exploration and manipulation. After this webinar, participants will be able to independently discover weather, climate, and environmental data produced and hosted by ECMWF, and to retrieve and process these data using Python libraries.

Who the workshop is for

The workshop is aimed at researchers, practitioners, and developers who are interested in implementing HPC workflows and HPDA, and at environmental scientists who would like to apply microscale models and easily exploit the strengths of HPC.

The part provided by ECMWF is aimed in particular at researchers who would like to use weather, climate, or environmental data in their work.

Prerequisites

  • Basic knowledge of big data technologies is recommended but not mandatory.
  • Participants need basic knowledge of the Linux CLI for the developer parts of the UAP training. For the application side, basic environmental knowledge related to air pollution is recommended but not mandatory.
  • For the part related to weather, participants are expected to have basic Python knowledge and be comfortable using Jupyter Notebooks. They will have the option to either use the mybinder.org platform or work locally. The GitHub repository with the notebooks and a list of libraries will be provided before the workshop.

Instructors

Zoltán Horváth, Ákos Kovács, László Környei, Mátyás Constans, Széchenyi István University, Győr
Milana Vuckovic, European Centre for Medium-Range Weather Forecasts

Agenda

For the updated agenda and information on the sessions, check the event's webpage at
https://enccs.se/events/2021/04/enccshidalgo-workshop-on-high-performance-data-analytics/
