Please bring your own laptop. All the PATC courses at BSC are free of charge.
Course conveners:
Department and Research group: Computer Sciences - Workflows and Distributed Computing
Yolanda Becerra, Data-driven Scientific Computing research line, Senior researcher
Anna Queralt, Distributed Object Management research line, Senior researcher
Course Lecturers:
Department and Research group: Computer Sciences - Workflows and Distributed Computing
Alex Barceló, Distributed Object Management research line, Researcher
Yolanda Becerra, Data-driven Scientific Computing research line, Senior researcher
Adrián Espejo, Data-driven Scientific Computing research line, Junior research engineer
Daniel Gasull, Distributed Object Management research line, Research engineer
Pol Santamaria, Data-driven Scientific Computing research line, Junior developer
Anna Queralt, Distributed Object Management research line, Senior researcher
Objectives:
The objective of this course is to give an overview of the BSC storage solutions Hecuba and dataClay. These two platforms make it easy to store and manipulate distributed data from object-oriented applications. Programmers handle object persistence using the same classes they use in their programs, which avoids time-consuming transformations between persistent and non-persistent data models. Hecuba and dataClay also let programmers manage distributed data transparently, without worrying about its location. This is achieved by adding a minimal set of annotations to the classes.
Both Hecuba and dataClay can work independently or integrated with the COMPSs programming model and runtime to facilitate parallelization of applications that handle persistent data, thus providing a comprehensive mechanism that enables the efficient usage of persistent storage solutions from distributed programming environments.
Both platforms offer a common interface to the application developer, which makes it possible to use one solution or the other depending on the needs, without changing the application code. Each platform also has additional features that allow the programmer to take advantage of its particularities.
Learning Outcomes:
The course covers the syntax and programming methodology of Hecuba and dataClay, together with an overview of their internals. It also provides a user-level overview of COMPSs, so that attendees can take advantage of the distribution of data with both platforms. Attendees will get a first lesson in programming with the common storage interface, which will enable them to start programming with both frameworks.
A hands-on session with simple introductory exercises will also be held for each platform, with and without COMPSs to distribute the computation. Students who finish this course will be able to develop simple Hecuba and dataClay applications and to run them both on a local resource and on a distributed platform (initially a private cloud).
Prerequisites:
Basic programming skills in Python and Java.
Previous attendance at the PATC course on programming distributed systems with COMPSs is recommended.
Agenda:
Day 1 (Jan 30)
Session 1 / 9:30 – 13:00
9:30-10:00 Round table. Presentation and background of participants
10:00-11:00 Motivation, introduction and syntax of BSC storage platforms
11:00-11:30 Coffee break
11:30-12:15 Hands-on with storage API
12:15-13:00 COMPSs overview and how to parallelize a sequential application
13:00-14:30 Lunch break
Session 2 / 14:30 – 18:00
14:30-16:00 Hecuba specifics and hands-on
16:00-16:30 Break
16:30-18:00 dataClay specifics and hands-on