6-9 February 2018
Barcelona BSC Campus Nord
CET timezone

Registration will open in December. Please bring your own laptop. All PATC courses at BSC are free of charge.

Course Convener: Maria-Ribera Sancho

Objectives: The course brings together key information technologies used in manipulating, storing, and analysing data including:

  • the basic tools for statistical analysis
  • techniques for parallel processing
  • tools for access to unstructured data
  • storage solutions

Learning outcomes: Students will be introduced to systems that can accept, store, and analyse large volumes of unstructured data. The skills learned can be applied in data-intensive application areas.

Level: For trainees with some theoretical and practical knowledge


Day 1 06/02:

 9:30 – 13:00 Introduction (Vassil Alexandrov)

  • Data Science current trends: this session will focus on the results of the latest key studies in both Europe and the USA in the area of Data Science, and will outline the major trends, findings and recommendations.

Coffee break 11:00 – 11:30

  • Introduction to Data Science definitions and mathematical foundations.

Elementary or standard statistical approaches often fail when applied to Big Data problems, so new research methods need to be developed. This session will therefore focus on key research methods and approaches for Data Science, ranging from theory-creating and theory-testing approaches to conceptual-analytical and experimental ones, that can lead to the discovery of global properties of data. These will be mainly deterministic and hybrid (stochastic/deterministic) methods and algorithms.

14:00 – 16:00

  • This session will focus on several key methods and algorithms (both serial and parallel) that make it possible to discover global properties of data when dealing with Big Data:
    • Network Science
    • Multi Constrained and Multi-Objective Optimization
    • Examples of using the above approaches
  • Examples using the above approaches and a hands-on exercise

Coffee break 16:00 – 16:30 

  • Social Simulation Applications (Josep Casanovas)

Day 2 07/02:

 9:30 – 13:00 (Josep Lluis Berral)

  • Data Analytics with Apache Spark.

Apache Spark has become an established technology for fast, general-purpose, large-scale data processing, with “programmer-friendly” interfaces, official bindings for many of the most widely used languages (Java, Scala, Python and R), extensive documentation and development tools. This course introduces Apache Spark, as well as some of its core libraries for data manipulation, machine learning, data streams and graph analytics.

Coffee break 11:00 – 11:30

 14:00 – 16:00

  • Data Analytics with Apache Spark. Part 2

Coffee break 16:00 – 16:30


  • Big IoT Project (Dr. Ernest Teniente)

Day 3 08/02:

9:30 – 13:00 (Albert Abelló and Petar Jovanovic)

  • Big Data Management: Big Data has many definitions and facets. We will pay attention to the problems we face when storing it and to how we can process it. More specifically, we will focus on the Apache Hadoop ecosystem and two of its components, namely HBase and the MapReduce engine.

Coffee break 11:00 – 11:30

  • Hands-on exercise

 14:00-16:00 (Rizkallah Touma)

  • NoSQL databases: The relational model has dominated data storage systems since the mid-1970s. However, changing storage needs over the past decade have given rise to new models for storing data, collectively known as NoSQL. In this presentation, we will focus on two of the most common types of NoSQL databases, document-oriented databases and graph databases, and explain the use cases suited to each of them.

Coffee break 16:00 - 16:30

 16:30-18:00 (Dr. Maria Cristina Marinescu)

  • Multidisciplinary research and data analytics: Smart Cities


Day 4 09/02:

 9:30 – 11:30 (Dr. Darío García)

  • Introduction to Deep Learning

Coffee break 11:30 – 12:00


12:00 – 13:00 (Dr. Javier Espinosa)

  • Data visualizations are everywhere and are more important than ever. From creating a visual representation of data points as part of an executive presentation, to showcasing progress, or visualizing concepts for customer segments, data visualizations are a critical and valuable tool in many different situations. When it comes to big data, weak tools with basic features do not cut it, so specific techniques should be applied. This course will address different techniques for visualizing big data collections, first viewing the visualization process as a complex and resource-hungry task, and then as an out-of-the-box solution that can help to analyze and interpret big data collections.


 14:00 – 18:00

  • Hands-on Exercise

Coffee break 16:00 – 16:30

  • Hands-on Exercise



Starts 6 Feb 2018 09:30
Ends 9 Feb 2018 16:30
Barcelona BSC Campus Nord
Vertex Building Room VS208

For further details and practical info such as local transport and venue please visit the local course pages for PATC@BSC: http://www.bsc.es/patc