February 25, 2026
Vertex building
CET timezone

 BSC Training Courses are free of charge. Please bring your own laptop.

“In 1995, Wulf and McKee published a four-page note entitled Hitting the Memory Wall: Implications of the Obvious. The article projected the performance impact of the increasing speed gap between processors and memory, and predicted that, if the trends held, relative memory latencies would soon be so large that the processor would essentially always be waiting for memory — which amounts to hitting the wall.”  

“Technological evolutions and revolutions notwithstanding, the memory wall has imposed a fundamental limitation to system performance for 20 years.”

Sally McKee, 2015.
Another Trip to the Wall: How Much Will Stacked DRAM Benefit HPC?

 

Today, we are hitting the memory wall harder than ever — and most of us aren’t even aware of it.

Objectives

This course is designed to help you determine whether your application has a memory-related performance problem and what you can do about it.

We will begin with a brief overview of memory system fundamentals. This background will help you understand the hardware aspects of the memory problem — and why the situation is unlikely to improve in the near future. We’ll also discuss memory system performance metrics such as latency and bandwidth, and how they interact. To improve your application’s use of the memory system, you first need to understand how that system actually works.

The main sessions of the course will introduce tools that can help you analyze memory-related performance issues in your codes:

  • Roofline models
  • Memory stress (Mess) framework  
  • CPI stack (TopDown)

For each tool, we’ll cover its key concepts, how it views the memory system, and how it can be used to analyze applications. We’ll explore both simple illustrative examples and real-life use cases. Each tool will be covered in a dedicated one-hour session.
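
As a rough illustration of the first tool in the list above, the basic Roofline bound can be sketched in a few lines of Python. This is a minimal sketch of the textbook formulation (attainable performance is the smaller of the compute peak and memory bandwidth times arithmetic intensity), not the course material itself. The machine parameters, kernel names, and intensities below are hypothetical values chosen only for illustration.

```python
def attainable_gflops(peak_gflops, peak_bandwidth_gbs, arithmetic_intensity):
    """Basic Roofline bound in GFLOP/s.

    arithmetic_intensity is in FLOPs per byte moved to/from memory.
    """
    memory_roof = peak_bandwidth_gbs * arithmetic_intensity
    return min(peak_gflops, memory_roof)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Arithmetic intensity (FLOP/byte) at which the two roofs meet."""
    return peak_gflops / peak_bandwidth_gbs


# Hypothetical machine: these numbers are assumptions, not measurements.
PEAK_GFLOPS = 2000.0
PEAK_BANDWIDTH_GBS = 200.0

# Hypothetical kernels with assumed arithmetic intensities (FLOP/byte).
kernels = [
    ("low-intensity kernel", 0.1),
    ("medium-intensity kernel", 0.5),
    ("high-intensity kernel", 32.0),
]

ridge = ridge_point(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
for name, intensity in kernels:
    bound = attainable_gflops(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS, intensity)
    regime = "memory-bound" if intensity < ridge else "compute-bound"
    print(f"{name}: at most {bound:.1f} GFLOP/s ({regime})")
```

A kernel whose arithmetic intensity lies below the ridge point is limited by memory bandwidth in this model, which is the kind of memory-related problem the course aims to help you identify.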

We’ll conclude the course with two advanced topics. The first is performance prediction for future memory systems, including practical guidelines on how to estimate potential performance gains for your applications on systems using more advanced memory technologies. The second is the motivation and challenges behind heterogeneous memory systems. Composed of devices with different capacities and performance characteristics, these systems aim to combine the best features of each component. However, over the past decade, heterogeneous memory systems have repeatedly demonstrated that complex architectures that are difficult to program often fall behind simpler ones that are easy to use.

 

Requirements

  • Basic background in computer architecture (undergraduate level).
  • Basic programming skills.
  • Some familiarity with the Roofline model, the Mess framework, and the CPI stack is desirable but not mandatory.

Learning Outcomes

By the end of the course, participants will:

  • Gain historical and technological insight into why the memory wall has persisted for more than 30 years — and why it’s here to stay.
  • Understand the structure and behavior of modern memory systems and their impact on application performance.
  • Learn what to expect from key memory profiling tools — Roofline models, Mess framework, and CPI stacks — and how to interpret their results.
  • Receive pointers to advanced, publicly available materials for deeper exploration.
  • Be introduced to advanced topics in memory system design and efficient memory usage.
Vertex building
Room SV 208

For further details and practical information, such as local transport and venue, please visit the local course page: https://www.bsc.es/education/training/bsc-training

Registration
Registration for this event is currently open.