Course/Event Essentials

Event/Course Start

Thu, 09/11/2023 - 09:00 CET

Event/Course End

Thu, 09/11/2023 - 13:00 CET

Event/Course Format

In person

Live (synchronous)

Primary Event/Course URL

https://eurocc-netherlands.nl/calendar/…

Venue Information

Country: Netherlands
Venue Details: Click here

Training Content and Scope

Scientific Domain

Not scientific domain specific

Technical Domain

Not technical domain specific

Level of Instruction

Beginner

Intermediate

Advanced

Other

Sector of the Target Audience

Research and Academia

Industry

Public Sector

Other (general public...)

HPC Profile of Target Audience

Application Users

Application Developers

Data Scientists

System Administrators

Language of Instruction

English

Other Information

Organiser

SURF

Supporting Project(s)

EuroCC2/CASTIEL2

Event/Course Description

If you need to perform many calculations, or analyses that are too big for your own system, clusters and supercomputers provide the computing power you need. In this course, you will learn to work with the national supercomputer Snellius.


What will you learn in this course?

This course continues the first Introduction to Supercomputing course. It takes a deeper, hands-on look at using supercomputers, with a particular focus on efficiency and good practices.

The outline of this session includes the following modules:

Fundamentals of performance analysis. This introductory technical presentation covers hybrid high-performance systems, describing their architecture and configuration at an abstract level. The goal is to build an understanding of HPC complexity before turning to performance analysis models and why they matter. Special focus is given to the Roofline model.
- Abstract modeling of hybrid supercomputers. Presents an abstract modeling approach for hybrid supercomputers that condenses their complexity into three core parameters: peak performance, memory, and network bandwidth.
- Performance analysis. Explores performance analysis, starting with an overview of various models and then going into the specifics of the Roofline model.
- The Roofline model. Describes the Roofline model and presents its practical application through explanations and demonstrations.
File systems. This practical session covers the correct usage of file systems on HPC systems, especially on Snellius.
Slurm hybrid jobs. Slurm, a widely used job scheduler for High-Performance Computing (HPC) systems, was introduced at a basic level in previous sections. This module goes further into the resource allocation parameters for hybrid shared- and distributed-memory jobs.
- Nodes, cores, and tasks. This segment covers the fundamental concepts of nodes, cores, and tasks, and the role each plays on an HPC system.
- Bindings. This segment explains bindings, that is, how tasks are linked to specific resources, to deepen participants’ understanding of resource allocation mechanisms.
- Hands on. We will execute the vector addition kernel with multiple configurations using a set of scripts.
QCG PilotJob. In some cases, users have to run a large number of lightweight cases. However, supercomputer nodes are powerful and can only be allocated in relatively large partitions. For instance, the smallest allocation possible on Snellius is 1/4 of a node: 32 cores and 64 GB. Job concurrency is a common strategy for launching many light jobs efficiently on such large partitions.
- Fundamentals of job concurrency. This segment covers the principles behind job concurrency: running multiple smaller jobs simultaneously within one larger allocated partition. The objective is to use resources efficiently when light tasks run on nodes designed for heavier workloads.
- Hands on QCG PilotJob. This practical session gives participants hands-on experience with the QCG PilotJob framework, using job concurrency to launch and manage multiple lightweight jobs within a large node partition.
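
The job concurrency idea from the last module can be sketched in a few lines of Python. This is a minimal illustration only, and it does not use QCG PilotJob or its API: it swaps in the standard library's `concurrent.futures` and `subprocess` modules instead. The names `run_case` and `run_all`, the placeholder workload, and the assumption that each case needs a single core are all hypothetical. The limit of 32 matches the smallest Snellius allocation mentioned above.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Cores in the smallest Snellius allocation cited above (1/4 of a node).
CORES_IN_ALLOCATION = 32


def run_case(case_id: int) -> int:
    """Run one hypothetical single-core case as a separate process."""
    # Placeholder workload: a real case would start your own executable here.
    cmd = [sys.executable, "-c", f"print({case_id} ** 2)"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return int(result.stdout)


def run_all(case_ids, max_concurrent=CORES_IN_ALLOCATION):
    """Run every case, keeping at most `max_concurrent` running at once."""
    # The threads only wait on child processes, so they act as launchers;
    # the actual work happens in the separate processes.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # pool.map returns results in the same order as case_ids.
        return list(pool.map(run_case, case_ids))


if __name__ == "__main__":
    results = run_all(range(100))
    print(f"{len(results)} cases finished")
```

Run inside a single 32-core allocation, a script like this would keep up to 32 light cases active at a time instead of submitting 100 separate jobs. A dedicated framework adds what this sketch leaves out, such as per-task resource requirements and tasks spanning several cores or nodes.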

Prerequisites

Participation in the course Introduction to Supercomputing, Part I

The language of instruction is English

Location

This course takes place at the VU Campus
De Boelelaan 1081, Amsterdam – Room WN-C203/C255

View the floor plan here

Introduction Training Supercomputing Part 2