Machine Learning on HPC systems (1 day)

Course/Event Essentials

Event/Course Start
Event/Course End
Event/Course Format
In person
Live (synchronous)

Venue Information

Country: Netherlands
Venue Details: Click here

Training Content and Scope

Scientific Domain
Level of Instruction
Beginner
Intermediate
Sector of the Target Audience
Research and Academia
Industry
Public Sector
HPC Profile of Target Audience
Application Users
Application Developers
Data Scientists
System Administrators
Language of Instruction

Other Information

Organiser
Event/Course Description

You want to train a neural network for your research project, and have just gotten access to a high-performance computing cluster with a lot of powerful hardware. Great! But how can you make sure that you use these (expensive!) resources effectively? This course teaches you how to get the most out of the computational budget you were granted.

In this course you will learn:

  • How to set up your software environment, and why the preinstalled software modules are useful;
  • How file I/O can limit your training speed, and how to overcome that;
  • About the technical capabilities of modern CPUs and GPUs (reduced-precision data types, vector/matrix instructions);
  • How to find bottlenecks in your code by creating a profile (with the PyTorch profiler);
  • How to use multiple CPUs or GPUs in a single training run (parallel computing for deep learning).

Who?

Machine learning researchers whose requirements for training neural networks have outgrown their local computer, and who are using or planning to use a high-performance computing cluster (such as Snellius) to train their models.

Prerequisites

  • Basic knowledge of PyTorch, TensorFlow or a similar framework;
  • Basic knowledge of Python programming. Some experience with Jupyter notebooks is desirable, but not essential;
  • Basic knowledge of using a high-performance computing cluster (see our course ‘Introduction to cluster and supercomputing’);
  • Specifically: you should know how to submit a job and how to interact with the module environment.

Sign up: https://events.surf.nl/kort4/open/8e6e2802-7165-479f-ae4e-be242555961e

Please note: registration closes on May 15!