AI Training Series - Introduction to the LRZ AI Systems

Course/Event Essentials

Event/Course Start
Event/Course End
Event/Course Format
Mixed
Live (synchronous)

Venue Information

Country: Germany

Training Content and Scope

Scientific Domain
Level of Instruction
Beginner
Intermediate
Sector of the Target Audience
Research and Academia
Language of Instruction

Other Information

Organiser
Event/Course Description

The aim of this course is to give an overview of the LRZ AI Systems and to provide participants with the knowledge and skills needed to use them efficiently. The course consists of mini lectures, demos and hands-on sessions (breaks included) covering the following topics:

  • Resources overview of the LRZ AI Systems
  • Fundamentals of Deep Learning
  • Distributed Training of Neural Networks

The content is organised into three blocks, with roughly one hour each for the first two and two and a half hours for the third (B = Beginner, I = Intermediate, A = Advanced content):

  • Overview of LRZ AI Systems (1h)
    • Hardware overview (B)
    • Access mode for the different resources (B)
    • Execution Mode (software stack) (B) + (I)
  • Fundamentals of Deep Learning (1h)
    • Introduction to Neural Networks (B)
    • Training Neural Networks (B)
    • Introduction to Convolutional Neural Networks (B)
    • Introduction to Transformers (B)
    • Exercises: Training Convolutional Neural Networks and Transformers on GPUs (I)
  • Distributed Deep Learning Training Part I (1h)
    • Motivation for Distributed Deep Learning Training (B)
    • Overview of Techniques for Distributed Deep Learning Training (B)
  • Distributed Deep Learning Training Part II (1.5h)
    • Data Parallelism (I)
    • Exercise: Data Parallelism with Horovod (I)
    • Exercise: Data Parallelism with PyTorch Distributed Data Parallel (I)
    • Model Parallelism - Pipeline Parallelism and Tensor Parallelism (A)
    • Exercise: Pipeline Parallelism with PyTorch Pipe (A)