This hands-on tutorial introduces the Heat library, which is designed to scale Python-based array computing and data science workflows to distributed and GPU-accelerated environments. Heat offers a familiar NumPy-like API while distributing memory-intensive operations using PyTorch and mpi4py.
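As a taste of the programming model described above, here is a minimal sketch (not taken from the course material) of the NumPy-like API. It assumes a working Heat installation and an MPI launcher such as mpirun, and uses calls like ht.arange and ht.sin together with the split parameter from Heat's documented public API.

# Run with, e.g.:  mpirun -n 4 python heat_intro.py
import heat as ht

# Create a 1-D array distributed along axis 0; each MPI process
# holds a contiguous chunk of the global array as a local PyTorch tensor.
x = ht.arange(16, dtype=ht.float32, split=0)

# Element-wise operations follow NumPy semantics and run on the local chunks.
y = ht.sin(x) + 2.0

# Reductions communicate via MPI and yield a result valid on every process.
total = y.sum()

# Print from rank 0 only to avoid duplicated output.
if x.comm.rank == 0:
    print("global sum:", total)

The same script runs unchanged on a laptop with a single process or on a multi-node allocation; only the launcher invocation changes.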

Topics covered include:

  • Heat Fundamentals: Get started with distributed arrays (DNDarrays), distributed I/O, data decomposition schemes, and array operations (a short sketch of the decomposition model follows this list).
  • Key Functionalities: Explore Heat's multi-node linear algebra, statistics, signal processing, and machine learning capabilities.
  • DIY Development: Learn how to use Heat's infrastructure to build your own multi-node, multi-GPU capable research applications.
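
To make the data decomposition topic from the first bullet concrete, the following hedged sketch illustrates how the split attribute partitions a DNDarray across MPI processes; names such as lshape, resplit_, ht.random.rand, and ht.matmul come from Heat's documented API rather than from this announcement.

import heat as ht

# 8 x 6 matrix distributed row-wise (split=0): each process owns a block of rows.
a = ht.zeros((8, 6), split=0)
print(f"rank {a.comm.rank}: global shape {a.shape}, local shape {a.lshape}")

# Redistribute in place to a column-wise decomposition (split=1);
# this moves data between processes via MPI.
a.resplit_(axis=1)
print(f"rank {a.comm.rank}: local shape after resplit {a.lshape}")

# An array created with split=None stays replicated on every process.
b = ht.random.rand(6, 1)
c = ht.matmul(a, b)  # distributed matrix product from Heat's linear algebra module

Choosing the split axis to match the access pattern of later operations keeps communication low; redistributions like resplit_ are possible but involve data movement.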

Prerequisites:

Participants should bring a laptop and have experience with Python and its scientific ecosystem (e.g., NumPy, SciPy). A basic understanding of MPI is helpful but not required.

Target audience:

Researchers and Research Software Engineers (RSEs) working with large datasets that exceed the memory of a single machine. HPC practitioners who support these scientists or may be interested in contributing to the project are also welcome.

Language:

The course will be held in English.