Training Content and Scope
To increase the efficiency of our users' applications on Hawk, HLRS, HPE, and AMD offer this workshop to improve the node-level performance, I/O, and scaling of participants' codes. This allows users to increase both the volume and the quality of their scientific output while keeping costs (in terms of core hours) constant.
In this workshop, you will tweak compiler flags, environment settings, and similar parameters to make your code run faster. In our experience from previous workshops, such "low-hanging fruit" can yield a significant speedup with only little effort.
Furthermore, you will analyze the runtime behavior of your code, locate bottlenecks, and design, discuss, and implement potential solutions. All categories of bottlenecks (CPU, memory subsystem, communication, and I/O) will be addressed as needed. Ideally, these steps will be repeated several times in order to tackle multiple bottlenecks.
In addition, an introduction to using Ray on Hawk-AI nodes to scale Python and AI workloads will be given. A real-world example combining these with HPC simulations, as well as a use case featuring distributed data preparation, training, and hyperparameter optimization, will give you an idea of what you could do within your own project.
Every attending group will be supported throughout the workshop by a dedicated member of the HLRS user support group. In addition, HPE and AMD specialists will be available to help with issues specific to, e.g., MPI or the processor.
To make it easy for you to attend, we have decided to offer this workshop in a hybrid format. Besides meeting in person at HLRS, we will also set up breakout rooms in a Zoom session, enabling remote participants to communicate with support staff, share screens, and remote-control applications, thus providing the same options for interaction as meeting in person.