Training Content and Scope
With the ever-growing complexity of computer architectures, code optimisation has become the main route to keeping pace with hardware advancements and making effective use of current and upcoming High Performance Computing systems.
Have you ever asked yourself:
- Where are the performance bottlenecks of my application?
- What is the maximum speed-up achievable on the architecture I am using?
- Does my code scale well across multiple machines?
- Does my implementation match my HPC objectives?
In this workshop, we will discuss these questions and provide a unique opportunity to learn techniques, methods and solutions for improving code, enabling new hardware features and visualising the potential benefits of an optimisation process.
We will describe the latest microprocessor architectures and how developers can efficiently use modern HPC hardware, including SIMD vector units and the memory hierarchy. We will also touch upon exploiting intra-node and inter-node parallelism.
Attendees will be guided along the optimisation process through the incremental improvement of an example application. Through hands-on exercises, they will learn how to enable vectorisation using simple pragmas and more effective techniques such as changing data layout and alignment.
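As an illustration of the kind of change meant here, the following is a minimal sketch in C, not the workshop's actual example application. All names (`ParticleAoS`, `ParticlesSoA`, `shift_x_aos`, `shift_x_soa`, `demo_layouts_agree`) are hypothetical, and the 64-byte alignment is an assumption chosen to match a cache line and an AVX-512 vector register. It contrasts an array-of-structures loop, which reads memory with a stride, with a structure-of-arrays loop over aligned, contiguous data, both annotated with the OpenMP `simd` pragma.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ALIGNMENT 64 /* assumption: cache-line size / AVX-512 register width in bytes */

/* Array-of-structures: x, y, z of one particle sit next to each other,
   so a loop over x alone touches every third float (stride 3). */
typedef struct { float x, y, z; } ParticleAoS;

/* Structure-of-arrays: each coordinate is stored contiguously (unit stride). */
typedef struct { float *x, *y, *z; size_t n; } ParticlesSoA;

/* aligned_alloc (C11) expects a size that is a multiple of the alignment. */
static size_t round_up(size_t bytes) {
    return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
}

/* Strided access: the pragma asks for vectorisation, but the compiler
   has to use gathers/scatters or shuffles to honour it. */
void shift_x_aos(ParticleAoS *p, size_t n, float dx) {
    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        p[i].x += dx;
}

/* Unit-stride access on an aligned array: the aligned clause tells the
   compiler that x starts on an ALIGNMENT-byte boundary. */
void shift_x_soa(ParticlesSoA *p, float dx) {
    float *x = p->x;
    const size_t n = p->n;
    #pragma omp simd aligned(x : ALIGNMENT)
    for (size_t i = 0; i < n; ++i)
        x[i] += dx;
}

/* Runs both variants on the same input (n > 0) and returns 1 if the
   results are identical, 0 otherwise. */
int demo_layouts_agree(size_t n) {
    ParticleAoS *aos = malloc(n * sizeof *aos);
    ParticlesSoA soa = { NULL, NULL, NULL, n };
    size_t bytes = round_up(n * sizeof(float));
    soa.x = aligned_alloc(ALIGNMENT, bytes);
    soa.y = aligned_alloc(ALIGNMENT, bytes);
    soa.z = aligned_alloc(ALIGNMENT, bytes);
    assert(aos && soa.x && soa.y && soa.z);

    for (size_t i = 0; i < n; ++i) {
        aos[i].x = soa.x[i] = (float)i;
        aos[i].y = soa.y[i] = 0.0f;
        aos[i].z = soa.z[i] = 0.0f;
    }

    shift_x_aos(aos, n, 0.5f);
    shift_x_soa(&soa, 0.5f);

    int same = 1;
    for (size_t i = 0; i < n; ++i)
        if (aos[i].x != soa.x[i]) same = 0;

    free(aos);
    free(soa.x);
    free(soa.y);
    free(soa.z);
    return same;
}
```

Both loops compute the same result; only the memory access pattern differs. With GCC, for example, the pragmas take effect when compiling with `-O2 -fopenmp-simd`, and adding `-fopt-info-vec` prints a report of which loops were vectorised. Without an OpenMP flag the pragmas are simply ignored and the code still runs correctly.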
The work is guided by hints from compiler reports and by profiling tools such as Intel® Advisor, Intel® VTune™ Amplifier, Intel® Application Performance Snapshot and LIKWID, which are used to investigate and improve the performance of an HPC application.
You can ask the lecturers in the Q&A session about how to optimise your code. Please provide a description of your code in the registration form.
Learning Goals
Through a sequence of simple, guided examples of code modernisation, attendees will develop an awareness of the features of multi- and many-core architectures that are crucial for writing modern, portable and efficient applications.
A special focus will be dedicated to scalar and vector optimisations for the Intel® Xeon® Scalable processor, code-named Skylake, utilised in the SuperMUC-NG machine at LRZ.
The workshop interleaves lecture and practical sessions.