This course, held over two half-days, covers parallel I/O with a special focus on portable data formats. It will introduce the HDF5 and NetCDF (NetCDF4 and PnetCDF) library interfaces, and hands-on exercises (in C/C++ or Fortran) will allow participants to test and understand their usage immediately. Performance hints, optimization potential, and best practices for I/O will be discussed in detail throughout the course.
Numerical simulations conducted on current HPC systems face an ever-growing need for scalability, pushing the limits on the size and properties of what can be accurately simulated. As a result, ever larger data sets have to be processed, whether reading input data or writing results. Serial approaches to handling I/O in a parallel application will dominate its run time on massively parallel systems, leaving many computing resources idle during those serial I/O phases.
In addition to the need for parallel I/O, input and output data are often processed on different and possibly heterogeneous platforms. Converting between data representations can impose a high maintenance burden when different platforms need different formats. Portable, self-describing data formats such as HDF5 and NetCDF can help to solve these problems.
Type of methodology: Combination of lecture and hands-on
Participants receive the certificate of attendance: If requested
Paid training activity for participants: Yes, for some only
Participants' prerequisite knowledge: C/C++ or Fortran