Course/Event Essentials
Training Content and Scope
Other Information
Supervised training of large networks requires large labeled datasets, which in turn incur high computational costs. While active deep learning practitioners primarily develop and train their networks on local computing devices, the increasing complexity of networks creates an urgent need to create, train, and test models on clusters.
In this workshop, we give an overview of the basics of Docker and Singularity. (Working knowledge of Singularity, as covered in the UPPMAX workshop on Singularity, is desirable.) We will cover distributed training with the TensorFlow and Horovod frameworks on a supercomputer. We will also show how to use Singularity containers together with TensorFlow and Horovod to scale up an AI application.
The workshop will be held entirely online via Zoom.