Training Content and Scope
R is a popular and powerful programming language for data analysis and graphics, used in many research domains. The Leibniz Supercomputing Centre (LRZ) addresses the needs of R users by supporting several ways of working with R on LRZ systems.
For one, it hosts an RStudio Server web application as a frontend to the LRZ AI Systems. This is an easy-to-use and powerful interactive platform for data analytics, machine learning and AI projects. Additionally, R can be used directly on the high performance computing (HPC) systems operated by LRZ: the Linux Cluster and SuperMUC-NG.
In this course, the different ways of using R for data analytics, machine learning and AI projects at LRZ will be demonstrated and practiced in hands-on sessions. Guidelines and best-practice examples for running R applications efficiently and productively on the various systems will be provided. Special attention will be paid to different ways of parallelizing R code in order to utilize the various LRZ cluster systems. There will be breaks during the session.
There will be three content blocks of roughly one and a half hours each (B = Beginner, I = Intermediate, A = Advanced content):
- The LRZ AI Systems and RStudio Server (B) / (I)
- RStudio Server for AI projects (B) / (I)
- R and the LRZ AI Systems: R package management and containerization (B) / (I)
- R on the LRZ Linux Cluster: environment modules, R package management (B) / (I)
- Slurm Workload Manager, interactive session, job processing (B) / (I)
- Parallelization Using R: Overview and resources (I)
- Pleasingly parallel workloads (B) / (I)
- Introduction to worker queue scenario/weak coupling (rredis/doRedis, batchtools, clustermq) (I) / (A)
- Shared memory parallelization (parallel/doParallel, foreach) (I) / (A)
- Message passing (Rmpi/doMPI) (I) / (A)
- Futures/Promises (parallel, future, doFuture) (A)
- Workflow management (targets) (A)