Data science with R and Python

Course/Event Essentials

Event/Course Start
Event/Course End
Event/Course Format
Online
Live (synchronous)

Venue Information

Country: Czech Republic

Training Content and Scope

Scientific Domain
Level of Instruction
Beginner
Intermediate
Sector of the Target Audience
Research and Academia
Industry
Public Sector
HPC Profile of Target Audience
Data Scientists
Language of Instruction

Other Information

Supporting Project(s)
PRACE
Event/Course Description

The R part of the course will focus on the basics of exploratory data analysis in R, the presentation of findings through visualization, and the basics of statistical and machine learning modelling. The course will cover the basic workflow of exploratory analysis using packages from the 'tidyverse'. These include packages for loading data, preprocessing, basic data exploration, and visualization. In the second part, we will work on the basics of modelling in R, starting with data preparation (missing data handling, one-hot encoding, etc.) and continuing with model training and model evaluation. The main tools in this part will be the packages 'caret' and 'xgboost'.

The Python part will introduce essential data science packages and demonstrate their use on real-world data analysis problems, showing how such problems can be tackled.
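
To give a flavour of the data preparation steps named in the description (missing data handling and one-hot encoding), here is a minimal sketch in plain Python. It is not course material: the records, the column names, and the helper functions impute_mean and one_hot are hypothetical, and it uses only the standard library, whereas in practice dedicated data science packages would normally do this work.

```python
from statistics import mean

# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 34, "city": "Brno"},
    {"age": None, "city": "Ostrava"},
    {"age": 50, "city": "Brno"},
]


def impute_mean(rows, column):
    """Replace missing (None) values in a numeric column with the column mean."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        # Copy every column except the categorical one being encoded.
        new = {k: v for k, v in r.items() if k != column}
        for c in categories:
            new[f"{column}_{c}"] = 1 if r[column] == c else 0
        encoded.append(new)
    return encoded


# Impute first, then encode, so the result contains numeric columns only.
prepared = one_hot(impute_mean(rows, "age"), "city")
for record in prepared:
    print(record)
```

Mean imputation is only one of several possible strategies for missing values; it is used here because it is the simplest to show.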