Data Science
The student will learn the five basic parts of data analysis: data wrangling, cleaning, and sampling; data management; data analysis; prediction using statistical methods; and data visualization.
Upon completing the course, the student will understand complex distributed and parallel systems and be able to develop and optimise distributed applications.
The course introduces students to the theory and application of data visualization. Upon completion, the student will be able to explain the concept of data visualization, choose and implement visualization algorithms, and use data visualization tools.
The goal of the course is to round out students' knowledge of intelligent systems, from data pre-processing to validation of the built system. Students will be able to build an intelligent system from start to finish for real, domain-specific problems.
The course introduces students to data mining and machine learning algorithms for analyzing massive amounts of data. The emphasis is on distributed platforms and on MapReduce as a tool for creating parallel algorithms that can process large amounts of data.
The course introduces the development of middleware systems. Principles of distributed systems: communication, processing, naming, consistency, replication, and security. Application of these concepts in different distributed systems.
Introduction to methods for identifying valid, novel, useful, and understandable patterns in data. Data preprocessing. Induction of predictive models from data: classification, regression, and probability estimation. Discovery of clusters and association rules.
Study of the algorithms and programming techniques for the newest parallel platforms with shared and distributed memory. The student will learn both the theoretical and the practical (programming) components.
Introduction to virtualization as a paradigm for creating virtual computer systems through software virtualization of hardware components. Analysis of the different aspects of virtualization, the technologies and techniques involved in the process, and the advantages and disadvantages introduced by using virtualization.
The aim of the course is for the students to become familiar with the basics of modern machine learning techniques.