Big Data Analytics and Stream Processing on Apache Spark

Course/Event Essentials

Event/Course Start
Event/Course End
Event/Course Format
Mixed
Live (synchronous)

Venue Information

Country: Croatia

Training Content and Scope

Level of Instruction
Intermediate
Sector of the Target Audience
Research and Academia
Industry
Public Sector
HPC Profile of Target Audience
Application Developers
Data Scientists
Language of Instruction

Other Information

Organiser
Supporting Project(s)
EuroCC2/CASTIEL2
Event/Course Description

Companies collect enormous amounts of data about their customers, suppliers, and operations, while billions of connected devices on the Internet of Things (IoT), as well as individuals, produce vast amounts more. As a result, the amount of newly created data has grown exponentially for more than a decade. We define Big Data as data that contains greater variety, arrives in increasing volumes, or is produced at higher velocity. Many open-source platforms have been developed in recent years to address these challenges using cluster computing, such as Apache Hadoop YARN (MapReduce2), Apache Spark, Apache Flink, and Apache Storm. Probably the most popular among them are Apache Hadoop YARN and Apache Spark. This workshop briefly presents the Apache Spark platform and then demonstrates its analytics and stream processing capabilities. Afterwards, in a hands-on session, attendees will learn to use Apache Spark to process both unstructured and structured Big Data.

Please use this link to register: https://forms.gle/18yqEHikok4DNonr7