Course/Event Essentials
Training Content and Scope
Other Information
Companies collect enormous amounts of data about their customers, suppliers, and operations, while billions of connected devices on the Internet of Things (IoT), as well as individuals, produce vast amounts of additional data. As a result, the amount of newly created data has grown exponentially for more than a decade. We define Big Data as data that contains greater variety, arrives in increasing volumes, or is produced with higher velocity. Many open-source platforms have been developed in recent years to address these challenges using cluster computing, such as Apache Hadoop YARN (MapReduce2), Apache Spark, Apache Flink, and Apache Storm. Probably the most popular among them are Apache Hadoop YARN and Apache Spark. This workshop briefly presents the Apache Spark platform and then demonstrates its analytics and stream processing capabilities. After that, in a hands-on session, attendees will learn to use Apache Spark to process both unstructured and structured Big Data.
Please use this link to register: https://forms.gle/18yqEHikok4DNonr7