Synergific | Apache Spark and Scala

Live Online (VILT) & Classroom Corporate Training Course

Apache Spark is a big data processing framework. It is popular because it is fast, easy to use, and offers sophisticated tools for data analysis. Its built-in modules for streaming, machine learning, SQL, and graph processing make it useful across diverse industries.

Expert-Led VILT & Classroom
Hands-On CloudLabs
Certification Voucher Available

Overview

The Apache Spark and Scala course is designed to help you become proficient in Apache Spark development. It covers Apache Spark Core, the motivation for Apache Spark, Spark internals, RDDs, Spark SQL, Spark Streaming, MLlib, and GraphX, which together form the key constituents of the course.

Objectives

At the end of the Apache Spark & Scala training course, participants will:

Master the concepts of the Apache Spark framework
Understand Spark internals and RDDs, and use Spark's API and Scala functions to create and transform RDDs
Master RDD combiners, Spark SQL, SparkContext, Spark Streaming, MLlib, and GraphX

Prerequisites

Hadoop Basics

Course Outline

Introduction

Overview of Hadoop
Architecture of HDFS & YARN
Overview of Spark version 2.2.0
Spark Architecture
Spark Components
Comparison of Spark & Hadoop
Installation of Spark v2.2.0 on 64-bit Linux

Spark Core

Exploring the Spark shell
Creating the Spark Context
Operations on Resilient Distributed Datasets (RDDs)
Transformations & Actions
Loading Data and Saving Data

Spark SQL & Hive SQL

Introduction to SQL Operations
SQL Context
Data Frame
Working with Hive
Loading Partitioned Tables
Processing CSV, JSON, and Parquet files

Scala Programming

Introduction to Scala
Features of Scala
Scala vs Java Comparison
Data types
Data Structures
Arrays
Literals
Logical Operators
Mutable & Immutable variables
Type inference
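
As a rough illustration of a few of the topics listed above (immutable and mutable variables, type inference, arrays, and logical operators), here is a minimal Scala sketch. The object and method names (`BasicsDemo`, `total`, `inRange`) are hypothetical and chosen only for this example; they are not part of the course material.

```scala
object BasicsDemo {
  // Immutable binding: reassigning `greeting` would not compile.
  val greeting = "Hello, Scala" // type inferred as String

  // Sum an array using a mutable local accumulator.
  def total(values: Array[Int]): Int = {
    var sum = 0 // mutable variable, type inferred as Int
    for (v <- values) sum += v
    sum
  }

  // Logical operators combine Boolean expressions.
  def inRange(x: Int, low: Int, high: Int): Boolean =
    x >= low && x <= high

  def main(args: Array[String]): Unit = {
    val numbers = Array(1, 2, 3, 4) // type inferred as Array[Int]
    println(greeting)
    println(total(numbers))
    println(inRange(3, 1, 5))
  }
}
```

Preferring `val` and reaching for `var` only in small local scopes, as in `total`, is a common Scala convention rather than a language requirement.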

Transforming data with Relational Operators

Scala Functions

OOP vs Functions
Anonymous functions
Recursive functions
Call-by-name
Currying
Conditional statements
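
The function topics above can be sketched in a few lines of plain Scala. This is only an illustrative example under assumed names (`FunctionsDemo`, `square`, `factorial`, `runIf`, `add`, `addFive` are all hypothetical); it shows an anonymous function, a tail-recursive function, a call-by-name parameter, and a curried function.

```scala
import scala.annotation.tailrec

object FunctionsDemo {
  // Anonymous function (lambda) bound to a value.
  val square: Int => Int = x => x * x

  // Recursive function; @tailrec asks the compiler to verify
  // that the inner loop is tail-recursive.
  def factorial(n: Int): BigInt = {
    @tailrec
    def loop(i: Int, acc: BigInt): BigInt =
      if (i <= 1) acc else loop(i - 1, acc * i)
    loop(n, BigInt(1))
  }

  // Call-by-name parameter: `body` is evaluated only when
  // `condition` is true.
  def runIf(condition: Boolean)(body: => Int): Option[Int] =
    if (condition) Some(body) else None

  // Curried function: arguments are supplied one list at a time.
  def add(a: Int)(b: Int): Int = a + b

  // Partially applying the first argument list yields a new function.
  val addFive: Int => Int = add(5)

  def main(args: Array[String]): Unit = {
    println(square(4))
    println(factorial(5))
    println(runIf(condition = false)(sys.error("never evaluated")))
    println(addFive(3))
  }
}
```

Because `body` is passed by name, the `sys.error` call in `main` is never executed when the condition is false.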

Scala Collections

List
Map
Sets
Options
Tuples
Mutable collection
Immutable collection
Iterating
Filtering and counting
Group By
Flat Map
Word count
File Access
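
Several of the collection topics above come together in the classic word count exercise. The sketch below uses only the Scala standard library (no Spark), and the names `CollectionsDemo` and `wordCount` are hypothetical; it touches `List`, `Map`, `Set`, `Option`, tuples, `flatMap`, filtering, and `groupBy`.

```scala
object CollectionsDemo {
  // Split lines into lowercase words, then count occurrences per word.
  def wordCount(lines: List[String]): Map[String, Int] =
    lines
      .flatMap(_.toLowerCase.split("\\s+")) // one line becomes many words
      .filter(_.nonEmpty)                   // drop empty tokens
      .groupBy(identity)                    // word -> all its occurrences
      .map { case (word, occurrences) => word -> occurrences.size }

  def main(args: Array[String]): Unit = {
    val lines = List("spark and scala", "scala and kafka")
    val counts = wordCount(lines)

    // Iterate over the map in alphabetical key order.
    counts.toSeq.sortBy(_._1).foreach { case (w, n) => println(s"$w: $n") }

    // Map lookups return an Option; a missing key gives None.
    val missing: Option[Int] = counts.get("hadoop")
    println(missing.getOrElse(0))

    // A tuple pairs values of different types.
    val pair: (String, Int) = ("scala", counts("scala"))
    println(pair)

    // The key set holds each distinct word exactly once.
    val unique: Set[String] = counts.keySet
    println(unique.size)
  }
}
```

The same `flatMap` and grouping pattern carries over to Spark RDDs, although the Spark API itself is not shown here.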

Scala Object Oriented Programming

Classes, Objects & Properties
Inheritance
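
A small, hedged sketch of classes, properties, and inheritance in Scala follows. The shape hierarchy (`Shape`, `Rectangle`, `Square`) and the wrapper object `OopDemo` are hypothetical names invented for illustration.

```scala
object OopDemo {
  // Abstract base class: `name` is a read-only property set by the
  // constructor, and `area` must be implemented by subclasses.
  abstract class Shape(val name: String) {
    def area: Double
    def describe: String = s"$name with area $area"
  }

  // Concrete subclass with two properties.
  class Rectangle(val width: Double, val height: Double)
      extends Shape("rectangle") {
    override def area: Double = width * height
  }

  // Further specialization: reuses Rectangle's area and extends
  // the inherited description via `super`.
  class Square(val side: Double) extends Rectangle(side, side) {
    override def describe: String = "square: " + super.describe
  }

  def main(args: Array[String]): Unit = {
    val shapes: List[Shape] = List(new Rectangle(2.0, 3.0), new Square(2.0))
    shapes.foreach(s => println(s.describe))
  }
}
```

Calling `describe` through the `Shape` type dispatches to the most specific override, which is the behavior the inheritance topic is concerned with.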

Spark Submit

Maven build tool implementation
Build Libraries
Create Jar files
Spark-Submit

Spark Streaming

Overview of Spark Streaming
Architecture of Spark Streaming
File streaming
Twitter Streaming

Kafka Streaming

Overview of Kafka Streaming
Architecture of Kafka Streaming
Kafka Installation
Topic
Producer
Consumer
File streaming
Twitter Streaming

Spark MLlib

Overview of Machine Learning Algorithms
Linear Regression
Logistic Regression

Spark GraphX

GraphX overview
Vertices
Edges
Triplets
PageRank
Pregel

Performance Tuning

On-heap and off-heap memory tuning
Kryo Serialization
Broadcast Variable
Accumulator Variable
DAG Scheduler
Data Locality
Checkpointing
Speculative Execution
Garbage Collection

Project Planning, Monitoring & Troubleshooting

Master – Driver Node capacity
Slave – Worker Node capacity
Executor capacity
Executor core capacity
Project scenario and execution
Out-of-memory error handling
Master logs, Worker logs, Driver logs
Monitoring Web UI
Heap memory dump

Testimonials

The Synergific Software team has been very supportive, and working with them has been the best decision we could ever have made. They are just a call away. You guys are AWESOME. Thank you, and keep up the good work!

Shamsudeen Bawa

Vice President, J.P. Morgan, CIS, USA

Synergific Software has been of great help and I plan to continue to use your services in the future for my business needs.

Farhan Hafiz

Data Architect, Fiserv

I think Synergific Software is great. I liked that it was hassle-free and easy to set up. Again, it's a great feature for a fast and cheap setup, which gives me peace of mind, as I now have a terms of use agreement.

Dr. Sahdev Singh

Under Secretary, Ministry of Law & Justice, Govt. of India

I liked using Synergific Software very much. I thought the website was easy to navigate and the instructions for generating the terms were clear. I even recommended you on a Facebook group I am a member of.

M Chikanna Swamy

Director & Learning Head, Mindtree

The Synergific Training Advantage

Expert Instructors

Hands-On CloudLabs

Certification Vouchers

Flexible Scheduling

