Big Data Hadoop Spark Developer (BDHS)

Live Online (VILT) & Classroom Corporate Training Course

The Big Data Hadoop training course teaches you the concepts of the Hadoop framework and how it is set up in a cluster environment, and prepares you for Cloudera's Big Data certification.

Expert-Led | VILT & Classroom | Hands-On CloudLabs | Certification Voucher Available

Overview

In this Big Data Hadoop course, you will learn the big data framework using Hadoop and Spark, including HDFS, YARN, and MapReduce. The course also covers Pig, Hive, and Impala for processing and analysing large datasets stored in HDFS, and Sqoop and Flume for data ingestion.

Objectives

At the end of BDHS training, participants will be able to understand:

The different components of the Hadoop ecosystem, such as Hadoop 2.7, YARN, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
Hadoop Distributed File System (HDFS) and YARN architecture
MapReduce and its characteristics, including advanced MapReduce concepts
Different types of file formats, Avro schemas, using Avro with Hive and Sqoop, and schema evolution
Flume, including its architecture, sources, sinks, channels, and configurations
HBase, its architecture and data storage, and the difference between HBase and an RDBMS
The common use cases of Spark and various interactive algorithms
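
As a rough illustration of the MapReduce model named in the objectives above, the following plain-Python sketch walks through the map, shuffle, and reduce phases for a word count. It is not Hadoop or Spark code and is not taken from the course material; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are assumptions made for illustration only.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, not data from the course.
    sample = ["big data hadoop", "hadoop spark", "spark spark"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps would run in parallel across many nodes, with the shuffle moving data between them; this sketch runs everything in one process purely to show the data flow.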

Prerequisites

There are no prerequisites for this course. However, it’s beneficial to have some knowledge of Core Java and SQL.

Course Outline

Introduction to Apache Hadoop and the Hadoop Ecosystem
  Apache Hadoop Overview
  Data Processing
  Introduction to the Hands-On Exercises

Apache Hadoop File Storage
  Apache Hadoop Cluster Components
  HDFS Architecture
  Using HDFS

Distributed Processing on an Apache Hadoop Cluster
  YARN Architecture
  Working With YARN

Apache Spark Basics
  What is Apache Spark?
  Starting the Spark Shell
  Using the Spark Shell
  Getting Started with Datasets and DataFrames
  DataFrame Operations

Working with DataFrames and Schemas
  Creating DataFrames from Data Sources
  Saving DataFrames to Data Sources
  DataFrame Schemas
  Eager and Lazy Execution

Analyzing Data with DataFrame Queries
  Querying DataFrames Using Column Expressions
  Grouping and Aggregation Queries
  Joining DataFrames

RDD Overview
  RDD Overview
  RDD Data Sources
  Creating and Saving RDDs
  RDD Operations

Transforming & Aggregating Data with RDDs
  Writing and Passing Transformation Functions
  Transformation Execution
  Converting Between RDDs and DataFrames
  Key-Value Pair RDDs
  Map-Reduce
  Other Pair RDD Operations

Working with Datasets in Scala
  Datasets and DataFrames
  Creating Datasets
  Loading and Saving Datasets
  Dataset Operations

Writing, Configuring, and Running Spark Applications
  Writing a Spark Application
  Building and Running an Application
  Application Deployment Mode
  The Spark Application Web UI
  Configuring Application Properties

Spark Distributed Processing
  Review: Apache Spark on a Cluster
  RDD Partitions
  Example: Partitioning in Queries
  Stages and Tasks
  Job Execution Planning
  Example: Catalyst Execution Plan
  Example: RDD Execution Plan

Structured Streaming
  Apache Spark Streaming Overview
  Creating Streaming DataFrames
  Transforming DataFrames
  Executing Streaming Queries
  Receiving Kafka Messages
  Sending Kafka Messages

Why enterprise teams pick Synergific.

Live Hands-On Cloud Labs

Hire-Train-Deploy

Four-Hour Solutions SLA

Voucher-Backed Certification

Authorised Curriculum, Local Delivery

Pick the format that fits your team.

Live Online (VILT)

Classroom

Self-Paced

Blended

How a course becomes measurable skill.

Discover & set goals

Curate the right path

Deliver hands-on training

Assess & mentor

Certify & apply on the job

What our clients say
