Name: Intro to Big Data and Hadoop
Availability: InStock

Intro to Big Data and Hadoop

Live Online (VILT) & Classroom Corporate Training Course

Hadoop makes it easier to make sense of huge volumes of data and to use frameworks that turn that data into actionable insights, so Hadoop & Big Data training is in great demand.

Expert-Led VILT & Classroom | Hands-On CloudLabs | Certification Voucher Available

Overview

This training course will help participants to gain the skills they need to store, manage, process, and analyze massive amounts of structured and unstructured data to extract meaningful insights.

Objectives

At the end of the Intro to Big Data and Hadoop training course, participants will:

Understand what Big Data is and gain in-depth knowledge of Big Data Analytics concepts and tools.
Learn to process large data sets with Big Data tools to extract information from disparate sources.
Learn about MapReduce, Hadoop Distributed File System (HDFS), and YARN, and how to write MapReduce code.
Learn best practices and considerations for Hadoop development, as well as debugging techniques.
Learn how to use Hadoop frameworks such as Apache Pig™, Apache Hive™, Sqoop, and Flume, among other projects.
Perform real-world analytics by learning advanced Hadoop API topics with e-courseware.

Prerequisites

Before undertaking the Big Data and Hadoop course, participants are advised to have basic knowledge of a programming language such as Python, Scala, or Java, and a working understanding of SQL and RDBMS.

Course Outline

Introduction

Understanding Big Data
Types of Big Data
Difference between Traditional Data and Big Data
Introduction to Hadoop
Distributed Data Storage in Hadoop: HDFS and HBase
Hadoop Data Processing and Analysis Services: MapReduce, Spark, Hive, Pig, and Storm
Data Integration Tools in Hadoop
Resource Management and Cluster Management Services

Big Data Ecosystem

Need for Hadoop in Big Data
Understanding Hadoop And Its Architecture
The MapReduce Framework
What is YARN?
Understanding Big Data Components
Monitoring, Management and Orchestration Components of Hadoop Ecosystem
Different Distributions of Hadoop
Installing Hadoop 3

Hadoop Cluster Configuration

Hortonworks sandbox installation & configuration
Hadoop Configuration files
Working with Hadoop services using Ambari
Hadoop Daemons
Browsing Hadoop UI consoles
Basic Hadoop Shell commands
Eclipse & WinSCP installation & configuration on VM

Big Data Processing with MapReduce

Running a MapReduce application in MR2
MapReduce Framework on YARN
Fault tolerance in YARN
Map, Reduce & Shuffle phases
Understanding Mapper, Reducer & Driver classes
Writing MapReduce WordCount program
Executing & monitoring a MapReduce job
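
As a rough illustration of the Map, Shuffle, and Reduce phases and the Mapper, Reducer, and Driver roles listed above, the sketch below runs a WordCount in a single Python process. It is not Hadoop code: real jobs are typically written against the Hadoop Java API and run on YARN, and the function names here (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical names chosen for the example.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper role: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle role: group all emitted values by key before reducing."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer role: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Driver role: wire the three phases together for a list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in sorted(grouped.items()))


counts = word_count(["big data big insights", "hadoop stores big data"])
print(counts)
```

In a real cluster the map calls run in parallel on different input splits and the shuffle moves data across the network; here everything is in memory, so only the data flow is shown.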

Batch Analytics with Apache Spark

SparkSQL and DataFrames
DataFrames and the SQL API
DataFrame schema
Datasets and encoders
Loading and saving data
Aggregations
Joins

Real Time Analytics with Apache Spark

A short introduction to streaming
Spark Streaming
Discretized Streams
Stateful and stateless transformations
Checkpointing
Operating with other streaming platforms (such as Apache Kafka)
Structured Streaming
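
The difference between stateless and stateful transformations over micro-batches can be sketched without Spark. The example below is a plain-Python stand-in, not the Spark Streaming API: micro_batches, stateless_upper, and update_counts are hypothetical helpers, and fixed-size batches stand in for the time intervals a discretized stream would use.

```python
def micro_batches(events, batch_size):
    """Cut a sequence of events into fixed-size batches (a stand-in for time intervals)."""
    for start in range(0, len(events), batch_size):
        yield events[start:start + batch_size]


def stateless_upper(batch):
    """Stateless transformation: the result depends only on the current batch."""
    return [event.upper() for event in batch]


def update_counts(state, batch):
    """Stateful transformation: fold the batch into totals carried over from earlier batches."""
    new_state = dict(state)
    for event in batch:
        new_state[event] = new_state.get(event, 0) + 1
    return new_state


events = ["click", "view", "click", "buy", "view", "click"]
running = {}   # the state a streaming engine would need to checkpoint
history = []   # snapshot of the state after each batch

for batch in micro_batches(events, batch_size=2):
    cleaned = stateless_upper(batch)
    running = update_counts(running, cleaned)
    history.append(dict(running))

print(history)
```

The running dictionary is the piece that must survive a failure, which is why stateful transformations are usually paired with checkpointing.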

Analysis using Pig

Background of Pig
Pig architecture
Pig Latin basics
Pig execution modes
Pig processing – loading and transforming data
Pig built-in functions
Filtering, grouping, sorting data
Relational join operators
Pig Scripting
Pig UDFs

Analysis using Hive Data Warehousing Infrastructure

Background of Hive
Hive architecture
Hive Query Language
Switching the Hive metastore from Derby to a MySQL database
Managed & external tables
Data processing – loading data into tables
Hive Query Language
Using Hive built-in functions
Partitioning data using Hive
Bucketing data
Hive Scripting
Using Hive UDFs
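
To make the partitioning and bucketing topics above more concrete, the sketch below shows how rows might be routed: a partition column selects a directory named column=value, and a hash of the bucketing column modulo the bucket count selects a bucket within it. This is an illustration only. The table columns, NUM_BUCKETS, and the use of zlib.crc32 as the hash are assumptions made for the example; Hive applies its own hash function.

```python
import zlib

NUM_BUCKETS = 4  # hypothetical bucket count chosen for the example


def partition_dir(row, partition_col):
    """Partitioning: every distinct column value maps to its own directory."""
    return f"{partition_col}={row[partition_col]}"


def bucket_id(row, bucket_col, num_buckets=NUM_BUCKETS):
    """Bucketing: hash the column value, then take it modulo the bucket count.

    crc32 is a deterministic stand-in, not the hash Hive uses.
    """
    digest = zlib.crc32(str(row[bucket_col]).encode("utf-8"))
    return digest % num_buckets


def layout(rows, partition_col, bucket_col):
    """Group rows by the (partition directory, bucket number) they would land in."""
    placed = {}
    for row in rows:
        key = (partition_dir(row, partition_col), bucket_id(row, bucket_col))
        placed.setdefault(key, []).append(row)
    return placed


rows = [
    {"user_id": 101, "country": "IN", "amount": 40},
    {"user_id": 102, "country": "US", "amount": 15},
    {"user_id": 101, "country": "IN", "amount": 25},
    {"user_id": 103, "country": "US", "amount": 60},
]

placed = layout(rows, "country", "user_id")
for (part, bucket), members in sorted(placed.items()):
    print(f"{part}/bucket_{bucket}: {len(members)} row(s)")
```

Rows with the same user_id always land in the same bucket, which is what makes bucketed tables useful for sampling and joins on that column.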

Working with HBase

HBase overview
Data model
HBase architecture
HBase shell
ZooKeeper & its role in HBase environment
HBase Shell environment
Creating table
Creating column families
CLI commands – get, put, delete & scan
Scan Filter operations
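
The data model and the get, put, delete, and scan commands listed above can be pictured with a toy in-memory table. The sketch below is not the HBase shell or client API: TinyTable and its methods are hypothetical, and cell versions and timestamps are left out to keep it short. It only shows rows keyed by a row key, columns named family:qualifier, and scans that walk row keys in sorted order.

```python
class TinyTable:
    """Toy table: row key -> 'family:qualifier' -> value. Not the HBase API."""

    def __init__(self, families):
        self.families = set(families)  # column families are fixed at creation
        self.rows = {}

    def put(self, row_key, column, value):
        """Store one cell; the column family must already exist."""
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows.setdefault(row_key, {})[column] = value

    def get(self, row_key):
        """Return all cells of one row (empty dict if the row is absent)."""
        return dict(self.rows.get(row_key, {}))

    def delete(self, row_key, column):
        """Remove one cell; drop the row once it has no cells left."""
        cells = self.rows.get(row_key)
        if cells is None:
            return
        cells.pop(column, None)
        if not cells:
            del self.rows[row_key]

    def scan(self, start_row="", stop_row=None):
        """Yield rows in sorted key order, from start_row up to (not including) stop_row."""
        for key in sorted(self.rows):
            if key >= start_row and (stop_row is None or key < stop_row):
                yield key, dict(self.rows[key])


table = TinyTable(["info", "stats"])
table.put("user#001", "info:name", "Asha")
table.put("user#001", "stats:logins", 7)
table.put("user#002", "info:name", "Ravi")
table.put("user#010", "info:name", "Mei")
table.delete("user#001", "stats:logins")

scanned = list(table.scan(start_row="user#002"))
print(table.get("user#001"))
print(scanned)
```

Because scans follow row-key order, the choice of row key largely determines which range queries are cheap.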

Importing and Exporting Data using Sqoop

Importing data from RDBMS to HDFS
Exporting data from HDFS to RDBMS
Importing & exporting data between RDBMS & Hive tables

Oozie Workflow Management and Using Flume for Analyzing Streaming Data

Overview of Oozie
Oozie Workflow Architecture
Creating workflows with Oozie
Introduction to Flume
Flume Architecture
Flume Demo

Visualizing Big Data

Introduction
Tableau
Chart types
Data visualization tools

Why enterprise teams pick Synergific.

Live Hands-On Cloud Labs

Hire-Train-Deploy

Four-Hour Solutions SLA

Voucher-Backed Certification

Authorised Curriculum, Local Delivery

Pick the format that fits your team.

Live Online (VILT)

Classroom

Self-Paced

Blended

How a course becomes measurable skill.

Discover & set goals

Curate the right path

Deliver hands-on training

Assess & mentor

Certify & apply on the job

What our clients say

Course Outline

Introduction

Big Data Ecosystem

Hadoop Cluster Configuration

Big Data Processing with MapReduce

Batch Analytics with Apache Spark

Real Time Analytics with Apache Spark

Analysis using Pig

Analysis using Hive Data Warehousing Infrastructure

Working with HBase

Importing and Exporting Data using Sqoop

Oozie Workflow Management and Using Flume for Analyzing Streaming Data

Visualizing Big Data
