Synergific | Intro to Big Data and Hadoop

Overview

This training course will help participants to gain the skills they need to store, manage, process, and analyze massive amounts of structured and unstructured data to extract meaningful insights.

Objectives

At the end of the Intro to Big Data & Hadoop training course, participants will:

Understand what Big Data is and gain in-depth knowledge of Big Data Analytics concepts and tools.
Learn to process large data sets with Big Data tools to extract information from disparate sources.
Learn about MapReduce, Hadoop Distributed File System (HDFS), YARN, and how to write MapReduce code.
Learn best practices and considerations for Hadoop development as well as debugging techniques.
Learn how to use Hadoop frameworks such as Apache Pig™, Apache Hive™, Sqoop, and Flume, among other projects.
Perform real-world analytics by learning advanced Hadoop API topics with e-courseware.

Prerequisites

Before undertaking a Big Data and Hadoop course, participants are recommended to have basic knowledge of a programming language such as Python, Scala, or Java, and a good understanding of SQL and RDBMS.

Course Outline

Understanding Big Data
Types of Big Data
Difference between Traditional Data and Big Data
Introduction to Hadoop
Distributed Data Storage in Hadoop: HDFS and HBase
Hadoop Data Processing and Analysis Services: MapReduce and Spark, Hive, Pig, and Storm
Data Integration Tools in Hadoop
Resource Management and Cluster Management Services

Need of Hadoop in Big Data
Understanding Hadoop And Its Architecture
The MapReduce Framework
What is YARN?
Understanding Big Data Components
Monitoring, Management and Orchestration Components of Hadoop Ecosystem
Different Distributions of Hadoop
Installing Hadoop 3

Hortonworks sandbox installation & configuration
Hadoop Configuration files
Working with Hadoop services using Ambari
Hadoop Daemons
Browsing Hadoop UI consoles
Basic Hadoop Shell commands
Eclipse & WinSCP installation & configuration on VM

Running a MapReduce application in MR2
MapReduce Framework on YARN
Fault tolerance in YARN
Map, Reduce & Shuffle phases
Understanding Mapper, Reducer & Driver classes
Writing MapReduce WordCount program
Executing & monitoring a MapReduce job
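
As a rough illustration of the Map, Shuffle, and Reduce phases listed above, the following is a minimal in-memory sketch of WordCount in plain Python. It does not use the Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for this example. On a real cluster the framework performs the shuffle and distributes the work across nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key (done by the framework in Hadoop)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Driver: wire the three phases together for a small in-memory input."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data and hadoop", "hadoop stores big data"]
    print(word_count(sample))
```

The split into mapper, reducer, and driver mirrors the Mapper, Reducer, and Driver classes named in this module, though the actual course material presumably uses Hadoop's own classes.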

SparkSQL and DataFrames
DataFrames and the SQL API
DataFrame schema
Datasets and encoders
Loading and saving data
Aggregations
Joins

A short introduction to streaming
Spark Streaming
Discretized Streams
Stateful and stateless transformations
Checkpointing
Operating with other streaming platforms (such as Apache Kafka)
Structured Streaming
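
The difference between stateless and stateful transformations listed above can be sketched without Spark by treating a discretized stream as a list of micro-batches. This is a plain-Python illustration under that assumption, not Spark code; the function names are hypothetical. A stateless transformation looks only at the current batch, while a stateful one carries results forward across batches, which is the kind of state that checkpointing is meant to protect.

```python
from collections import Counter


def stateless_counts(batch):
    """Stateless: count words using only the current micro-batch."""
    return dict(Counter(batch))


def stateful_counts(batches):
    """Stateful: keep running totals across all micro-batches seen so far."""
    state = Counter()
    snapshots = []
    for batch in batches:
        state.update(batch)           # fold the new batch into the running state
        snapshots.append(dict(state))  # record the totals after this batch
    return snapshots


if __name__ == "__main__":
    batches = [["a", "b", "a"], ["b", "c"]]
    print([stateless_counts(batch) for batch in batches])
    print(stateful_counts(batches))
```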

Background of Pig
Pig architecture
Pig Latin basics
Pig execution modes
Pig processing – loading and transforming data
Pig built-in functions
Filtering, grouping, sorting data
Relational join operators
Pig Scripting
Pig UDFs

Background of Hive
Hive architecture
Hive Query Language
Derby to MySQL database
Managed & external tables
Data processing – loading data into tables
Using Hive built-in functions
Partitioning data using Hive
Bucketing data
Hive Scripting
Using Hive UDFs

HBase overview
Data model
HBase architecture
HBase shell
Zookeeper & its role in HBase environment
HBase Shell environment
Creating table
Creating column families
CLI commands – get, put, delete & scan
Scan Filter operations

Importing data from RDBMS to HDFS
Exporting data from HDFS to RDBMS
Importing & exporting data between RDBMS & Hive tables

Overview of Oozie
Oozie Workflow Architecture
Creating workflows with Oozie
Introduction to Flume
Flume Architecture
Flume Demo

Introduction
Tableau
Chart types
Data visualization tools

Testimonials

The Synergific Software team has been very supportive, and working with them has been the best decision we could ever have made. They are just a call away. You guys are AWESOME. Thank you, and keep up the good work!

Shamsudeen Bawa

Vice President, J.P. Morgan, CIS, USA

Synergific Software has been of great help and I plan to continue to use your services in the future for my business needs.

Farhan Hafiz

Data Architect, Fiserv

I think Synergific Software is great. I liked that it was hassle-free and easy to set up. Again, it's a great feature for a fast and cheap setup, which gives me peace of mind, as I now have a terms of use agreement.

Dr. Sahdev Singh

Under Secretary, Ministry of Law & Justice, Govt. of India

I liked using Synergific Software very much. I thought the website was easy to navigate and the instructions for generating the terms were clear. I even recommended you on a Facebook Group I am a member of.

M Chikanna Swamy

Director & Learning Head, Mindtree

Intro to Big Data and Hadoop

Live Online (VILT) & Classroom Corporate Training Course

Hadoop makes it easier to make sense of huge volumes of data and to use frameworks that turn that data into actionable insights, so training in Hadoop & Big Data is in great demand.

How can we help you?

CloudLabs

Projects

Assignments

24x7 Support

Lifetime Access

Introduction

Big Data Ecosystem

Hadoop Cluster Configuration

Big Data Processing with MapReduce

Batch Analytics with Apache Spark

Real Time Analytics with Apache Spark

Analysis using Pig

Analysis using Hive Data Warehousing Infrastructure

Working with HBase

Importing and Exporting Data using Sqoop

Oozie Workflow Management and Using Flume for Analyzing Streaming Data

Visualizing Big Data
