Synergific | Apache Pig & Hive

Live Online (VILT) & Classroom Corporate Training Course

Apache Pig is known for its simple syntax and its ability to shorten development time, which is why it is widely used by organizations that analyse Big Data. Hive, another tool in the Hadoop ecosystem, is much sought after because it is scalable and provides tools for easy data analysis and extraction.

Overview

This training introduces you to the world of Hadoop and MapReduce. Through a series of practical, hands-on exercises, you will learn about HDFS, write complex MapReduce transformations, and write scripts that use the advanced features of Pig. You will also learn the Hive environment, the Hive query language (HiveQL), and how to perform data analysis with Hive.
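As a rough illustration of the MapReduce model mentioned above, the following is a minimal in-memory sketch in plain Python, not Hadoop code. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample lines are hypothetical and chosen only for this example; a real job would run the same three stages distributed across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    # Mapper: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle: collect all emitted values under their key, which is what
    # the framework does between the map and reduce stages.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reducer: collapse the values for one key into a single result.
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["pig and hive", "hive on hadoop", "pig on hadoop"])
```

Each stage only sees its own slice of the data (one line, or one key), which is what allows the work to be spread over many machines.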

Objectives

At the end of the Apache Pig & Hive training course, participants will have learned:

How Big Data can change the way businesses operate
The Hadoop ecosystem and its architecture
To analyse large data sets using Pig Latin scripts and parallel processing using MapReduce
About Hive and its use in Big Data
The benefits of HiveQL
To use Hive on complex data sets and derive insights to help business

Prerequisites

Understanding of Linux commands and SQL queries
Basic knowledge of core Java

Course Outline

The Hadoop Ecosystem

Hadoop overview
Surveying the Hadoop components
Defining the Hadoop architecture

Exploring HDFS and MapReduce

Achieving reliable and secure storage
Monitoring storage metrics
Controlling HDFS from the Command Line

Executing Data Flows with Pig

Contrasting Pig with MapReduce
Identifying Pig use cases
Pinpointing key Pig configurations

Advanced Pig

Pig Latin: Relational Operators
File Loaders
Group Operator
COGROUP Operator
Joins and COGROUP
Union, Diagnostic Operators
Pig UDF
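To make the difference between the GROUP, COGROUP and JOIN operators listed above more concrete, here is a small plain-Python model of their semantics. It is a sketch, not Pig Latin: the helper names (`group_by`, `cogroup`, `inner_join`) and the `users` and `orders` sample relations are hypothetical, and keys are sorted only to keep the output deterministic, which Pig itself does not promise.

```python
from collections import defaultdict


def group_by(relation, key_index):
    # GROUP: one entry per key, holding a "bag" (list) of the matching rows.
    bags = defaultdict(list)
    for row in relation:
        bags[row[key_index]].append(row)
    return dict(bags)


def cogroup(left, right, left_key, right_key):
    # COGROUP: one row per key with a separate bag for each input relation.
    # A key found in only one relation still appears, with an empty bag
    # for the other side.
    left_bags = group_by(left, left_key)
    right_bags = group_by(right, right_key)
    keys = sorted(set(left_bags) | set(right_bags))
    return [(k, left_bags.get(k, []), right_bags.get(k, [])) for k in keys]


def inner_join(left, right, left_key, right_key):
    # Inner JOIN: flatten the per-key cross product of the two bags, so
    # keys missing from either side produce no output rows.
    return [
        l_row + r_row
        for _, l_bag, r_bag in cogroup(left, right, left_key, right_key)
        for l_row in l_bag
        for r_row in r_bag
    ]


users = [(1, "asha"), (2, "ben"), (3, "chen")]
orders = [(1, "book"), (1, "pen"), (4, "lamp")]

grouped = group_by(orders, 0)
cogrouped = cogroup(users, orders, 0, 0)
joined = inner_join(users, orders, 0, 0)
```

In this model a join is simply a cogroup followed by flattening, which is one way to see why the two operators are usually taught together.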

Performing ETL with Pig

Transforming data with Relational Operators

Creating new relations with joins
Reducing data size by sampling
Extending Pig with user-defined functions

Filtering data with Pig

Consolidating data sets with unions
Partitioning data sets with splits
Injecting parameters into Pig scripts
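The filtering, split and union topics above can be modelled in a few lines of plain Python. This is a sketch of the semantics only, not Pig Latin; the function names and the `readings` sample data are hypothetical. It assumes, as in Pig, that a split evaluates every condition independently and that a union does not remove duplicates.

```python
def filter_rows(relation, predicate):
    # FILTER: keep only the rows for which the predicate is true.
    return [row for row in relation if predicate(row)]


def split_rows(relation, conditions):
    # SPLIT: each condition is checked independently, so a row can land
    # in several outputs, or in none of them.
    return {name: filter_rows(relation, cond) for name, cond in conditions.items()}


def union(*relations):
    # UNION: concatenate the inputs; duplicate rows are kept.
    return [row for relation in relations for row in relation]


readings = [("a", 5), ("b", 15), ("c", 25)]

parts = split_rows(
    readings,
    {
        "small": lambda row: row[1] < 20,
        "mid": lambda row: 10 <= row[1] <= 20,
    },
)
combined = union(parts["small"], parts["mid"])
```

Here the row for "b" satisfies both conditions and so appears twice after the union, while the row for "c" satisfies neither and is dropped.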

Hive

Hive Background
Hive Use Case
About Hive
Hive vs Pig
Hive Architecture and Components
Metastore in Hive
Limitations of Hive
Comparison with Traditional Database
Hive Data Types and Data Models
Partitions and Buckets
Hive Tables (Managed Tables and External Tables)
Importing Data
Querying Data
Managing Outputs
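For the partitions and buckets topic above, the following plain-Python sketch shows the general idea: a partition maps to a nested `key=value` directory under the table's directory, and a bucket is chosen by hashing a column value modulo the number of buckets. The table path, column names and helper functions are hypothetical, and the hash is deliberately simplified to integer columns; the actual hash function depends on the column type and the Hive version.

```python
def partition_path(table_dir, partition_values):
    # Each partition column adds one key=value directory level under
    # the table directory.
    parts = [f"{key}={value}" for key, value in partition_values]
    return "/".join([table_dir.rstrip("/")] + parts)


def bucket_for(int_value, num_buckets):
    # Simplified model: treat the integer as its own hash, mask it to a
    # non-negative 31-bit value, then take it modulo the bucket count.
    return (int_value & 0x7FFFFFFF) % num_buckets


path = partition_path("/warehouse/visits", [("country", "IN"), ("dt", "2020-01-01")])
buckets = [bucket_for(patient_id, 4) for patient_id in (7, 8, 12, 13)]
```

Partitioning lets a query skip whole directories when it filters on a partition column, while bucketing spreads rows within a partition into a fixed number of files.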

Advanced Hive

Hive Script
Hive UDF and Hive Demo on a Healthcare Data Set
HiveQL: Joining Tables
Dynamic Partitioning
Custom MapReduce Scripts
Thrift Server
User-Defined Functions

Testimonials

The Synergific Software team has been very supportive, and working with them has been the best decision we could have made. They are just a call away. You guys are AWESOME. Thank you, keep up the good work!

Shamsudeen Bawa

Vice President, J.P. Morgan, CIS, USA

Synergific Software has been of great help and I plan to continue to use your services in the future for my business needs.

Farhan Hafiz

Data Architect, Fiserv

I think Synergific Software is great. I liked that it was hassle-free and easy to set up. Again, it's a great feature for a fast and cheap setup, which gives me peace of mind, as I now have a terms of use agreement.

Dr. Sahdev Singh

Under Secretary, Ministry of Law & Justice, Govt. of India

I liked using Synergific Software very much. I thought the website was easy to navigate and the instructions for generating the terms were clear. I even recommended you on a Facebook group I am a member of.

M Chikanna Swamy

Director & Learning Head, Mindtree

How can we help you?

CloudLabs

Projects

Assignments

24x7 Support

Lifetime Access
