Hadoop Administration

Live Online (VILT) & Classroom Corporate Training Course

Hadoop Administration introduces you to the fundamental concepts of Apache Hadoop and Hadoop cluster.


  • CloudLabs
  • Projects
  • Assignments
  • 24x7 Support
  • Lifetime Access


Hadoop Administration training introduces you to the fundamental concepts of Apache Hadoop and the Hadoop cluster. Through hands-on exercises and practice sessions, you will learn to configure, deploy, and maintain a Hadoop cluster, and to navigate the Hadoop ecosystem with confidence.



At the end of the Hadoop Administration training course, participants will be able to:

  • Implement and manage the ongoing administration of a Hadoop cluster
  • Build powerful applications to analyse Big Data and learn to manage and monitor the Hadoop cluster
  • Tune the performance of Hadoop clusters and Hadoop MapReduce routines


Basic knowledge of Linux


Course Outline

  • Hadoop cluster architecture
  • Data loading into HDFS
  • Roles and Responsibilities of a Hadoop Cluster Administrator

  • Hadoop server roles and their usage
  • Rack awareness
  • Write and Read
  • Replication Pipeline
  • Data Processing
  • Hadoop Installation and Initial Configuration
  • Deploying Hadoop in pseudo-distributed mode
  • Deploying a multi-node Hadoop cluster
  • Installing Hadoop Clients

  • Selecting the appropriate hardware
  • Designing a scalable cluster
  • Building the cluster
    • Installing the Hadoop daemons
    • Optimizing the network architecture
  • Managing and scheduling jobs
  • Types of schedulers in Hadoop
  • Configuring the schedulers and running MapReduce jobs
  • Cluster monitoring and troubleshooting

  • How to manage hardware failures
  • Securing Hadoop clusters
  • Configuring Hadoop backup
  • Using DistCp to copy data
  • Cluster maintenance
  • Configuring HDFS Federation
  • Basics of Hadoop Platform Security
  • Securing the Platform
  • Configuring Kerberos

  • Isolating single points of failure
  • Maintaining High Availability
  • Triggering manual failover
  • Automating failover with ZooKeeper
  • Extending HDFS resources
  • Managing the namespace volumes
  • Critiquing the YARN architecture
  • Identifying the new daemons

  • Starting and stopping Hadoop daemons
  • Monitoring HDFS status
  • Adding and removing data nodes
  • Managing MapReduce jobs
  • Tracking progress with monitoring tools
  • Commissioning and decommissioning compute nodes

  • Oozie
  • HCatalog/Hive Administration
  • HBase Architecture
  • HBase setup
  • HBase and Hive Integration
  • HBase performance optimization