Hadoop Administration CertCamp

 

Duration: 4 days

Fee: Call us for individual or group pricing

Hadoop Administration

The Hadoop Cluster Administration training course is designed to provide the knowledge and skills needed to administer a Hadoop Cluster. It starts with the fundamental concepts of Apache Hadoop and the Hadoop Cluster, then covers how to deploy, manage, monitor, and secure a cluster. You will learn to configure backup options and to diagnose and recover from node failures. The course also covers HBase Administration. Focused, practical hands-on exercises accompany the technical sessions, so software professionals new to Hadoop can learn cluster administration quickly. By the end of this Hadoop Cluster Administration CertCamp, you will be prepared to understand and solve real-world problems that you may come across while working on a Hadoop Cluster.

Course Objectives

After completing the 'Hadoop Administration' course at CertFirst, you should be able to:

1. Get a clear understanding of Apache Hadoop, HDFS, Hadoop Cluster and Hadoop Administration.

2. Understand Hadoop 2.0, NameNode High Availability, HDFS Federation, YARN, and MapReduce v2.

3. Plan and Deploy a Hadoop Cluster.

4. Load Data and Run Applications.

5. Configure a Hadoop Cluster and tune its performance.

6. Manage, Maintain, Monitor and Troubleshoot a Hadoop Cluster.

7. Secure a deployment and understand Backup and Recovery.

8. Understand the basics of Oozie, HCatalog/Hive, and HBase Administration.

 

Who should go for this course?

Students, DBAs, System Administrators, Software Architects, Data Warehouse Professionals, IT Managers, and Software Developers interested in learning Hadoop Cluster Administration should go for this course.


Pre-requisites

This course assumes no prior knowledge of Apache Hadoop or Hadoop Cluster Administration. Good knowledge of Linux is required, as Hadoop runs on Linux. Fundamental Linux system administration skills are preferable: scripting (Perl/bash), good troubleshooting skills, and an understanding of system capacity, bottlenecks, and the basics of memory, CPU, OS, storage, and networks.

 

Course Curriculum:

 

1. Hadoop Cluster Administration

Learning Objectives: In this module, you will understand what Big Data and Apache Hadoop are, how Hadoop solves Big Data problems, the Hadoop Cluster architecture, the MapReduce framework, Hadoop data loading techniques, and the role of a Hadoop Cluster Administrator.

 

Topics: Introduction to Big Data, Hadoop Architecture, MapReduce Framework, A typical Hadoop Cluster, Data Loading into HDFS, Hadoop Cluster Administrator: Roles and Responsibilities.

 

2. Hadoop Architecture and Cluster setup

Learning Objectives: After this module, you will understand the Hadoop server roles, such as NameNode and DataNode, and MapReduce data processing. You will also understand Hadoop 1.0 Cluster setup and configuration, setting up Hadoop Clients using Hadoop 1.0, and the important Hadoop configuration files and parameters.

Topics: Hadoop server roles and their usage, Rack Awareness, Anatomy of Write and Read, Replication Pipeline, Data Processing, Hadoop Installation and Initial Configuration, Deploying Hadoop in pseudo-distributed mode, Deploying a multi-node Hadoop cluster, Installing Hadoop Clients.

 

3. Hadoop Cluster: Planning and Managing

Learning Objectives: In this module, you will understand how to plan and manage a Hadoop Cluster, monitor and troubleshoot it, analyze logs, and perform auditing. You will also understand how to schedule and execute MapReduce jobs, and the different schedulers available.

Topics: Planning the Hadoop Cluster, Cluster Size, Hardware and Software considerations, Managing and Scheduling Jobs, Types of schedulers in Hadoop, Configuring the schedulers and running MapReduce jobs, Cluster Monitoring and Troubleshooting.

 

4. Backup, Recovery and Maintenance

Learning Objectives: In this module, you will understand day-to-day Cluster Administration tasks such as adding and removing DataNodes, NameNode recovery, configuring Backup and Recovery in Hadoop, diagnosing node failures in the cluster, and upgrading Hadoop.

Topics: Configuring Rack Awareness, Setting up Hadoop Backup, Whitelisting and blacklisting DataNodes in a cluster, Setting up quotas, Upgrading a Hadoop cluster, Copying data across clusters using distcp, Diagnostics and Recovery, Cluster Maintenance.

 

5. Hadoop 2.0 and High Availability

Learning Objectives: In this module, you will understand Secondary NameNode setup and checkpointing, the new features in Hadoop 2.0, HDFS High Availability, the YARN framework, MRv2, and Hadoop 2.0 Cluster setup in pseudo-distributed and distributed mode.

Topics: Configuring Secondary NameNode, Hadoop 2.0, YARN framework, MRv2, Hadoop 2.0 Cluster setup, Deploying Hadoop 2.0 in pseudo-distributed mode, Deploying a multi-node Hadoop 2.0 cluster.

 

6. Advanced Topics: QJM, HDFS Federation and Security

Learning Objectives: In this module, you will understand the basics of Hadoop security, managing security with Kerberos, HDFS Federation setup, and log management. You will also understand HDFS High Availability using Quorum Journal Manager (QJM).

Topics: Configuring HDFS Federation, Basics of Hadoop Platform Security, Securing the Platform, Configuring Kerberos.


7. Oozie, HCatalog/Hive and HBase Administration

Learning Objectives: In this module, you will understand how to set up the Apache Oozie Workflow Scheduler for Hadoop jobs, administer HCatalog/Hive, deploy HBase with other Hadoop components, and use HBase effectively to load data, including writing to and reading from HBase.

Topics: Oozie, HCatalog/Hive Administration, HBase Architecture, HBase setup, HBase and Hive Integration, HBase performance optimization.

 

8. Project: Hadoop Implementation

Learning Objectives: In this module, you will understand how multiple Hadoop ecosystem components work together in a Hadoop implementation to solve Big Data problems. You will also learn how to plan, design, and deploy a Hadoop Cluster using a typical real-world use case.

Topics: Understanding the Problem, Planning, Designing, and Creating a Hadoop Cluster for a real-world use case, Setting up and configuring commonly used Hadoop ecosystem components such as Pig and Hive, Configuring Ganglia on the Hadoop cluster, Troubleshooting common cluster problems.

 

Opportunities for Hadoopers!

Opportunities for Hadoopers span many roles, from Hadoop Architect to Hadoop Developer or Hadoop Tester, and so on. If cracking and managing Big Data is your passion, join our Hadoop program and carve a niche for yourself! Happy Hadooping!

 

For program schedules, please send an email to info@certfirst.com or call 1-630-684-0355.

For more information on how CertFirst can assist you, please contact us.
