BIG-DATA AND HADOOP DEVELOPMENT COURSE PREVIEW


Type: Advanced Training Program
Audiences: NET Beginners/Professionals
IDE: Visual Studio 2015, SQL Server 2014
Delivery methods: Instructor-led Classroom/Online Training
Duration: 6 Days
Language: English




  • Many technologies in the software domain do not get the attention they deserve. This course covers Big-Data and Hadoop development for software professionals, system architects and IT managers who need to manage large, complex data sets and scale from a single server to thousands of machines. Trainees are introduced to both the basic and the advanced concepts of Big-Data and Hadoop, along with their implementation in varied industry use cases.

    Course objective

    Exploring Hadoop 2.x Architecture.

    Understanding the need for and advantages of Big-Data and Hadoop.

    Mastering the concepts of HDFS and the MapReduce framework.

    Setting up a Hadoop cluster and writing complex MapReduce programs.

    Implementing HBase and MapReduce integration.

    Performing data analytics using Pig, Hive and YARN.



    Who should do this course?

    Any professional who wants to learn how to manage large, complex data sets and scale them from a single server to thousands of machines should take this course.

    Pre-requisites

    Anyone who wants to learn Big-Data and Hadoop development should have a basic knowledge of the Java programming language.

  • Big-Data and Hadoop

    Introduction to Big-Data & Hadoop

    1. Limitations of RDBMS

    2. Need for Big-Data

    3. The 3 Vs of Big-Data: Volume, Velocity and Variety

    4. Introduction to Hadoop

    5. History of Hadoop Evolution

    6. Organizations using Hadoop

    7. Hadoop Job Trend in India

    Hadoop Components

    Hadoop Core Components

    HDFS

    Regular File System Vs HDFS

    Name-Node

    Data-Node

    Secondary Name-Node

    Data-Block Split

    Benefits of Data Block Approach

    HDFS-Block Replication Architecture

    Data Replication Technology

    HDFS Access

    Configure System

    1. Introduction to VirtualBox

    2. Creating and configuring Ubuntu Linux Server 14.04 machines in VirtualBox

    3. Network configuration in VirtualBox so the machines can communicate

    4. Introduction to Linux Environment

    5. SSH

    6. SCP

    7. Setting up passwordless SSH between two machines

    8. Java Setup on Linux

    Configure Hadoop

    1. Hadoop 1.x installation

    2. Configuring the Hadoop 1.x configuration files

    3. Configuring Hadoop environment variables

    4. Running Hadoop 1.x on Linux and viewing the HDFS daemons: Name-Node, Data-Node and Secondary Name-Node

    5. Starting/stopping Hadoop daemons together

    6. Starting/stopping Hadoop daemons individually

    7. HDFS operations from the command line

    8. HDFS web interface for viewing HDFS components

    MapReduce

    Introduction to MapReduce

    1. MapReduce Overview

    2. Introduction to JobTracker (Hadoop 1.x)

    3. Introduction to TaskTracker (Hadoop 1.x)

    4. Hadoop 1.x job architecture, from job submission to job completion

    5. MapReduce Analogy (Sort-Shuffle)

    Word Count App

    1. Developing a Word Count application in Eclipse

    2. Running the Word Count application on Hadoop 1.x

    3. Analyzing the application through the command line

    4. Analyzing the application through the web interface

    Hadoop I/O

    1. Hadoop I/O

    2. Different I/O formats

    3. Input Split

    4. Writable Interface

    Files

    1. Sequence File

    2. Map File

    MapReduce Partitioner

    1. Understanding MapReduce Custom Partitioner

    2. Creating a Custom Partitioner application in Eclipse and running it on YARN

    3. Understanding Combiner Function

    4. Creating a Combiner Function application in Eclipse and running it on YARN

    MapReduce Features

    1. Understanding Map Side Join

    2. Understanding Distributed Cache

    3. Creating a MapReduce application for Distributed Cache and running it on YARN

    4. Understanding Partial Sort

    5. Creating an application for Partial Sort and running it on YARN

    6. Understanding Total Order Sort

    7. Creating an application for Total Order Sort and running it on YARN

    Sorting

    1. Understanding Reduce Side Join

    2. Understanding Secondary Sort

    3. Creating an application for Secondary Sort and running it on YARN

    YARN

    Introduction to YARN

    1. YARN (Hadoop 2) Overview

    2. Understanding ResourceManager

    3. Understanding NodeManager

    4. YARN Job Submission

    Configure YARN

    1. Installing YARN on a Linux machine

    2. Running YARN daemons

    3. Starting/stopping daemons individually

    4. Starting/stopping daemons together

    5. YARN command-line utility for HDFS interaction

    6. YARN web interface

    7. Running the Word Count application on YARN

    Hadoop Administration

    1. Hadoop 2 (YARN) cluster setup

    2. Hadoop Administration

    3. Understanding Safe Mode

    Hive

    Introduction to Hive

    1. Hive Overview

    2. Hive History

    3. Hive Installation

    4. Configure Hive Configuration Files

    5. Configure Environment Variables for Hive

    6. Creating tables and loading data into tables

    7. Understanding the Hive warehouse file system through the HDFS web interface

    8. Running SQL queries on Hive tables from the Hive shell

    9. Running SQL queries from a file in Hive

    Hive Features

    1. Understanding Hive external tables

    2. Creating and tweaking external tables

    3. Installing MySQL on a Linux system

    4. Configuring the Hive metastore with MySQL

    5. Understanding Hive joins

    6. Performing joins on Hive tables

    Hive Partitioning

    1. Understanding Distribute By clause

    2. Hive Partitioning

    3. Strict Mode and Dynamic Partitioning

    4. Bucketing

    Hive Functions

    1. Understanding Hive Functions

    2. Understanding UDF (User Defined Functions)

    3. Creating a User Defined Function in Eclipse

    4. Configuring and running a UDF from a SQL query

    Hive Security

    1. Understanding Security in Hive

    2. Implementing Security

  • TrainingNCR.com provides weekly mock tests and regular assignments to help students cement their foundation and experience real work-like scenarios. There is no limit on the number of tests and assignments for diligent students.
