Hadoop Online training from www.

IT COURSES Online live classes

www.imaginelife.in PH: 8499068708, 8341832707

EMAIL:[email protected]

Hadoop

Topics to Be Covered in the Hadoop Course

Introduction to Big Data & Hadoop

1.What is Big Data?

2.What are the challenges of processing big data?

3.What is Hadoop?

4.Why Hadoop?

5.History of Hadoop

6.Use cases of Hadoop

7.Hadoop Ecosystem

8.HDFS

9.MapReduce

10.Statistics

Understanding the cluster

1.Typical workflow

2.Writing files to HDFS

3.Reading files from HDFS

4.Rack Awareness

5.The five daemons

6.HDFS commands: hands-on

Installing the cluster

1.CDH4 Pseudo cluster

2.CDH4 Multi node cluster

3.Configuration

4.Cluster on EC2 cloud

5.How to use AWS EMR

6.Hands-on Exercises

Routine Admin and Monitoring Activities

1.Metadata and Data Backups

2.Commissioning and Decommissioning nodes

3.Recovering from NameNode Failure

4.NameNode High Availability

5.Monitoring using Ganglia and Nagios

Let’s talk MapReduce

1.Before MapReduce

2.MapReduce Overview

3.Word count problem

4.Word count flow and solution

5.MapReduce flow

6.Algorithms for simple problems

7.Algorithms for complex problems

Developing the MapReduce Application

1.Data types

2.File Formats

3.Explaining the Driver, Mapper and Reducer code

4.Configuring the development environment: Eclipse

5.Writing Unit Tests

6.Running locally

7.Hands-on Exercises

How MapReduce Works

1.Anatomy of a MapReduce job run

2.Job Submission

3.Job initialization

4.Task Assignment

5.Job completion

6.Job Scheduling

7.Job Failures

8.Shuffle and sort

9.Oozie Workflows

10.Hands-on Exercises

MapReduce Types and Formats

1.MapReduce Types

3.Output Formats: text output, binary output, multiple outputs

4.Lazy output and database output

5.Hands-on Exercises

MapReduce Features

1.Counters

2.Joins: map side and reduce side

3.Sorting

4.MapReduce combiner

5.MapReduce partitioner

6.MapReduce Distributed Cache

7.Hands-on Exercises

Hive

1.What is Hive?

2.What Hive is not

3.Hive Architecture

4.SQL vs Hive QL

5.Data Types

6.Managed Tables and External Tables

7.Partitions

8.Buckets

9.Storage formats

10.SerDes

11.Importing Data

12.Joins: map side and reduce side

13.UDFs

14.Hands-on Exercises

Impala

1.Need for real-time queries (RTQ)

2.Impala Overview

3.Impala Architecture

4.Hands-on Exercises

Pig

1.What is Pig? Why Pig?

2.Running Pig

3.Data

4.Pig Latin Statements

5.Schemas

6.Validations

7.Functions and Macros

8.UDFs

9.When to Use Pig and When to Use Hive

10.Hands-on Exercises

NoSQL and HBase

1.Why NoSQL?

2.Problems with RDBMS

3.CAP theorem

4.HBase Concepts

5.Use Cases for HBase

6.HBase Data Model

7.HBase Shell

8.HBase Architecture

9.Minor & Major Compaction

10.Bloom Filter & Block Cache

11.Schema Design

12.Hands-on Exercises

Sqoop

1.What is Sqoop?

2.Motivation

3.Sqoop Commands

4.Importing Data to HDFS, Hive and HBase

5.Exporting Data

6.Sqoop Connectors

7.Hands-on Exercises

Flume

1.What is Flume?

2.Use Cases

3.Flume Topology: Source, Channel and Sink

4.Hands-on Exercises: Ingest data from Twitter and analyze with Hive

Machine Learning and Mahout

1.The 3 Cs of Machine Learning

2.Introduction to Mahout

3.Hands-on Exercise: Build a Recommendation System using Mahout

POCs

1.Banking Use case

2.Telecom Use case