course detail

Big Data Training

Program Name
Big Data Expertise Program
Duration
8 Weeks
Project Implementation
2 Real-Time Projects
Batch Size
Laptop with 8 GB RAM
Job Guidance
Yes (Supported by Placement Cell)
Start Date
June 22, 2019

Course Content Overview

Induction Program

  • Understanding the importance of learning Big Data and the role of Big Data in the market.
  • A detailed explanation of Big Data best practices.

Introduction to Big data

  • What is Big Data?
  • Big Data Opportunities
  • Big Data Challenges
  • ADB commands

  • Introduction to Hadoop
  • High Availability
  • Scaling
  • Advantages and Challenges

Getting Started with Hadoop

- Hadoop Distributed File System

- Comparing Hadoop and SQL

- Industries using Hadoop

- Data Locality

- Hadoop Architecture

- MapReduce and HDFS

- Using a Hadoop single-node image (clone)

Hadoop Distributed File System

- HDFS Design and Concepts

- Blocks, NameNodes and DataNodes

- HDFS High Availability and HDFS Federation

- Hadoop DFS command line interface

- Basic File System Operations

- Anatomy of File Read

- Anatomy of File Write

- Block Placement Policy and Modes

- Detailed explanation of the configuration files.

- Metadata, FsImage, edit log, Secondary NameNode and Safe Mode.

- How to add a new DataNode dynamically.

- How to decommission a DataNode dynamically (without stopping the cluster).

- fsck utility (block report).

- How to override the default configuration at system level and at programming level.

- HDFS Federation.

- ZooKeeper leader election algorithm.

- Exercise and small use case on HDFS
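
As a small illustration of the "blocks" topic above, the following sketch shows how a file is cut into fixed-size blocks and how many block replicas the cluster ends up storing. It is only a back-of-the-envelope model, not Hadoop code: the function name `block_layout` is hypothetical, and the 128 MB block size and replication factor of 3 are assumed because they are the usual defaults for `dfs.blocksize` and `dfs.replication` in Hadoop 2.x and later; a real cluster may be configured differently.

```python
MB = 1024 * 1024

# Assumed values: the usual defaults for dfs.blocksize and dfs.replication.
BLOCK_SIZE = 128 * MB
REPLICATION = 3


def block_layout(file_size, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Hypothetical helper: return (list of block sizes, total replicas stored)."""
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only occupies as much space as the data it holds.
        sizes.append(remainder)
    return sizes, len(sizes) * replication


sizes, replicas = block_layout(300 * MB)
print([s // MB for s in sizes], replicas)
```

Under these assumptions a 300 MB file occupies two full 128 MB blocks plus one 44 MB block, and the cluster stores nine block replicas in total.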

MapReduce and YARN

- Functional Programming Basics.

- Map and Reduce Basics

- How MapReduce Works

- Anatomy of a MapReduce Job Run

- Legacy Architecture -> Job Submission, Job Initialization, Task Assignment, Task Execution, Progress and Status Updates

- Job Completion, Failures

- Shuffling and Sorting

- Splits, Record reader, Partition, Types of partitions & Combiner

- Optimization Techniques -> Speculative Execution, JVM Reuse and Number of Slots.

- Types of Schedulers and Counters.

- Comparison between the old and new APIs at code and architecture level.

- Getting the data from RDBMS into HDFS using Custom data types.

- Distributed Cache and Hadoop Streaming (Python, Ruby and R).

- YARN.

- Sequence Files and Map Files.

- Enabling compression codecs.

- Map-side join with Distributed Cache.

- Types of I/O formats: MultipleOutputs, NLineInputFormat.

- Handling small files using CombineFileInputFormat.
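
To give a feel for the partition and combiner topics listed above, here is a minimal plain-Python sketch. It is not Hadoop code: the function names are hypothetical, and `zlib.crc32` is used only as a deterministic stand-in for the key hash that Hadoop's default hash partitioner takes modulo the number of reducers.

```python
import zlib
from collections import defaultdict


def partition(key, num_reducers):
    """Pick a reducer for a key (stand-in for hash partitioning)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def combine(pairs):
    """Combiner: pre-aggregate values per key on the map side."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


def route(pairs, num_reducers):
    """Group (key, value) pairs by the reducer they would be sent to."""
    buckets = defaultdict(list)
    for key, value in pairs:
        buckets[partition(key, num_reducers)].append((key, value))
    return dict(buckets)


map_output = [("hadoop", 1), ("hdfs", 1), ("hadoop", 1), ("yarn", 1)]
combined = combine(map_output)   # fewer pairs cross the network
print(combined)
print(route(combined, num_reducers=2))
```

The point of the sketch is that the combiner shrinks the map output before the shuffle, and that every pair with the same key always lands on the same reducer.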

MapReduce Programming: Sample Word Count Program

- Hands-on “Word Count” in MapReduce in standalone and pseudo-distributed mode.

- Sorting files using Hadoop Configuration API discussion

- Emulating “grep” for searching inside a file in Hadoop

- DBInputFormat

- Job Dependency API discussion

- Input Format API discussion

- Input Split API discussion

- Custom Data type creation in Hadoop.
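
For orientation, the Word Count exercise named in this module can be sketched in plain Python as below. This is a single-process imitation of the map, shuffle/sort and reduce phases, not the Hadoop Java program used in the hands-on session; the names `mapper`, `reducer` and `word_count` are chosen here for illustration only.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Map phase: emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Reduce phase: sum all counts seen for one word."""
    return word, sum(counts)


def word_count(lines):
    # Map: run the mapper over every input line.
    mapped = [pair for line in lines for pair in mapper(line)]
    # Shuffle and sort: bring identical keys next to each other.
    mapped.sort(key=itemgetter(0))
    # Reduce: one reducer call per distinct word.
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(mapped, key=itemgetter(0))
    )


print(word_count(["big data", "Big Hadoop"]))
```

In a real job the framework performs the shuffle and sort step between the map and reduce tasks; here it is imitated with an in-memory sort.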

Outcome of the Course

  • A candidate who successfully completes the Big Data Expertise Program:
          • Will be able to work independently on any Big Data project.
          • Will be able to understand any kind of Big Data architecture.
          • Will be able to design and develop Big Data architectures.
          • Will be able to write programs and will be good at logical thinking.
          • Will be able to meet industry expectations in Big Data.
          • Will be able to crack interviews.
