Big Data Training
|Program Name||Big Data Expertise Program|
|Length||8 Weeks|
|Project Implementation||2 Real-Time Projects|
|Batch Size||10|
|Pre-requisites||Laptop with 8 GB RAM|
|Job Guidance||Yes (Supported by Placement Cell)|
|Certificate||Yes|
|Start date||June 22, 2019|
Getting Started with Hadoop
- Hadoop Distributed File System
- Comparing Hadoop and SQL
- Industries using Hadoop
- Data Locality
- Hadoop Architecture
- MapReduce and HDFS
- Using a Hadoop Single-Node Image (Clone)
Hadoop Distributed File System
- HDFS Design and Concepts
- Blocks, NameNodes and DataNodes
- HDFS High Availability and HDFS Federation
- Hadoop DFS command line interface
- Basic File System Operations
- Anatomy of File Read
- Anatomy of File Write
- Block Placement Policy and Modes
- Configuration files in detail
- Metadata, FsImage, Edit Log, Secondary NameNode and Safe Mode
- How to add a new DataNode dynamically
- How to decommission a DataNode dynamically (without stopping the cluster)
- fsck utility (block report)
- How to override the default configuration at the system level and at the programming level
- ZooKeeper Leader Election Algorithm
- Exercise and small use case on HDFS
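
As a warm-up for the block and replication topics in this module, here is a minimal Python sketch of the arithmetic involved. The function name `hdfs_block_usage` is hypothetical, and the defaults (128 MB block size, replication factor 3) are assumptions that match common Hadoop 2.x and later defaults; an actual cluster may be configured differently.

```python
MB = 1024 * 1024  # bytes in one mebibyte


def hdfs_block_usage(file_size_bytes, block_size_bytes=128 * MB, replication=3):
    """Estimate how a file is laid out in HDFS (illustrative helper).

    Returns a tuple (number_of_blocks, raw_bytes_stored_in_cluster).
    The last block of a file only occupies as much space as it holds,
    so raw storage is simply file size multiplied by the replication factor.
    """
    if file_size_bytes < 0 or block_size_bytes <= 0 or replication <= 0:
        raise ValueError("sizes and replication must be positive")
    # Integer ceiling division: a partial final block still counts as a block.
    num_blocks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    raw_bytes = file_size_bytes * replication
    return num_blocks, raw_bytes


blocks, raw = hdfs_block_usage(300 * MB)
print(blocks, raw // MB)
```

Under these assumptions, a 300 MB file is split into three blocks (128 MB, 128 MB and 44 MB) and occupies 900 MB of raw storage across the cluster.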
MapReduce and YARN
- Functional Programming Basics
- Map and Reduce Basics
- How MapReduce Works
- Anatomy of a MapReduce Job Run
- Legacy Architecture: Job Submission, Job Initialization, Task Assignment, Task Execution, Progress and Status Updates
- Job Completion, Failures
- Shuffling and Sorting
- Splits, RecordReader, Partitioner, Types of Partitioners and Combiner
- Optimization Techniques: Speculative Execution, JVM Reuse and Number of Slots
- Types of Schedulers and Counters
- Comparison of the Old and New APIs at the code and architecture level
- Importing data from an RDBMS into HDFS using custom data types
- Distributed Cache and Hadoop Streaming (Python, Ruby and R)
- YARN
- SequenceFiles and MapFiles
- Enabling Compression Codecs
- Map-side Join with Distributed Cache
- Types of I/O Formats: MultipleOutputs, NLineInputFormat
- Handling small files using CombineFileInputFormat
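
To make the map, partition, shuffle/sort and reduce phases listed in this module more concrete, the following is a small in-memory simulation in plain Python. It is only a sketch and does not use Hadoop: `run_mapreduce`, `max_temp_mapper` and `max_temp_reducer` are hypothetical names, the CRC32-based partitioner is an assumption standing in for a hash partitioner, and combiners, input splits and fault tolerance are not modelled.

```python
import zlib
from itertools import groupby
from operator import itemgetter


def run_mapreduce(records, mapper, reducer, num_reducers=2):
    """Simulate map -> partition -> sort -> reduce on a list of records."""
    partitions = [[] for _ in range(num_reducers)]

    # Map phase: each record yields zero or more (key, value) pairs.
    for record in records:
        for key, value in mapper(record):
            # Partition step: a deterministic hash sends a key to one reducer.
            index = zlib.crc32(key.encode("utf-8")) % num_reducers
            partitions[index].append((key, value))

    results = {}
    for partition in partitions:
        # Sort step: each reducer sees its pairs ordered by key.
        partition.sort(key=itemgetter(0))
        # Reduce phase: all values of one key are handed to the reducer together.
        for key, group in groupby(partition, key=itemgetter(0)):
            results[key] = reducer(key, [value for _, value in group])
    return results


def max_temp_mapper(line):
    """Parse a 'year,temperature' line into one (year, temperature) pair."""
    year, temperature = line.split(",")
    yield year, int(temperature)


def max_temp_reducer(year, temperatures):
    """Keep the highest temperature seen for a year."""
    return max(temperatures)


readings = ["1950,22", "1950,-11", "1949,111", "1949,78"]
print(run_mapreduce(readings, max_temp_mapper, max_temp_reducer))
```

Because every pair with the same key lands in the same partition, the reducer sees all values for a year in one call, which is the property the shuffle phase guarantees in a real job.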
MapReduce Programming – Sample Word Count Program
- Hands-on “Word Count” in MapReduce in standalone and pseudo-distributed mode
- Sorting files using the Hadoop Configuration API (discussion)
- Emulating “grep” for searching inside a file in Hadoop
- DBInputFormat
- Job Dependency API discussion
- InputFormat API discussion
- InputSplit API discussion
- Custom data type creation in Hadoop
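
For readers who want a preview of the Word Count exercise, here is a minimal Python sketch written in the style of Hadoop Streaming, where the mapper and reducer exchange tab-separated `key<TAB>value` text lines and the reducer receives its input sorted by key. The function names are hypothetical and `word_count` is only a local stand-in for the map, sort and reduce steps; in an actual streaming job the two functions would read from standard input and write to standard output instead.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' line for every word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(sorted_lines):
    """Reducer: sum the counts of each run of identical keys.

    The input must already be sorted by key, as it would be after the
    shuffle and sort phase.
    """
    parsed = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(parsed, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def word_count(lines):
    """Local stand-in for the pipeline: map, then sort, then reduce."""
    return list(reduce_lines(sorted(map_lines(lines))))


for output_line in word_count(["Hello hadoop", "hello world"]):
    print(output_line)
```

The sort between the two functions plays the role of the shuffle: without it, `groupby` would see the same word in separate runs and report partial counts.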