Nawanshahr, Mohali
+91-9915536076, +91-99144-03555
honey.cse@nitttrchd.ac.in, info@ekaim.in

Big Data

Aim to Achieve

Big Data & Hadoop Course Content
Course Duration: 40 hours (20 days)
Daily live sessions of 2 hours

What you will Learn

Big data is a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data-processing application software. Data with many cases (rows) offer greater statistical power, while data with higher complexity (more attributes or columns) may lead to a higher false discovery rate. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy and data source.

Module 1 – INTRODUCTION TO BIG DATA

➢ What is Big Data?

➢ Examples of Big Data

➢ Reasons for Big Data Generation

➢ Why Big Data deserves your attention

➢ Use cases of Big Data

➢ Different options of analyzing Big Data

Module 2 – INTRODUCTION TO HADOOP

➢ What is Hadoop?

➢ History of Hadoop

➢ How Hadoop got its name

➢ Problems with Traditional Large-Scale Systems and Need for Hadoop

➢ Where Hadoop is being used

➢ Understanding distributed systems and Hadoop

➢ RDBMS and Hadoop

Module 3 – STARTING HADOOP

➢ Hadoop Architecture

  • Apache Hadoop Installation
    • Standalone Mode
    • Pseudo Distributed Mode
    • Fully Distributed Mode
  • Cloudera Installation

➢ Features of Hadoop

➢ Hadoop Components: HDFS, MapReduce

➢ Anatomy of a file write / read

➢ Introduction to other components of the Hadoop ecosystem

Module 4 – HDFS

➢ HDFS Commands

➢ Single-node Hadoop cluster

➢ Understanding Hadoop configuration files

➢ Overview of the Hadoop Distributed File System

  • NameNode
  • DataNodes
  • The Command-Line Interface

➢ The building blocks of Hadoop

➢ Running HDFS Commands

➢ Web-based cluster UI: NameNode UI, MapReduce UI

Module 5 – UNDERSTANDING MAPREDUCE

➢ How MapReduce Works

  • Data flow in MapReduce
  • Map operation
  • Reduce operation

➢ Input and Output Formats

➢ Partitions

➢ Combiners

➢ Schedulers

➢ MapReduce Program in Java using Eclipse

➢ Counting words with Hadoop: running the program

➢ Writing MapReduce Drivers, Mappers and Reducers in Java

➢ Real-world “MapReduce” problems

Writing a MapReduce Program and Running a MapReduce Job

➢ Java WordCount Code Walkthrough
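
The word-count flow covered in this module (map, group by key, reduce) can be illustrated with a minimal sketch. It is written in plain Python and runs in memory on a single machine; it does not use the Hadoop API, and it is not the Java code taught in the course. The function names `map_words`, `shuffle`, `reduce_counts` and `word_count` are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(word, counts):
    """Reduce step: collapse the list of counts for one word into a total."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of text lines."""
    mapped = [pair for line in lines for pair in map_words(line)]
    grouped = shuffle(mapped)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["big data big ideas", "hadoop stores big data"]
    print(word_count(sample))
```

In a real Hadoop job the map and reduce steps run on different nodes and the framework performs the grouping between them; here all three steps are ordinary function calls so the data flow is easy to follow.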

Module 6 – HADOOP ECOSYSTEM

➢ Hive

➢ Sqoop

➢ Pig

Module 7 – EXTENDED SUBJECTS ON HIVE

➢ Installing Hive

➢ Introduction to Apache Hive

➢ Getting data into Hive

➢ Hive’s architecture

➢ Hive Query Language (HQL)

➢ Query execution

➢ Programming practices and projects in Hive

➢ Troubleshooting

➢ Hive Programming

Module 8 – EXTENDED SUBJECTS ON PIG

➢ Introduction to Apache Pig

➢ Installing Pig

➢ Pig architecture

➢ Pig Latin – Reading and writing data using Pig

➢ Programming with Pig: loading data and executing data processing statements
