Apache Hadoop Developer’s Track @ Santa Clara, CA

This class is being offered in partnership with the Linux Professional Institute.

The Apache Hadoop Developer’s track deeply explores Hadoop architecture, the surrounding ecosystem, and implementation considerations.

This track builds Hadoop literacy through core-concept instruction, hands-on lab work, and detailed class discussion.

The only way to meet industry demands is with strong fundamentals, and the Apache Hadoop Developer’s track provides that crucial foundation.

The core concepts taught in this track include:

  • The importance of Hadoop in today’s world
  • The technologies underpinning MapReduce, Hive & Pig
  • Overview of the MapReduce, Hive & Pig programming models
  • Appropriate, and inappropriate, application environments for these programming paradigms
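The MapReduce paradigm at the heart of this list can be previewed with a short sketch in plain Python (no Hadoop required): map each input record to key/value pairs, shuffle the pairs by key, then reduce each group. The function names here are illustrative only and are not part of any Hadoop API.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for each word in a line of text."""
    for word in line.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle: group all values by key, as Hadoop does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: sum the counts emitted for one word."""
    return (key, sum(values))

lines = ["Hadoop stores data in HDFS", "Hadoop processes data with MapReduce"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts["hadoop"])  # 2 -- each sample line mentions Hadoop once
```

On a real cluster the map and reduce functions run in parallel across many machines, and the shuffle is performed by the framework; the data flow, however, is exactly the one sketched above.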

The instructors review Hadoop’s essential server components and detail their relation to MapReduce, Hive & Pig programming. Learners explore the integrated stack through hands-on labs. Hadoop tuning parameters, definitive guidelines for distributed cluster setup, MapReduce, Hive & Pig programming, and real-time monitoring are all covered in this track.

Distributed multinode Hadoop clusters are provided by ClustersTogo.com.
The Hadoop clusters can be accessed and worked on from any browser, with no need to download, install, or set up anything.

The Apache Hadoop Developer’s track is the ultimate introductory experience for future industry players. It is thorough, accessible, and industry-driven. Students can make a fun and challenging weekend of it and emerge empowered with foundational knowledge.

The Apache Hadoop Developer’s track is a core component of the “Technology Series” of classes.

Lab Work

The Apache Hadoop Developer’s Track is distinct from comparable industry classes because lab work comprises over 60 percent of the total course work.
This track places learners in the hands-on industry hot seat within a supportive class environment. Students are guided by Big Data practitioners with years of industry experience who can answer detailed technical questions. Students get the dual benefit of working on their own Hadoop clusters in a team setting during class while also being able to work on the clusters on their own time. Students can excel individually or in groups, as they prefer.
These clusters are provided by ClustersTogo.com.

Lab training uses clickstream and Twitter data in the MapReduce, Hive & Pig labs.
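As a taste of the kind of reporting done in the Hive lab, the sketch below computes page-view counts from a few sample clickstream records in plain Python; in the lab itself the same aggregation would be expressed in HiveQL as a GROUP BY query. The sample records and their field layout are invented here purely for illustration.

```python
from collections import Counter

# Invented sample clickstream records: (user_id, page, timestamp)
clicks = [
    ("u1", "/home", "2013-10-05T09:01:00"),
    ("u2", "/home", "2013-10-05T09:02:10"),
    ("u1", "/docs", "2013-10-05T09:03:30"),
    ("u3", "/home", "2013-10-05T09:04:00"),
]

# Equivalent in spirit to: SELECT page, COUNT(*) FROM clicks GROUP BY page
views = Counter(page for _, page, _ in clicks)
print(views.most_common(1))  # [('/home', 3)]
```

Hive lets the same logic run over terabytes of such records by compiling the SQL-like query down to MapReduce jobs on the cluster.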


Prerequisites

Basic Unix skills, basic Java programming, and SQL knowledge.

Course Duration

1 Full Day

Class Date & Time

10/05/2013 from 9:00 AM to 5:00 PM
10/19/2013 from 9:00 AM to 5:00 PM
11/02/2013 from 9:00 AM to 5:00 PM
11/16/2013 from 9:00 AM to 5:00 PM
12/07/2013 from 9:00 AM to 5:00 PM

Class Location

  • Onsite :
    3200 Coronado Dr, Santa Clara, CA 95054
  • Online :
Access information will be sent prior to the class


Who Should Attend

Developers, Business Analysts, Managers, Administrators


Course Agenda

A Day Prior to the Class: 6 to 7 PM PST

  • Meet and Greet students – Online
  • Hadoop Setup – Support
  • ClustersTogo.com – Overview

Day 1

  • Hadoop Intro and Architecture
  • Hadoop Ecosystem
  • Reporting & ETL with Hive & Pig
  • MapReduce Programming & Performance Monitoring
  • HBase – Random Access vs. Hadoop’s Batch Processing
  • HDFS File structure – read/write data flow
  • HDFS Lab
  • Hadoop Administration – High Level Overview
  • Demo Hadoop Administration – Features
    –   Fsck, TestDFSIO, benchmarking, configuration files, etc.
  • Hive – Concepts and Reporting
  • Hive Lab – DDLs, DMLs, data types, Join
  • Pig – Concepts
  • Pig Lab – Pig Latin and ETL
  • MapReduce Programming – Concepts
  • MapReduce Programming Lab
  • Machine Learning – Introduction with use cases
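The HDFS read/write data flow covered on Day 1 revolves around files being split into fixed-size blocks that are replicated across DataNodes. The toy function below, written purely for illustration, computes how a file of a given size would be divided; the 64 MB block size and 3x replication used as defaults mirror common Hadoop 1.x settings, but both are cluster-configurable (dfs.block.size, dfs.replication).

```python
def hdfs_blocks(file_size, block_size=64 * 1024 * 1024, replication=3):
    """Return (number of blocks, total raw bytes stored) for a file.

    Illustrative only: block_size and replication mimic common HDFS
    defaults but are set per cluster in its configuration files.
    """
    num_blocks = -(-file_size // block_size)  # ceiling division
    return num_blocks, file_size * replication

blocks, stored = hdfs_blocks(200 * 1024 * 1024)  # a 200 MB file
print(blocks)  # 4 blocks: three full 64 MB blocks plus one 8 MB block
print(stored)  # 600 MB of raw storage under 3x replication
```

This arithmetic is why small files are discouraged on HDFS: every file occupies at least one block entry in the NameNode's metadata, regardless of its size.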

A Day After the Class: 6 to 7 PM PST

  • Follow Up Questions and Support

Recommended Readings

  1. Hadoop: The Definitive Guide – by Tom White
  2. Programming Pig – by Alan Gates