Real Life End-to-End Development & Deployment of a Hadoop Ecosystem.
A hands-on Hadoop workshop that aims to create Hadoop power users through labs built around real-life Hadoop usage patterns common in the industry.
This Workshop showcases:
- Real life usage of Hadoop using Hive & Pig
- Options for the data lifecycle: capture, analyze, store & archive data
- Various patterns of ETL into HDFS
Students will do the labs on real Hadoop clusters, one per student. They will learn the concepts and also build high-level implementations using Hive, Pig and MapReduce programs.

Click Stream Analysis
In this part of the lab, students will do a ‘Click Stream Analysis’ workload.
We will simulate an online ad-serving agency tracking the performance of an ad campaign: how many views it received versus how many clicks.
This lab involves:
- Exploring options for ingesting clickstream data into Hadoop HDFS
- Analyzing the logs using Pig and MapReduce Programs
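To give a feel for the views-versus-clicks analysis, here is a minimal sketch in plain Python written in a map/reduce style. The log format (tab-separated timestamp, campaign id and event type) and all function names are assumptions made for illustration; the actual lab data and Pig/MapReduce code may look different.

```python
from collections import defaultdict


def map_event(line):
    """Map step: emit ((campaign_id, event_type), 1) for a well-formed record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        return None  # skip malformed records
    _timestamp, campaign_id, event_type = fields  # assumed column order
    if event_type not in ("view", "click"):
        return None  # ignore event types we are not counting
    return (campaign_id, event_type), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts per campaign and event type."""
    totals = defaultdict(lambda: {"view": 0, "click": 0})
    for (campaign_id, event_type), count in pairs:
        totals[campaign_id][event_type] += count
    return dict(totals)


def summarize(lines):
    """Run the map step over every line, then reduce the emitted pairs."""
    pairs = (p for p in map(map_event, lines) if p is not None)
    return reduce_counts(pairs)


def click_through_rate(counts):
    """Clicks divided by views; 0.0 when a campaign has no views."""
    return counts["click"] / counts["view"] if counts["view"] else 0.0
```

Calling `summarize` on an iterable of log lines returns a per-campaign dictionary of view and click counts, and `click_through_rate` turns one campaign's counts into a ratio.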
Order Processing in a Hadoop data warehouse
In this lab, we will simulate a scenario where order files are processed in a Hadoop data warehouse using techniques such as HBase bulk loading, then migrated to Hive for HQL analysis.
This lab involves:
- Loading order data into Hadoop
- Bulk loading data into HBase
- Creating external tables in Hive on top of HBase tables
- Running analytics on Hive order tables
- Running MapReduce jobs to filter order data and extract subsets such as “Delivered orders” only
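The filtering step in the last item can be sketched as a map-only job. The example below is a hedged illustration in plain Python, not the lab's actual code: the comma-separated record layout (order id, customer id, status, amount), the `DELIVERED` status value and the names `STATUS_COLUMN` and `filter_orders` are all hypothetical.

```python
import csv

# Hypothetical position of the status field in each order record.
STATUS_COLUMN = 2


def filter_orders(lines, wanted_status="DELIVERED"):
    """Map-only filter: yield parsed rows whose status matches wanted_status."""
    for row in csv.reader(lines):
        if len(row) <= STATUS_COLUMN:
            continue  # skip records too short to carry a status
        if row[STATUS_COLUMN].strip().upper() == wanted_status:
            yield row
```

Because no aggregation is needed, a filter like this requires no reduce step; each input record is either passed through or dropped on its own.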
Audience
Developers, database administrators, data analytics professionals, data architects, managers.
Prerequisites
Developers with a basic understanding of Hadoop, MapReduce, Hive, Pig and HBase.
Developers with a basic understanding of databases and ACID transactions.
Course Duration:
A full-day class: 9am to 5pm
Location:
3200 Coronado Avenue, Santa Clara, CA
Registration:
Please contact us at info@bdcuniversity.com. We also offer our classes via web conferencing.
Contact Information:
jeetadas@thirdeyecss.com
408-256-3282




