Real Life End-to-End Development & Deployment of a Hadoop Ecosystem.
A hands-on Hadoop workshop that aims to create Hadoop power users through labs built around real-life Hadoop usage patterns commonly seen in the industry.
This Workshop showcases:
- Real life usage of Hadoop using Hive & Pig
- Options for the data lifecycle: capturing, analyzing, storing & archiving data
- Various patterns of ETL into HDFS
Students will do the labs on real Hadoop clusters, one per student. They will learn the concepts and also do the high-level implementation using Hive/Pig and MapReduce programs.
Click Stream Analysis
In this part of the lab, students will do a ‘Click Stream Analysis’ workload.
We will simulate an online ad-serving agency by tracking the performance of an ad campaign: how many views we got versus how many clicks.
This lab involves:
- Exploring options for ingesting clickstream data into Hadoop HDFS
- Analyzing the logs using Pig and MapReduce Programs
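The views-versus-clicks count described above can be sketched locally in plain Python, written in a map/reduce shape. This is only an illustration, not the workshop's lab code: the log format (`timestamp,campaign_id,event`), the sample records, and the function names `map_event`, `reduce_counts`, and `click_through_rate` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical clickstream records: timestamp,campaign_id,event
# (the format and the values are assumptions for illustration only)
SAMPLE_LOG = [
    "t1,campaign_a,view",
    "t2,campaign_a,view",
    "t3,campaign_a,click",
    "t4,campaign_b,view",
    "t5,campaign_a,view",
    "t6,campaign_b,view",
]


def map_event(line):
    """Map step: turn one log line into (campaign_id, (views, clicks))."""
    parts = line.strip().split(",")
    if len(parts) != 3:
        return None  # skip malformed lines
    _, campaign, event = parts
    if event == "view":
        return campaign, (1, 0)
    if event == "click":
        return campaign, (0, 1)
    return None  # skip unknown event types


def reduce_counts(pairs):
    """Reduce step: sum views and clicks per campaign."""
    totals = defaultdict(lambda: [0, 0])
    for campaign, (views, clicks) in pairs:
        totals[campaign][0] += views
        totals[campaign][1] += clicks
    return {campaign: tuple(counts) for campaign, counts in totals.items()}


def click_through_rate(views, clicks):
    """Clicks divided by views; 0.0 when there were no views."""
    return clicks / views if views else 0.0


mapped = [pair for pair in map(map_event, SAMPLE_LOG) if pair is not None]
campaign_totals = reduce_counts(mapped)

for campaign, (views, clicks) in sorted(campaign_totals.items()):
    rate = click_through_rate(views, clicks)
    print(f"{campaign}: views={views} clicks={clicks} ctr={rate:.2f}")
```

On a cluster, the same per-campaign grouping and summing would be done by a Pig script or a MapReduce job over files in HDFS rather than over an in-memory list.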
Order Processing in a Hadoop Data Warehouse
In this lab, we will simulate a scenario where order files are processed in a Hadoop data warehouse: the data is loaded with techniques such as the HBase bulk loader and then exposed to Hive for HQL analysis.
This lab involves:
- Loading order data into Hadoop
- Bulk loading the data into HBase
- Creating external tables in Hive on top of the HBase tables
- Running analytics on the Hive order tables
- Running MapReduce jobs to filter the order data and extract subsets such as “Delivered orders” only
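The last step, keeping only delivered orders, is essentially a map-only filter. A minimal local sketch in Python follows; the record layout (`order_id,customer_id,status,amount`), the sample rows, and the helper `filter_by_status` are assumptions for illustration and are not taken from the lab materials.

```python
# Hypothetical order records: order_id,customer_id,status,amount
# (layout and values are assumptions for illustration only)
SAMPLE_ORDERS = [
    "1001,c01,Delivered,25.00",
    "1002,c02,Shipped,10.50",
    "1003,c01,Delivered,7.25",
    "1004,c03,Cancelled,99.99",
]

STATUS_FIELD = 2  # zero-based position of the status column (assumed)


def filter_by_status(lines, wanted="Delivered"):
    """Yield only the records whose status column equals `wanted`."""
    for line in lines:
        fields = line.rstrip("\n").split(",")
        # Ignore records that are too short to contain a status column.
        if len(fields) > STATUS_FIELD and fields[STATUS_FIELD].strip() == wanted:
            yield line


delivered = list(filter_by_status(SAMPLE_ORDERS))

for record in delivered:
    print(record)
```

Because each record is kept or dropped on its own, no reduce step is needed; in a MapReduce job the same test would sit in the mapper.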
Developers, database administrators, data analytics professionals, data architects, managers.
Developers with basic understanding of Hadoop, MapReduce, Hive, Pig and HBase.
Developers with basic understanding of database and ACID transactions.
A full-day class: 9am to 5pm
3200 Coronado Avenue, Santa Clara, CA
Please contact us at firstname.lastname@example.org. We also offer our classes remotely via web conferencing.