Recommendation Engines – End-to-End Development & Deployment.

A hands-on Hadoop workshop that focuses on applying Hadoop to a real-world business problem involving Big Data. "Recommendations" for products, services or content are a common feature on many sites these days, and this workshop delves into developing and deploying a recommendation engine. The workshop will cover a series of data mining algorithms implemented on Hadoop to solve recommendation problems. It will start with the cold start problem, which involves matching user profiles to products, followed by the warm start problem, which is solved by finding similarities between products. These solutions are based on distance-based algorithms in a multidimensional feature space. It will then cover a set of recommendation solutions based on social or user behavior data. In order of increasing complexity, these include simple cosine and Jaccard similarity, Slope One algorithms and finally collaborative filtering. The workshop will draw on an actual implementation by the instructor, which will be available as an open source project on GitHub.
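As a rough illustration of the two simplest similarity measures named above, the following is a minimal single-machine sketch in Python. It is not the instructor's Hadoop implementation; the function names, the dictionary-based rating format and the sample data are all assumptions made for this example.

```python
import math


def jaccard_similarity(items_a, items_b):
    """Jaccard similarity of two item collections: |A & B| / |A | B|."""
    a, b = set(items_a), set(items_b)
    union = a | b
    if not union:
        return 0.0  # two empty sets: treat as "no similarity" by convention
    return len(a & b) / len(union)


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity of two sparse rating vectors (dict: user -> rating)."""
    shared_users = set(ratings_a) & set(ratings_b)
    dot = sum(ratings_a[u] * ratings_b[u] for u in shared_users)
    norm_a = math.sqrt(sum(v * v for v in ratings_a.values()))
    norm_b = math.sqrt(sum(v * v for v in ratings_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a product with no ratings is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical data: which users bought each product, and how they rated it.
buyers_of_book = ["alice", "bob", "carol"]
buyers_of_film = ["bob", "carol", "dave"]
book_ratings = {"alice": 3, "bob": 4}
film_ratings = {"alice": 4, "bob": 3}

print(jaccard_similarity(buyers_of_book, buyers_of_film))
print(cosine_similarity(book_ratings, film_ratings))
```

Jaccard similarity only needs to know whether a user interacted with a product, while cosine similarity also uses the rating values, which is one reason the two are often presented together as a starting point.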

Prerequisites


Java, Hadoop MapReduce and a background in distributed computation.

Audience


Developers, Database administrators, Data Analytics professionals, Data architects, Managers.

Class Date


May 12th 2013

Class Duration


8 hours

Class Location


3200 Coronado Dr, Santa Clara, CA 95054

Registration


Option 1: Pay using PayPal at training@thirdeyecss.com 24 hours before the class.
Option 2: Send us a check payable to "Third Eye CSS" at the mailing address: 5201 Great America Parkway, Suite 320, Santa Clara, CA 95054. The check must be received 24 hours before the class start time.
Option 3:

Contact Information:


jeetadas@thirdeyecss.com 408-256-3282

Real Life End-to-End Development & Deployment of a Hadoop Ecosystem.

A hands-on Hadoop workshop that focuses on creating Hadoop power users, with labs built around real-life Hadoop usage patterns that are common in the industry. This workshop showcases:
  • Real life usage of Hadoop using Hive & Pig
  • Options for the data lifecycle: capture, analyze, store & archive data
  • Various patterns of ETL into HDFS
Students will do the labs on real Hadoop clusters, one per student. They will learn the concepts and also do the high-level implementation using Hive/Pig and MapReduce programs.

Click Stream Analysis


In this part of the lab, students will run a ‘Click Stream Analysis’ workload. We will simulate an online ad serving agency by tracking the performance of an ad campaign: how many views it received versus how many clicks. This lab involves:
  • Exploring options for ingesting clickstream data into Hadoop HDFS
  • Analyzing the logs using Pig and MapReduce Programs
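The views-versus-clicks comparison described above can be sketched as a map step and a reduce step. The Python below is a minimal local illustration, not the lab's Pig or MapReduce code; the tab-separated log layout (timestamp, campaign id, event type) and all names in it are assumptions made for this example.

```python
from collections import defaultdict


def map_event(line):
    """Map step: turn one log line into (campaign_id, (views, clicks))."""
    fields = line.strip().split("\t")
    if len(fields) != 3:
        return None  # skip malformed lines
    _timestamp, campaign_id, event = fields
    if event == "view":
        return campaign_id, (1, 0)
    if event == "click":
        return campaign_id, (0, 1)
    return None  # skip unknown event types


def reduce_counts(pairs):
    """Reduce step: sum views and clicks per campaign."""
    totals = defaultdict(lambda: [0, 0])
    for pair in pairs:
        if pair is None:
            continue
        campaign_id, (views, clicks) = pair
        totals[campaign_id][0] += views
        totals[campaign_id][1] += clicks
    return {cid: (v, c) for cid, (v, c) in totals.items()}


def click_through_rate(views, clicks):
    """Clicks divided by views; 0.0 when the campaign had no views."""
    return clicks / views if views else 0.0


# Hypothetical clickstream sample (assumed format, not the lab's real data).
sample_log = [
    "2013-05-12T09:00:01\tcampaign_a\tview",
    "2013-05-12T09:00:02\tcampaign_a\tview",
    "2013-05-12T09:00:03\tcampaign_a\tclick",
    "2013-05-12T09:00:04\tcampaign_b\tview",
    "2013-05-12T09:00:05\tcampaign_a\tview",
    "2013-05-12T09:00:06\tcampaign_a\tview",
    "not a valid line",
]

totals = reduce_counts(map_event(line) for line in sample_log)
for campaign_id, (views, clicks) in sorted(totals.items()):
    print(campaign_id, views, clicks, click_through_rate(views, clicks))
```

Keeping the map step free of shared state is what would let the same logic run in parallel across log files on a cluster; only the reduce step needs to see all the pairs for a given campaign.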
 

Order Processing in a Hadoop Data Warehouse


In this lab, we will simulate a scenario where order files are processed in a Hadoop data warehouse using techniques such as the HBase bulk loader, then migrated to Hive for HQL analysis. This lab involves:
  • Loading order data into Hadoop
  • Bulk loading data into HBase
  • Creating external tables in Hive from HBase tables
  • Running analytics on Hive order tables
  • Running MapReduce to filter order data and extract subsets such as “Delivered orders” only.
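The final step in the list above, keeping only delivered orders, is essentially a map-only filter. Here is a minimal Python sketch of that idea, run locally rather than on Hadoop; the CSV layout (order id, customer id, status, amount), the `DELIVERED` status value and the function name are assumptions made for this example.

```python
import csv


def filter_delivered(lines, status_column=2, wanted="DELIVERED"):
    """Map-only filter: yield rows whose status field equals `wanted`."""
    for row in csv.reader(lines):
        if len(row) <= status_column:
            continue  # skip short or malformed rows
        if row[status_column].strip().upper() == wanted:
            yield row


# Hypothetical order records (assumed layout: id, customer, status, amount).
sample_orders = [
    "1001,cust_17,DELIVERED,49.90",
    "1002,cust_04,SHIPPED,15.00",
    "1003,cust_17,delivered,8.25",
    "1004,cust_22,CANCELLED,120.00",
    "broken line",
]

delivered = list(filter_delivered(sample_orders))
for order in delivered:
    print(order)
```

Because each row is judged on its own, a filter like this needs no reduce phase, which is why it maps naturally onto a map-only job.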
 

Audience


Developers, Database administrators, Data Analytics professionals, Data architects, Managers.

Prerequisites


Developers with a basic understanding of Hadoop, MapReduce, Hive, Pig and HBase, and of databases and ACID transactions.

Course Duration:


A whole-day class: 9 am to 5 pm

Location:


3200 Coronado Avenue, Santa Clara, CA

Registration:


Please contact us at info@bdcuniversity.com. We also offer our classes via web conferencing.

Contact Information:


jeetadas@thirdeyecss.com 408-256-3282