
Apache Hadoop Developer’s Track – 1 Day Course
The Apache Hadoop Developer's Track 1 Day Course explores Hadoop architecture, ecosystem and implementation considerations in depth. This track stresses combined Hadoop literacy, built through core concept instruction, hands-on lab work and detailed class discussions. The only way to meet industry demands is with strong fundamentals, and the Apache Hadoop Developer's Track provides that crucial intellectual infrastructure. The core concepts taught in this track include:
- The importance of Hadoop in today's world
- The technological fuels of MapReduce, Hive & Pig
- Overview of MapReduce, Hive & Pig as an almost-universal template for Big Data analytics
- Appropriate and inappropriate application environments for these programming paradigms
Lab Work
The Apache Hadoop Developer's Track is distinct from other industry classes because lab work comprises over 60 percent of the total course work. This track places learners in the "hands-on" industry hot seat within a safe, nurturing class environment. Students are guided by Big Data gurus who have years of industry experience and can answer detailed technical questions. Students have the dual benefit of working on their own Hadoop clusters in a team setting during the class while also being able to work on the clusters in their own time. Students can work individually or in groups, as they prefer. These clusters are provided by ClustersTogo.com. Lab training uses Click Stream and Twitter data in the MapReduce, Hive & Pig labs.
Prerequisites
Basic Unix skills, basic Java programming & SQL knowledge.
Course Duration
1 Full Day
Class Date & Time
May 25th - 8:30 am to 5:30 pm
Course Location
- Onsite : 3200 Coronado Dr, Santa Clara, CA 95054
- Online : Access information will be sent prior to the class
Audience
Developers, Business Analysts, Managers, Administrators
Agenda
A Day Prior to the Class – 6 to 7pm PST
- Meet and Greet students – Online
- Hadoop Setup – Support
- Clusterstogo.com - Overview
- Hadoop Intro and Architecture
- Hadoop Ecosystem
- Reporting & ETL with Hive & Pig
- Map Reduce Programming & Performance Monitoring
- HBase - Random Access vs. Hadoop's Batch Processing
- HDFS File structure – read/write data flow
- Hadoop Administration – High Level Overview
- Demo Hadoop Administration – Features - Fsck, TestDFSIO, benchmarking, configuration files, etc.
- Lab - Map Reduce Programming Intro
- Lab – RDBMS to Hadoop using Sqoop
- Lab – Using Flume
- Hive - Concepts and Reporting
- Hive Lab - DDLs, DMLs, data types, join
- Pig Lab - Concepts
- Pig Lab – Pig Latin and ETL
- Follow Up Questions and Support
Recommended Readings
- Hadoop
- Hadoop: The Definitive Guide – by Tom White
- Programming Pig – by Alan Gates
Registration

FEATURED / Technology Series
Apache Hadoop: Architecture & Ecosystem – Jump Start
This course provides a basic understanding of the Hadoop architecture and its ecosystem. It is the first step for anyone aspiring to be a Big Data professional or who just wants to understand the Hadoop ecosystem. The course helps students transition from today's RDBMS-based structured data management to the file-based, unstructured data world of Hadoop. It covers the specifics of key parts of the Hadoop architecture, Hive, Pig and Map Reduce programming with real-life use cases.
Prerequisites
None! (Just your keen interest will do!)
Audience
Business & Management Personnel, Young Software Developers
Recommended Readings
- O'Reilly's ‘Hadoop’ book by Tom White
Class Date
May 11th 2013
Class Duration
4 hours
Class Location
3200 Coronado Dr, Santa Clara, CA 95054
Registration
Option 1: Pay using PayPal at training@thirdeyecss.com 24 hours before the class.
Option 2: Send us a check payable to "Third Eye CSS" at the mailing address: 3200 Coronado Dr, Santa Clara, CA 95054. The check must be received 24 hours before the class start time.
Option 3:
Contact Information:
For any additional information, please email jeetadas@thirdeyecss.com, or call or text (408) 306-8462.

4 Hour Classes / Technology Series
Map Reduce Programming – Deep Dive
An exhaustive class that covers all MapReduce concepts in depth. Students will learn:
- Parallel processing and functional programming as the foundation for Hadoop.
- How map and reduce work.
- How map and reduce collaborate through shuffle.
- HDFS fundamentals. Input, output formats.
- Simple examples of Map Reduce with Java & Map Reduce with Streaming.
- Anatomy of a Hadoop job: Job Submission & Execution.
- Compression, serialization.
- Configuration and tuning.
- Multiple map reduce jobs and Hadoop workflow.
- Monitoring and error handling.
- How to deal with complex Map Reduce examples.
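As a rough illustration of how map and reduce collaborate through shuffle, here is a minimal in-memory sketch in plain Python. It is not Hadoop code and does not use any Hadoop API. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example. In a real cluster the framework performs the shuffle across machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key before they reach reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    # Keys are sorted here only to make the output order predictable.
    return dict(reduce_phase(k, v) for k, v in sorted(grouped.items()))


if __name__ == "__main__":
    print(word_count(["big data big clusters", "data flows"]))
    # prints {'big': 2, 'clusters': 1, 'data': 2, 'flows': 1}
```

Each mapper sees only its own line and each reducer sees only the values for its own key, which is what lets the two phases run in parallel.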
Lab Work
Hands on lab exercises working with Big Data sets on a Hadoop cluster running on Amazon EC2.
Prerequisites
Basic Linux command line skills and server-side Java experience
Audience
Developers, Data Analytics professionals, Business Analysts, Managers
Recommended Readings
- O'Reilly's ‘Hadoop’ book by Tom White
- Hadoop tutorial on YDN
Class Date
May 26th 2013
Class Duration
4 hours Class
Class Location
3200 Coronado Drive, Santa Clara, CA 95054
Registration
Option 1: Pay using PayPal at training@thirdeyecss.com 24 hours before the class.
Option 2: Send us a check payable to "Third Eye CSS" at the mailing address: 3200 Coronado Dr, Santa Clara, CA 95054. The check must be received 24 hours before the class start time.
Option 3:
Contact Information:
Training Department training@thirdeyecss.com (408) 290-9949 – Ext 3

4 Hour Classes / FEATURED / Technology Series
Apache Hive & Pig – BI Developer
BI developers need to access, transform & load data sets. For performing these activities over Big Data sets in a Hadoop environment, Hive and Pig are extremely handy skills to have. In this one (1) day course, we will learn in depth about the architecture, design and development framework of Hive and Pig, including installation steps and performance tuning of Map Reduce programs, covering SessionLog data and other business subject areas. We will also learn to implement various analytics and ETL processes using Hive & Pig for Big Data, and go over the ecosystem of Hadoop data management tools & frameworks. The class includes hands-on labs where students work on actual Hadoop clusters & write Hive & Pig code to be ready for a Hive and Pig developer's role.
Lab Work
- Hive Labs
- Pig Labs
Prerequisites
Basic Linux command line skills. Knowledge of HDFS file system handling will help.
Audience
Developers, IT Administrators, Business Analysts, Data Scientists.
Course Duration:
A Whole Day Class: Duration 9:30am till 4:30pm, Saturdays or request a date.
Class Date & Time
March 9th 2013 - 9:00 am to 6:00 pm
Location:
Third Eye CSS's Training Center, 3200 Coronado Dr, Santa Clara, CA 95054
Registration:
Contact Information:
For any additional information, please contact us, or contact Jeeta at 408 306 8462 or by email at jeetadas@thirdeyecss.com.
FEATURED / Technology Series
Pentaho Big Data BI Developer – ETL & Report
This class has been created especially with non-programmers in mind. The typical audience would be BI developers who have been using front-end tools like Business Objects, Cognos, Informatica etc. This is a class with hands-on labs that gives students a very good head start in working with the Hadoop ecosystem without actually having to code in Map-Reduce.
- Execute Hadoop map/reduce with zero code
- Intuitive, visual drag and drop designer
- Widgets to perform a wide range of operations with Hadoop, HBase and Hive
- Design complex transformations that integrate Hadoop with other external systems like databases
- Ideal for non-programmers who still have to work with Hadoop
Labs
This class is heavily focused on lab work, with many exercises that involve orchestrating processes built in Pentaho for ETL and reporting.
Audience
BI developers who have been using front end tools like Business Objects, Cognos etc. and now would like to do so in the Big Data world.
Prerequisites
Knowledge of BI tools & report development
Course Duration
A Whole Day Class: 9am till 5pm, Saturdays. Our next class is on the 28th of April.
Location
2900 Gordon Ave, Suite 100-20 Santa Clara, CA 95051
Registration
To Register please contact us at info@bdcuniversity.com.
Contact Information:
info@bdcuniversity.com 408-256-3282

FEATURED / Technology Series
Hive Administration & HiveQL Analytics – Deep Dive
For a file-based system like Hadoop, developers need a mechanism to query data using a SQL-like language. This is where Hive comes in. This class covers all major areas of Hive with extensive labs, and gives students the knowledge they need to function effectively as Hive developers in the Big Data marketplace. This class covers the following areas:
- Installation
- Architecture
- Metastores
- Data Modeling
- UDF
- HiveQL - Basics & Advanced Concepts
- Integrating with HBase
Prerequisites
Basic Linux command line skills, DB knowledge, MPP architecture knowledge, HDFS is a must.
Recommended Next Class
Pig Basics and Advanced; Part of Hadoop BI Developer's Track
Audience
Developers, IT Administrators, Managers, Analysts, Data Scientist
What to bring to your class
Your computer and any SSH client, such as putty.exe
Recommended Readings
- O'Reilly's ‘Hadoop’ book by Tom White
Class Date
May 26th 2013
Class Duration
4 hours
Class Location
3200 Coronado Dr, Santa Clara, CA 95054, 408 306 8462
Price & Registration
Option 1: Pay using PayPal at training@thirdeyecss.com 24 hours before the class.
Option 2: Send us a check payable to "Third Eye CSS" at the mailing address: 5201 Great America Parkway, Suite 320, Santa Clara, CA 95054. The check must be received 24 hours before the class start time.
Option 3:
Contact Information:
Training Department training@thirdeyecss.com (408) 290-9949 – Ext 3

4 Hour Classes / FEATURED / Technology Series
Apache Cassandra – Data Modeling Concepts
Apache Cassandra is a highly scalable, high-performance and fault-tolerant distributed data infrastructure. Cassandra addresses both real-time and analytical big data problems, from write-intensive workloads to sub-millisecond caching-layer reads to analytical workloads involving petabytes of data using MapReduce. Offering distribution of data across multiple data centers and incremental scalability with no single point of failure, Cassandra is the logical choice when you need reliability without compromising performance. This is an introductory class that focuses on imparting the core concepts, architecture and design of Apache Cassandra.
Course Contents
- Overall Cassandra Architecture
- Cassandra Strengths and weaknesses
- Major and minor features
- Various Replica placement strategies
- CAP theorem
- ACID transactions
- Data Modeling: basic concepts, thinking in NoSQL mode, denormalization, some use cases
- Cassandra tools to manipulate Cassandra schema
Real Time Demos
This training session involves interactive demos on the following topics:
- DataStax OpsCenter demo in EC2 (how to manage Cassandra)
- Cassandra CLI
- Cassandra CQL
- DataStax Enterprise in EC2
- How to run Map Reduce on Cassandra in EC2
- How to use the same Cassandra cluster for real-time transactions and analytics
Prerequisites
Developers with a basic understanding of databases and ACID transactions
Audience
Developers, Database administrators, Data Analytics professionals, Data architects, Managers
Recommended Readings
Class Date
May 26th 2013
Class Duration
8 hours Class
Class Location
3200 Coronado Drive, Santa Clara, CA 95054
Registration
Option 1: Pay using PayPal at training@thirdeyecss.com 24 hours before the class.
Option 2: Send us a check payable to "Third Eye CSS" at the mailing address: 5201 Great America Parkway, Suite 320, Santa Clara, CA 95054. The check must be received 24 hours before the class start time.
Option 3:
Contact Information
Training Department training@thirdeyecss.com (408) 290-9949 – Ext 3

FEATURED / Technology Series
Apache HBase Developer – Architecture, Design & Implementation
Learn HBase, the NoSQL database built on top of Hadoop, from the architecture, design, modeling and development perspectives. HBase is used when you need random, real-time read/write access to Big Data sets. We will go over various real-life scenarios in this class, and how HBase provides linear and modular scalability along with consistent reads and writes. We will walk you through the Java APIs. An overview of the Block Cache and Bloom filters, and why real-time queries need them, will also be covered. We will do in-depth lab work to learn the skills needed to work with this powerful Big Data analytics tool.
Lab Work
We will work with an HBase cluster, look at configurations, and load & query data. We will start with creating a table and progress through hands-on practicals to a mid-advanced level.
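To give a feel for why Bloom filters matter for real-time reads, the following is a small, generic Bloom filter sketch in Python. It is not HBase's implementation and does not use any HBase API. The class name, the default sizes and the SHA-256 based hashing are all assumptions made for illustration. The idea is that a cheap in-memory check can say "definitely not here" for a row key, so a store can skip reading files that cannot contain it.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; real filters pack bits.
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        """Derive num_hashes bit positions for a key by seeding SHA-256."""
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        """Record a key by setting every bit position derived from it."""
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        """Return False if the key was never added; True means 'possibly present'."""
        return all(self.bits[pos] for pos in self._positions(key))


if __name__ == "__main__":
    row_keys = BloomFilter()
    row_keys.add("row-001")
    print(row_keys.might_contain("row-001"))  # always True for an added key
```

A lookup for a key that was never added usually returns False, but it can return True by chance, so a positive answer still has to be confirmed by reading the data.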
Prerequisites
Developers with Java knowledge and Hadoop, MapReduce knowledge.
Audience
Developers, Data Analytics professionals, Business Analysts, Managers
Recommended Readings
- HBase Architecture
Course Duration:
A whole day Weekend Class: 10am to 5pm
Location:
Third Eye's offices at 2900 Gordon Ave, Suite 100-20 Santa Clara, CA 95051
Registration:
Please contact us at info@bdcuniversity.com
Contact Information:
info@bdcuniversity.com 408-256-3282





