HDP Developer: Apache Pig and Hive

This course is designed for developers who need to create applications to analyze Big Data stored in Apache Hadoop using Pig and Hive.

About this course

Overview:
This course is designed for developers who need to create applications to analyze Big Data stored in Apache Hadoop using Pig and Hive. Topics include Hadoop, YARN, HDFS, MapReduce, data ingestion, workflow definition, and using Pig and Hive to perform data analytics on Big Data. Labs are executed on a 7-node HDP cluster.

Target Audience:
Software developers who need to understand and develop applications for Hadoop.

Prerequisites:
Students should be familiar with programming principles and have experience in software development. SQL knowledge is also helpful. No prior Hadoop knowledge is required.

Format:
Live Instructor
50% Lecture
50% Hands-On Labs

How to Register:

  1. Click the "Purchase" button at the top of the page to start your purchase.
  2. After completing your purchase and registration, log in to your account and select the event you wish to attend from the classes scheduled below.

HDP Developer: Apache Pig and Hive - Live Training Schedule

Event | Date | Spaces left
HDP Developer: Apache Pig and Hive (Virtual) - 4 | Dec. 18, 2017, 10 a.m. - Dec. 21, 2017, 6 p.m. EST | 15
HDP Developer: Apache Pig and Hive (Virtual) - 4 | Feb. 12, 2018, 10 a.m. - Feb. 15, 2018, 6 p.m. EST | 25

Curriculum

  • Course Logistics
  • HDP Developer: Apache Pig and Hive - Live Training Schedule
  • Downloadable VM Setup Guide
  • AWS Guacamole Setup Guide
  • Lesson 1
  • Lesson 1: Understanding Hadoop
  • Lab Guide: Starting an HDP 2.3 Cluster
  • Lesson 2
  • Lesson 2: Introduction to the Hadoop Distributed File System (HDFS)
  • Demonstration: Understanding Block Storage
  • Lab Guide: Using HDFS Commands
  • Lesson 3
  • Lesson 3: Inputting Data Into HDFS
  • Lab Guide: Importing RDBMS Data into HDFS
  • Lab Guide: Exporting HDFS Data to an RDBMS
  • Lab Guide: Importing Log Data into HDFS using Flume
  • Lesson 4
  • Lesson 4: The MapReduce Framework
  • Demonstration: Understanding MapReduce
  • Lab Guide: Running a MapReduce Job
  • Lesson 5
  • Lesson 5: Introduction to Pig
  • Demonstration: Understanding Pig
  • Lab Guide: Getting Started with Pig
  • Lab Guide: Exploring Data with Pig
  • Lesson 6
  • Lesson 6: Advanced Pig Programming
  • Lab Guide: Splitting a Dataset
  • Lab Guide: Joining Datasets with Pig
  • Lab Guide: Preparing Data for Hive
  • Demonstration Guide: Computing PageRank
  • Lab Guide: Analyzing Clickstream Data
  • Lab Guide: Analyzing Stock Market Data using Quantiles
  • Lesson 7
  • Lesson 7: Hive Programming
  • Lab Guide: Understanding Hive Tables
  • Demonstration: Understanding Partitions and Skew
  • Lab Guide: Analyzing Big Data with Hive
  • Demonstration: Computing ngrams
  • Lab Guide: Joining Datasets in Hive
  • Lab Guide: Computing ngrams of Emails in Avro Format
  • Lesson 8
  • Lesson 8: Using HCatalog
  • Lab Guide: Using HCatalog with Pig
  • Lesson 9
  • Lesson 9: Advanced Hive Programming
  • Lab Guide: Advanced Hive Programming
  • Lesson 10
  • Lesson 10: Hadoop 2 and YARN
  • Lab Guide: Running a YARN Application
  • Lesson 11
  • Lesson 11: Introducing Apache Spark
  • Lesson 12
  • Lesson 12: Programming with Apache Spark
  • Lab Guide: Getting Started with Apache Spark
  • Lesson 13
  • Lesson 13: Spark SQL and DataFrames
  • Lab Guide: Exploring Spark SQL
  • Lesson 14
  • Lesson 14: Defining Workflow with Oozie
  • Lab Guide: Defining an Oozie Workflow
  • Wrapping Up
  • Course & Instructor Survey
