HDP Developer: Enterprise Apache Spark 1

This course is designed as an entry point for developers who need to create applications to analyze Big Data stored in Apache Hadoop using Spark.

About This Course

Overview:
This course is designed as an entry point for developers who need to create applications to analyze Big Data stored in Apache Hadoop using Spark. Topics include an overview of the Hortonworks Data Platform (HDP), including HDFS and YARN; using Spark Core APIs for interactive data exploration; Spark SQL and DataFrame operations; Spark Streaming and DStream operations; data visualization, reporting, and collaboration; performance monitoring and tuning; building and deploying Spark applications; and an introduction to the Spark Machine Learning Library.

Target Audience:
Software engineers who want to develop in-memory applications for time-sensitive and highly iterative workloads in an enterprise HDP environment.

Prerequisites:
Students should be familiar with programming principles and have previous experience in software development using either Python or Scala. Previous experience with data streaming, SQL, and HDP is helpful but not required.

Format:
Live instructor-led lecture and labs

How to Register:

  1. Click the "Purchase" button at the top of the page to initiate your purchase.
  2. After you have completed your purchase and registration, log in to your account and select the event you wish to attend from the classes scheduled below.

HDP Developer: Enterprise Apache Spark 1 - Live Training Schedule

Event | Date | Spaces left
HDP Developer: Enterprise Apache Spark 1 (Virtual) - 4 | Feb. 5, 2018, 10 a.m. - Feb. 8, 2018, 6 p.m. EST | 25
HDP Developer: Enterprise Apache Spark 1 (Virtual) - 4 | March 19, 2018, 10 a.m. - March 22, 2018, 6 p.m. EDT | 25

Curriculum

  • Course Logistics
  • HDP Developer: Enterprise Apache Spark 1 - Live Training Schedule
  • Lesson 1:
  • HDP Overview for Developers
  • Lab Guide: Pre-lab Set Up
  • Lab Guide: Using HDFS Commands
  • Lesson 2:
  • Overview of Zeppelin and Spark
  • Lab Guide: Introduction to Spark REPLs and Zeppelin
  • Lesson 3:
  • Working with RDDs
  • Lab Guide: Create and Manipulate RDDs (Scala)
  • Lab Guide: Create and Manipulate RDDs (Python)
  • Lesson 4:
  • Pair RDDs
  • Lab Guide: Create and Manipulate Pair RDDs (Scala)
  • Lab Guide: Create and Manipulate Pair RDDs (Python)
  • Lesson 5:
  • Spark Streaming
  • Lab Guide: Basic Spark Streaming (Scala)
  • Lab Guide: Basic Spark Streaming (Python)
  • Lab Guide: Basic Spark Streaming Transformations (Scala)
  • Lab Guide: Basic Spark Streaming Transformations (Python)
  • Lab Guide: Spark Streaming Windows Transformations (Scala)
  • Lab Guide: Spark Streaming Windows Transformations (Python)
  • Lesson 6:
  • Spark SQL
  • Lab Guide: Create and Save DataFrames (Scala)
  • Lab Guide: Create and Save DataFrames (Python)
  • Lab Guide: Working with Tables and DataFrames (Scala)
  • Lab Guide: Working with Tables and DataFrames (Python)
  • Lesson 7:
  • Data Visualization with Zeppelin
  • Lab Guide: Data Visualization, Reporting, and Collaboration using Zeppelin (Scala)
  • Lab Guide: Data Visualization, Reporting, and Collaboration using Zeppelin (Python)
  • Lesson 8:
  • Job Monitoring
  • Lab Guide: Job Monitoring (Scala)
  • Lab Guide: Job Monitoring (Python)
  • Lesson 9:
  • Performance Tuning
  • Lab Guide: Performance Tuning (Scala)
  • Lab Guide: Performance Tuning (Python)
  • Lesson 10:
  • Build and Submit Spark Applications
  • Lab Guide: Build and Submit Applications to YARN (Scala)
  • Lab Guide: Build and Submit Applications to YARN (Python)
  • Lesson 11:
  • Introduction to Machine Learning with Spark
  • Lab Guide: Machine Learning Walkthrough
  • Wrapping Up
  • Course & Instructor Survey
