Apache Spark Developer Training

Overview

Apache Spark Developer Training is a 5-day course aimed at experienced programmers who want to explore a next-generation big data framework. Put your existing skills to work solving big data problems with Apache Spark.

The course takes a practitioner's approach to the content. It takes you from basic syntax and semantics to advanced problem solving with Spark. We focus heavily on hands-on work to make sure every concept is clear, and we stress building a distributed mindset from day one. Each section has a practical focus, mixing presentation with in-depth hands-on labs and exercises.

If you are an experienced developer who wants to take your first steps in big data, or an experienced Hadoop developer who wants to widen your horizons, this course is right for you.

Prerequisites

  • Programming experience in Java.
  • Basic familiarity with Hadoop is highly recommended.
  • Prior knowledge of Spark or Scala is not required.

Course content

  • Introduction
    • Why second-generation frameworks?
    • Introduction to Spark
    • Spark Architecture
    • Spark on a cluster
  • Scala session
    • Why Scala?
    • Hands-on Scala features
    • Type inference
    • Higher order functions
    • Collections and Combinators
    • Lazy evaluation
    • Implicits
  • Spark API hands-on
    • RDD
    • map, flatMap, filter
    • Hadoop RDD
    • Pair RDD
    • Double RDD
    • Caching
    • Join
  • Advanced Spark operations
    • Aggregate
    • fold
    • mapPartitions
    • glom
    • Accumulators
    • Broadcast variables
  • Anatomy of a Spark RDD
    • Splits
    • Localization
    • Serialization
    • Transformations vs Actions
  • Integration with HDFS
  • Shark and other ecosystem projects

Course Summary

Length

5 days

Audience

Developers with Hadoop experience who want to understand Apache Spark and its ecosystem.

Frameworks covered

  • Spark
  • Spark Streaming
  • Spark SQL
  • Spark on YARN
  • MLlib
  • GraphX

Public and Corporate courses

We run public courses every few months. Public courses are open to all. Get in touch to find out when the next course is scheduled.

Corporate courses are run exclusively for your team. You have complete control over the syllabus and schedule. Let us know your custom requirements on the booking form and we'll customize the material appropriately.