Machine Learning with Apache Spark WA2610 - Course Book

Delivery Information:

You will receive the required software setup for installation 48 hours from the time of purchase.

Version 1.0

Product Type: Courseware
Level: Foundation
Duration: 1 Day

Participants should have general knowledge of statistics and programming.

Language: English (en-US)
Delivery Format: eBook

Delivery Information

Delivered as a voucher. You can access the vouchers and assign them from Active Vouchers on myLeapest, or use the Classes function to assign the vouchers to a group of learners.

Product Content

This product contains the following items. Upon purchasing, you will get access to all available versions prior to the latest version.

Course Outline :

Chapter 1.

  • Machine Learning Algorithms
  • Supervised vs Unsupervised Machine Learning
  • Supervised Machine Learning Algorithms
  • Unsupervised Machine Learning Algorithms
  • Choose the Right Algorithm
  • Life-cycles of Machine Learning Development
  • Classifying with k-Nearest Neighbors (SL)
  • k-Nearest Neighbors Algorithm
  • k-Nearest Neighbors Algorithm
  • The Error Rate
  • Decision Trees (SL)
  • Random Forests
  • Unsupervised Learning Type: Clustering
  • K-Means Clustering (UL)
  • K-Means Clustering in a Nutshell
  • Regression Analysis
  • Logistic Regression
  • Summary

Chapter 2.

  • Introduction to Functional Programming
  • What is Functional Programming (FP)?
  • Terminology: Higher-Order Functions
  • Terminology: Lambda vs Closure
  • A Short List of Languages that Support FP
  • FP with Java
  • FP with JavaScript
  • Imperative Programming in JavaScript
  • The JavaScript map (FP) Example
  • The JavaScript reduce (FP) Example
  • Using reduce to Flatten an Array of Arrays (FP) Example
  • The JavaScript filter (FP) Example
  • Common Higher-Order Functions in Python
  • Common Higher-Order Functions in Scala
  • Elements of FP in R
  • Summary

Chapter 3.

  • Introduction to Apache Spark
  • What is Apache Spark
  • A Short History of Spark
  • Where to Get Spark?
  • The Spark Platform
  • Spark Logo
  • Common Spark Use Cases
  • Languages Supported by Spark
  • Running Spark on a Cluster
  • The Driver Process
  • Spark Applications
  • Spark Shell
  • The spark-submit Tool
  • The spark-submit Tool Configuration
  • The Executor and Worker Processes
  • The Spark Application Architecture
  • Interfaces with Data Storage Systems
  • Limitations of Hadoop's MapReduce
  • Spark vs MapReduce
  • Spark as an Alternative to Apache Tez
  • The Resilient Distributed Dataset (RDD)
  • Spark Streaming (Micro-batching)
  • Spark SQL
  • Example of Spark SQL
  • Spark Machine Learning Library
  • GraphX
  • Spark vs R
  • Summary

Chapter 4.

  • The Spark Shell
  • The Spark Shell UI
  • Spark Shell Options
  • Getting Help
  • The Spark Context (sc) and SQL Context (sqlContext)
  • The Shell Spark Context
  • Loading Files
  • Saving Files
  • Basic Spark ETL Operations
  • Summary

Chapter 5.

  • Spark Machine Learning Library
  • What is MLlib?
  • Supported Languages
  • MLlib Packages
  • Dense and Sparse Vectors
  • Labeled Point
  • Python Example of Using the LabeledPoint Class
  • LIBSVM format
  • An Example of a LIBSVM File
  • Loading LIBSVM Files
  • Local Matrices
  • Example of Creating Matrices in MLlib
  • Distributed Matrices
  • Example of Using a Distributed Matrix
  • Classification and Regression Algorithm
  • Clustering
  • Summary

Chapter 6.

  • Text Mining
  • What is Text Mining?
  • The Common Text Mining Tasks
  • What is Natural Language Processing (NLP)?
  • Some of the NLP Use Cases
  • Machine Learning in Text Mining and NLP
  • Machine Learning in NLP
  • TF-IDF
  • The Feature Hashing Trick
  • Stemming
  • Example of Stemming
  • Stop Words
  • Popular Text Mining and NLP Libraries and Packages
  • Summary

Lab Exercises

  • Lab 1. Learning the Lab Environment
  • Lab 2. The Spark Shell
  • Lab 3. Using Random Forests for Classification with Spark MLlib
  • Lab 4. Using k-means Algorithm from MLlib
  • Lab 5. Text Classification with Spark ML Pipeline

Target Audience :

Data Scientists, Business Analysts, Software Developers, IT Architects

Course Agenda :

  • Applied Data Science and Business Analytics
  • Machine Learning Algorithms, Techniques and Common Analytical Methods
  • Apache Spark Introduction
  • Spark’s MLlib Machine Learning Library

This Apache Spark training course has 3 hands-on labs, outlined below. The labs cover the spark-submit tool as well as the Apache Spark Shell, and they let you practice the following skills:

  • Lab 1 - Using the spark-submit Tool: Spark offers developers two ways of running applications: using the spark-submit tool and using the Spark Shell. In this lab, we will review what is involved in using the spark-submit tool.

  • Lab 2 - The Apache Spark Shell: The interactive development environment in Spark is provided by the Spark Shell (also known as a REPL: Read/Eval/Print Loop tool), which is available for Scala and Python developers (Java is not yet supported). The lab instructions apply to the Scala version of the Spark Shell.

  • Lab 3 - Using Random Forests for Classification with Spark MLlib: In this lab, we will learn how to use the Random Forests implementation from Spark's machine learning library, MLlib, to perform object classification. The Random Forests algorithm is regarded as one of the most successful supervised learning algorithms and can be used for both classification and regression. In our work we will use the Python version of the library, which provides an API similar to those implemented in Scala and Java. We will also use the spark-submit tool to submit the application from the command line rather than typing commands in the Spark Shell.

This Web Age Spark class can be delivered in a traditional classroom-style format. This Apache Spark training can also be delivered in a synchronous instructor-led format.
