Hadoop Programming on the Hortonworks Data Platform for Managers WA2622 - Course Book

Delivery Information:

You will receive the required software setup for installation 48 hours from the time of purchase.

Version 1.0

Product Type: Courseware
Level: Foundation
Duration: 2 Days

Participants should have general knowledge of programming.

Language: English (en-US)
Delivery Format: eBook

Delivery Information

Delivered as a voucher. You can access the vouchers and assign them from Active Vouchers on myLeapest, or you can use the Classes function to assign the vouchers to a group of learners.

Product Content

This product contains the following items. Upon purchasing, you will get access to all available versions prior to the latest version.

Course Outline :

Chapter 1.

  • MapReduce Overview
  • The Client – Server Processing Pattern
  • Distributed Computing Challenges
  • MapReduce Defined
  • Google's MapReduce
  • The Map Phase of MapReduce
  • The Reduce Phase of MapReduce
  • MapReduce Explained
  • MapReduce Word Count Job
  • MapReduce Shared-Nothing Architecture
  • Similarity with SQL Aggregation Operations
  • Example of Map & Reduce Operations using JavaScript
  • Problems Suitable for Solving with MapReduce
  • Typical MapReduce Jobs
  • Fault-tolerance of MapReduce
  • Distributed Computing Economics
  • MapReduce Systems
  • Summary

Chapter 2.

  • Hadoop Overview
  • Apache Hadoop
  • Apache Hadoop Logo
  • Typical Hadoop Applications
  • Hadoop Clusters
  • Hadoop Design Principles
  • Hadoop Versions
  • Hadoop's Main Components
  • Hadoop Simple Definition
  • Side-by-Side Comparison: Hadoop 1 and Hadoop 2
  • Hadoop-based Systems for Data Analysis
  • Other Hadoop Ecosystem Projects
  • Hadoop Caveats
  • Hadoop Distributions
  • Cloudera Distribution of Hadoop (CDH)
  • Cloudera Distributions
  • Hortonworks Data Platform (HDP)
  • MapR
  • Summary

Chapter 3.

  • Hadoop Distributed File System Overview
  • Hadoop Distributed File System (HDFS)
  • HDFS High Availability
  • HDFS 'Fine Print'
  • Storing Raw Data in HDFS
  • Hadoop Security
  • HDFS Rack-awareness
  • Data Blocks
  • Data Block Replication Example
  • HDFS NameNode Directory Diagram
  • Accessing HDFS
  • Examples of HDFS Commands
  • Other Supported File Systems
  • WebHDFS
  • Examples of WebHDFS Calls
  • Client Interactions with HDFS for the Read Operation
  • Read Operation Sequence Diagram
  • Client Interactions with HDFS for the Write Operation
  • Communication inside HDFS
  • Summary

Chapter 4.

  • Apache Pig Scripting Platform
  • What is Pig?
  • Pig Latin
  • Apache Pig Logo
  • Pig Execution Modes
  • Local Execution Mode
  • MapReduce Execution Mode
  • Running Pig
  • Running Pig in Batch Mode
  • What is Grunt?
  • Pig Latin Statements
  • Pig Programs
  • Pig Latin Script Example
  • SQL Equivalent
  • Differences between Pig and SQL
  • Statement Processing in Pig
  • Comments in Pig
  • Supported Simple Data Types
  • Supported Complex Data Types
  • Arrays
  • Defining Relation's Schema
  • Not Matching the Defined Schema
  • The bytearray Generic Type
  • Using Field Delimiters
  • Loading Data with TextLoader()
  • Referencing Fields in Relations
  • Summary

Chapter 5.

  • Apache Pig HDFS Interface
  • The HDFS Interface
  • FS Shell Commands (Short List)
  • Grunt's Old File System Commands
  • Summary

Chapter 6.

  • Apache Pig Relational and Eval Operators
  • Pig Relational Operators
  • Example of Using the JOIN Operator
  • Example of Using the Order By Operator
  • Caveats of Using Relational Operators
  • Pig Eval Functions
  • Caveats of Using Eval Functions (Operators)
  • Example of Using Single-column Eval Operations
  • Example of Using Eval Operators For Global Operations
  • Summary

Chapter 7.

  • Hive
  • What is Hive?
  • Apache Hive Logo
  • Hive's Value Proposition
  • Who uses Hive?
  • Hive's Main Sub-Systems
  • Hive Features
  • The 'Classic' Hive Architecture
  • The New Hive Architecture
  • HiveQL
  • Where are the Hive Tables Located?
  • Hive Command-line Interface (CLI)
  • The Beeline Command Shell
  • Summary

Chapter 8.

  • Hive Command-line Interface
  • Hive Command-line Interface (CLI)
  • The Hive Interactive Shell
  • Running Host OS Commands from the Hive Shell
  • Interfacing with HDFS from the Hive Shell
  • The Hive in Unattended Mode
  • The Hive CLI Integration with the OS Shell
  • Executing HiveQL Scripts
  • Comments in Hive Scripts
  • Variables and Properties in Hive CLI
  • Setting Properties in CLI
  • Example of Setting Properties in CLI
  • Hive Namespaces
  • Using the SET Command
  • Setting Properties in the Shell
  • Setting Properties for the New Shell Session
  • Setting Alternative Hive Execution Engines
  • The Beeline Shell
  • Connecting to the Hive Server in Beeline
  • Beeline Command Switches
  • Beeline Internal Commands
  • Summary

Chapter 9.

  • Hive Data Definition Language
  • Hive Data Definition Language
  • Creating Databases in Hive
  • Using Databases
  • Creating Tables in Hive
  • Supported Data Type Categories
  • Common Numeric Types
  • String and Date / Time Types
  • Miscellaneous Types
  • Example of the CREATE TABLE Statement
  • Working with Complex Types
  • Table Partitioning
  • Table Partitioning
  • Table Partitioning on Multiple Columns
  • Viewing Table Partitions
  • Row Format
  • Data Serializers / Deserializers
  • File Format Storage
  • File Compression
  • More on File Formats
  • The ORC Data Format
  • Converting Text to ORC Data Format
  • The EXTERNAL DDL Parameter
  • Example of Using EXTERNAL
  • Creating an Empty Table
  • Dropping a Table
  • Table / Partition(s) Truncation
  • Alter Table/Partition/Column
  • Views
  • Create View Statement
  • Why Use Views?
  • Restricting Amount of Viewable Data
  • Examples of Restricting Amount of Viewable Data
  • Creating and Dropping Indexes
  • Describing Data
  • Summary

Target Audience :

Managers, Business Analysts, and IT Architects.
