Hadoop Professional Course

Hadoop On-Demand Training offers full-length courses on a range of Hadoop technologies for developers, data analysts and administrators. Designed in a format that meets your convenience, availability and flexibility needs, these courses will lead you on the path to becoming a certified Hadoop professional. Read our FAQ for more details.

Free 24x7x365 access. Take online Hadoop courses anytime, from anywhere in the world. Refresh your knowledge, whenever or wherever you want.
In-depth online Hadoop curriculum covers MapReduce, HBase, Hive, and Apache Drill. Learn with interactive labs and quizzes.
Get certified as a Hadoop expert. Add value to your current and future employers – and put your newly acquired skills into action right away.
Stay at the forefront of Hadoop & big data. Maintain your competitive edge in a data-hungry world.

HDE 100 - Hadoop Essentials

This introductory-level course covers big data, Hadoop, and the Hadoop ecosystem of products. It defines big data, details the Hadoop core components, and presents several common Hadoop use cases: enterprise data hub, large-scale log analysis, and building recommendation engines.

HDE 110 - MapR Distribution Essentials

This course is an introduction to the features of the MapR Distribution including Hadoop. Topics include the basic architectural components of the MapR file system (MapR-FS), and information on how this architecture overcomes the limitations of the Hadoop Distributed File System (HDFS). You will also learn the basics of designing MapR-DB tables and how to migrate between HBase and MapR-DB tables. At the end of the course, you will be able to describe the components of MapR-FS, compare and contrast HDFS with MapR-FS, and describe the architectural advantages of MapR-DB.

ADM 200 - Cluster Administration: Install a MapR Cluster

This is the first course in the Cluster Administration curriculum. This course covers pre-installation testing and verification, installing a MapR cluster, and performing post-installation benchmarking.

ADM 201 - Cluster Administration: Configure a MapR Cluster

ADM 201 is the second course in the Cluster Administration curriculum. This course covers how to configure the cluster’s storage resources once the cluster has been installed.

ADM 202 - Cluster Administration: Data Access and Protection

ADM 202 is the third course in the Cluster Administration curriculum. This course defines methods for data ingestion, and covers the use of snapshots and mirrors.

ADM 203 - Cluster Administration: Cluster Maintenance

This is the fourth and final course in the Cluster Administration curriculum. This course teaches you how to configure cluster settings, monitor the cluster, resolve issues, and optimize cluster performance.

DEV 301 - Developing Hadoop Applications

Through lectures and hands-on lab exercises, this course teaches developers how to write Hadoop applications using MapReduce and YARN in Java. The course extensively covers MapReduce programming, debugging, managing jobs, improving performance, working with custom data, managing workflows, and using other programming languages for MapReduce.

DEV 320 - HBase Data Model and Architecture

This course is intended for data analysts, data architects and application developers. DEV 320 provides you with a thorough understanding of the HBase data model and architecture, which is required before going on to designing HBase schemas and developing HBase applications.

DEV 325 - HBase Schema Design

Targeted towards data analysts, data architects and application developers, the goal of this course is to enable you to design HBase schemas based on design guidelines. You will learn about the various elements of schema design and how to design for data access patterns. The course offers an in-depth look at designing row keys, avoiding hot-spotting and designing column families. It discusses how to transition from a relational model to an HBase model. You will learn the differences between tall tables and wide tables. Concepts are conveyed through lectures, hands-on labs and analysis of scenarios.

DEV 330 - Developing HBase Applications: Basics

Targeted towards data architects and application developers who have experience with Java, this course teaches you how to write HBase programs using HBase as a distributed NoSQL datastore.

DEV 335 - Developing HBase Applications: Advanced

Targeted towards data architects and application developers who have experience with Java, the goal of this series of courses is to teach you how to write HBase programs using HBase as a distributed NoSQL datastore. This course builds on DEV 320 - HBase Data Model and Architecture and DEV 325 - HBase Schema Design, and continues DEV 330 - Developing HBase Applications: Basics.

DEV 340 - Apache HBase Applications: Bulk Loading, Performance & Security

Targeted towards data analysts, data architects, and application developers, the goal of this course is to learn more about architecting your Apache HBase applications for performance and security. This course covers how to bulk load data into HBase, performance considerations and tips for designing your HBase application, benchmarking and monitoring your HBase application, and MapR-DB security. Concepts are conveyed through lectures, hands-on labs, and scenario analyses.
DEV 350 - MapR Streams Essentials

This introductory-level course teaches the core concepts necessary to understand and begin using MapR Streams to develop big data processing applications.
DEV 351 - Developing MapR Streams Applications

This course is targeted towards developers and administrators to give them the core concepts necessary to build simple MapR Streams applications.
DEV 360 - Apache Spark Essentials

This introductory course enables developers to get started developing big data applications with Apache Spark. In the first part of the course, you will use Spark’s interactive shell to load and inspect data. The course then describes the various modes for launching a Spark application. You will then go on to build and launch a standalone Spark application.

DEV 361 - Build and Monitor Apache Spark Applications

This course is the second in the Apache Spark series. You will learn to create and modify pair RDDs, perform aggregations, and control the layout of pair RDDs across nodes with data partitioning. The course also discusses Spark SQL and DataFrames, the programming abstraction of Spark SQL, and describes the components of the Spark execution model and how to use the Spark Web UI to monitor Spark applications.

DEV 362 - Create Data Pipeline Applications Using Apache Spark

This course is the third in the Apache Spark series. It covers the following Apache Spark libraries: Spark Streaming, Spark SQL, Spark MLlib, and Spark GraphX. It describes the benefits of the Apache Spark unified platform and how to build a data pipeline application using these libraries. The concepts are taught using scenarios in Scala that also form the basis of hands-on labs.
DA 410 - Apache Drill Essentials

This introductory Apache Drill course, targeted at data analysts, data scientists, and SQL programmers, covers how to use Drill to explore known or unknown data without writing code. You will write SQL queries on a variety of data types, including structured data in a Hive table, semi-structured data in HBase or MapR-DB, and complex data file types such as Parquet and JSON.

DA 415 - Apache Drill Architecture

DA 415 is an intermediate level course designed for data analysts, developers, and systems administrators. It is a continuation of DA 410 - Apache Drill Essentials, and describes how a query is received and executed by Drill. You will learn the different services involved at each step, and how Drill optimizes a query for distributed SQL execution.

DA 440 - Apache Hive Essentials

DA 440 is an introductory-level course designed for data analysts and developers. You will learn how Apache Hive fits in the Hadoop ecosystem, how to create and load tables in Hive, and how to query data using the Hive Query Language.

DA 450 - Apache Pig Essentials

DA 450 - Apache Pig Essentials is an introductory-level course designed for data analysts and developers. The course begins with a review of data pipeline tools, then covers how to load and manipulate relations in Pig.

MCHA - MapR Certified Hadoop Administrator

CERTIFICATION EXAM
This certification exam measures technical knowledge, skill, and ability to configure, deploy, maintain, and secure a Hadoop cluster. This exam covers the architecture of a Hadoop cluster, planning and preparing the nodes, data ingestion, disaster recovery, availability, management and monitoring.

MCHBD - MapR Certified HBase Developer

CERTIFICATION EXAM
This certification exam measures and validates the technical knowledge, skills and abilities required to write HBase programs using HBase as a distributed NoSQL datastore. This exam covers HBase architecture, the HBase data model, APIs, schema design, performance tuning, bulk-loading of data, and storing complex data structures.

MCHD - MapR Certified Hadoop Developer

CERTIFICATION EXAM
This certification exam measures the specific technical knowledge, skills and abilities required to design and develop MapReduce programs in Java. This exam covers writing MapReduce programs, using the MapReduce API, and managing, monitoring and testing MapReduce programs and workflows.

MCSD - MapR Certified Spark Developer

CERTIFICATION EXAM
The MapR Certified Spark Developer credential is designed for Engineers, Programmers, and Developers who prepare and process large amounts of data using Spark. The certification tests your ability to use Spark in a production environment; where coding knowledge is tested, we lean toward the use of Scala for our code samples.