

HDInsight - New self-paced trainings and labs on Hadoop, Hive, HBase, Spark & Storm

Cross-posted from https://blogs.msdn.microsoft.com/azuredatalake/2016/08/28/hdinsight-new-self-paced-trainings-and-labs/

This week Microsoft Learning Experiences released or updated three HDInsight courses. (The courses are free; a course certificate costs $49 if you need one.)


Processing Big Data with Azure HDInsight

Start course

More and more organizations are taking on the challenge of analyzing big data. This course teaches you how to use the Hadoop technologies in Microsoft Azure HDInsight to build batch processing solutions that cleanse and reshape data for analysis. In this five-week course, you’ll learn how to use technologies like Hive, Pig, Oozie, and Sqoop with Hadoop in HDInsight, and how to work with HDInsight clusters from Windows, Linux, and Mac OS X client computers.

Course Syllabus

Module 1: Getting Started with HDInsight
The course begins with an introduction to big data concepts and Hadoop, before examining Microsoft Azure HDInsight and the Hadoop distribution it provides. You’ll learn how to provision an HDInsight cluster, upload data to the cluster, and run MapReduce jobs that process the data.
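To give a feel for the MapReduce model mentioned in Module 1, here is a minimal, single-machine sketch of a word count in plain Python. It is not taken from the course material and does not use Hadoop or HDInsight at all; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. On a real cluster the same three phases would run distributed across many nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce step: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain map, shuffle, and reduce over an in-memory list of lines."""
    return reduce_phase(shuffle(map_phase(lines)))
```

For example, `word_count(["big data", "Big Hadoop data"])` groups the lower-cased words and sums the ones emitted for each.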

Module 2: Processing Big Data with Hive
The second week of the course is all about Hive. You’ll learn how to create Hive tables and use HiveQL to query them, before exploring some advanced Hive techniques like partitioning and indexing.

Module 3: Going Beyond Hive
In the third week of the course, you’ll learn how to use Pig to process big data, and how to extend the capabilities of Pig and Hive by using user-defined functions implemented in Python.
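As a rough idea of what a Python user-defined function for Hive can look like, the sketch below follows Hive's streaming convention: rows arrive on standard input as tab-separated lines, and the script writes tab-separated rows to standard output. The column layout (a name and a Fahrenheit temperature) and the function names are assumptions made up for this example, not something from the course.

```python
import sys
from typing import Iterable, Iterator, TextIO


def transform_rows(lines: Iterable[str]) -> Iterator[str]:
    """Convert each 'name<TAB>temp_f' row into 'NAME<TAB>temp_c'.

    The two-column layout is a hypothetical example; rows that do not
    match it, or whose temperature is not numeric, are skipped.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            continue  # unexpected number of columns
        name, temp_f = fields
        try:
            temp_c = (float(temp_f) - 32.0) * 5.0 / 9.0
        except ValueError:
            continue  # non-numeric temperature
        yield f"{name.strip().upper()}\t{temp_c:.1f}"


def main(stdin: TextIO = sys.stdin, stdout: TextIO = sys.stdout) -> None:
    """Stream rows from stdin to stdout, one output line per valid input row."""
    for row in transform_rows(stdin):
        stdout.write(row + "\n")

# A real streaming script would finish by calling main().
```

In HiveQL, a script like this would typically be invoked with a `SELECT TRANSFORM (...) USING 'python <script>' AS (...)` clause after the script file has been added to the session; the exact table, column, and file names depend on your own data.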

Module 4: Building a Big Data Workflow
Week four builds on the data processing techniques covered in previous weeks, and teaches you how to build an end-to-end big data processing workflow using Oozie and Sqoop.

Final Exam
The fifth week of the course is given over to the final exam. You must achieve a score of 50% or higher to pass this course and earn a certificate.

Implementing Real-Time Analysis with Hadoop in Azure HDInsight

Start course

In this four-week course, you’ll learn how to implement low-latency and streaming big data solutions using Hadoop technologies like HBase, Storm, and Spark on Microsoft Azure HDInsight.

Course Syllabus

Use HBase to implement low-latency NoSQL data stores.
Use Storm to implement real-time streaming analytics solutions.
Use Spark for high-performance interactive data analysis.

Implementing Predictive Solutions with Spark in Azure HDInsight

Start course

In this course, you’ll learn how to implement predictive analytics solutions for big data using Apache Spark in Microsoft Azure HDInsight. You will learn how to work with Scala or Python to cleanse and transform data, build machine learning models with Spark MLlib (the machine learning library in Spark), and create real-time machine learning solutions using Spark Streaming. You’ll also find out how to use R Server on Spark to work with data at scale in the R language.

Course Syllabus

Using Spark to work with data
Preprocessing data for machine learning in Spark
Building machine learning models in Spark
Using R at scale with R Server on Spark

Credit: Graeme Malcolm