How to Use the Machine Learning Service in a Windows Azure Account and Run a Linear Regression Model on the Azure Cloud

Hi Everyone,

This blog is beginner-level training on using the Machine Learning cloud service provided by Microsoft. It runs a linear regression using data from a table in an Azure SQL Database; the sample database used is “AdventureWorks2012”, which you can download from https://msftdbprodsamples.codeplex.com/releases/view/37304

The concept is to use a data mining algorithm on the Azure cloud against a table in the AdventureWorks2012 Azure database: train the model, score the model, evaluate it, and use it to predict the value for a given input.

For simplicity I am using only one table, the SalesHeader table (Sales.SalesOrderHeader). The order dates present run from 1st July 2005 to 31st July 2008. The concept is to use these order dates, along with the order amounts, in a linear regression model and predict the possible order amount for 1st Aug 2008. This is not a time series model; it is a training exercise in using a linear regression model.

For any regression model to work, we need to split the data into training data and testing data. The model is trained with the training data, from which it creates a linear equation. That equation is then scored and evaluated over the testing data, and the resulting model is used to predict the next item (in our case the future date 1st Aug 2008).
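To make the idea concrete before moving to the cloud, here is a minimal Python sketch of the same flow: split, train, score, predict. It is only an illustration under my own assumptions: the data is synthetic (not AdventureWorks), the helper names `fit_line` and `predict` are made up for this post, and plain least squares stands in for whatever the Azure module does internally.

```python
# Illustrative sketch only: the (day, amount) pairs are synthetic, not AdventureWorks data.

def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the fitted linear equation to a new input."""
    slope, intercept = model
    return slope * x + intercept

# Synthetic, exactly linear data: amount = 50 * day + 1000
data = [(day, 50 * day + 1000) for day in range(1, 21)]

# Hold out every fourth row for testing (75% train, 25% test).
train = [p for i, p in enumerate(data) if i % 4 != 3]
test = [p for i, p in enumerate(data) if i % 4 == 3]

model = fit_line(train)                                      # "Train Model"
test_errors = [abs(predict(model, x) - y) for x, y in test]  # "Score" and "Evaluate"
next_day_amount = predict(model, 21)                         # prediction for the next day
```

Because the synthetic data lies exactly on a line, the test errors here are essentially zero; real order amounts will of course be much noisier.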

It is assumed that the user has an Azure account to run the Machine Learning experiment. Follow the steps below to run the linear regression model on the cloud. Log in with your Live account at https://manage.windowsazure.com/ and, after logging in, click New at the bottom left corner.

Navigate to Data Services -> Machine Learning (Preview) -> Quick Create

Create a new workspace with a name, give your Live account ID as the Workspace Owner, and provide a storage account name (make sure the storage account name is a combination of only lowercase letters and numbers).

In my case, the workspace name is "Tutorials" and the storage account is “SampleTutorialstorage”.

This action creates a new ML workspace.

Now click Machine Learning, go to the ML workspace you created (in this case “Tutorials”), and click “Sign-in to ML Studio”.

After signing in, click “NEW” at the left corner of the window and create a new experiment.

To start the experiment we need a data source. In Machine Learning we can use the Reader module from Data Input.

Type “Reader” in the search text box and drag the Reader icon into the workspace.

Select the Reader and change its properties on the right side of the browser to the following values:
Please specify data source: SqlAzure
Database server name: Your SQL Azure Database
Database name: AdventureWorks2012
Server user account name: Your Azure SQL User Account
Server user Account Password: Your Azure SQL Password
Type the following in the Database Query:
SELECT DATEDIFF(dd, '2005-06-30', OrderDate) AS DiffDate, OrderAmt
FROM (
    SELECT CONVERT(date, OrderDate) AS OrderDate, CONVERT(int, SUM(TotalDue)) AS OrderAmt
    FROM Sales.SalesOrderHeader
    GROUP BY OrderDate
) A
ORDER BY A.OrderDate
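If the query is hard to read, the following Python sketch mirrors what it computes: one row per order date, with the day offset from 30 June 2005 and the summed order total as a whole number. The sample rows are made up for illustration, the function name `daily_totals` is my own, and `round()` is only an approximation of how SQL Server converts the sum to int. The sketch groups by calendar date, which matches the query as long as the order dates carry no time of day.

```python
from collections import defaultdict
from datetime import date, datetime

BASE_DATE = date(2005, 6, 30)  # same anchor date as in the query

def daily_totals(orders):
    """Group (order_datetime, total_due) rows by calendar date, mirroring the query."""
    sums = defaultdict(float)
    for order_dt, total_due in orders:
        sums[order_dt.date()] += total_due
    rows = []
    for order_date in sorted(sums):                # ORDER BY OrderDate
        diff_date = (order_date - BASE_DATE).days  # DATEDIFF(dd, '2005-06-30', OrderDate)
        order_amt = round(sums[order_date])        # stand-in for CONVERT(int, SUM(TotalDue))
        rows.append((diff_date, order_amt))
    return rows

# Made-up order rows, for illustration only
sample_orders = [
    (datetime(2005, 7, 1), 100.25),
    (datetime(2005, 7, 1), 200.50),
    (datetime(2005, 7, 2), 50.00),
]
rows = daily_totals(sample_orders)
```

With these sample rows, the two orders on 1st July 2005 collapse into a single row with DiffDate 1, and 2nd July 2005 becomes DiffDate 2.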

Now search for the Split icon and drag it into the workspace.

Now link the Reader to the Split and change the property “Fraction of Rows in the first Output” to "0.75". This splits the rows so that 75% go into the training set and 25% into the testing set.
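As a rough picture of what a 0.75 split does, here is a small Python sketch. It assumes the rows are shuffled before being cut, which may or may not match how the Split module is configured in your experiment; `split_rows` and its `seed` parameter are hypothetical names used only for this illustration.

```python
import random

def split_rows(rows, fraction, seed=0):
    """Return (first, second), where `first` holds roughly `fraction` of the rows."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # assumption: rows are shuffled before the cut
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]

rows = list(range(1, 101))                  # stand-in for 100 rows from the Reader
training, testing = split_rows(rows, 0.75)  # first output: training, second output: testing
```

Every row ends up in exactly one of the two outputs, so the model is later evaluated on rows it never saw during training.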

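Now search for the Linear Regression icon in the search text box and drag it into the experiment workspace.
Now search for the Train Model icon in the search text box and drag it into the experiment workspace. Connect the Linear Regression output to the left input of the Train Model and the first output of the Split to the right input of the Train Model.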


You will notice an error indicator on the Train Model; this is because the model has not been told which column is to be predicted.
Open the Train Model properties, click Launch Column Selector, type “OrderAmt” in the include single column section, and click the tick mark. OrderAmt is the column header specified in the database query.

Now search for Score Model in the search text box and add it to the experiment. Join the Train Model output to the left input of the Score Model, and join the second output of the Split to the right input of the Score Model.

Now search for Evaluate Model in the search text box and add it to the experiment. Join the Score Model output to the input of the Evaluate Model.
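To give a feel for what evaluating a regression model means, here is a Python sketch of three common error measures: mean absolute error, root mean squared error, and the coefficient of determination. The numbers are made up, `evaluate` is my own helper name, and I am assuming these are representative of the kind of metrics Evaluate Model reports rather than an exact copy of its output.

```python
import math

def evaluate(actual, predicted):
    """Compare predicted values against actual values from the testing set."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n             # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # root mean squared error
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(e * e for e in errors)
    r2 = 1 - ss_res / ss_tot                          # coefficient of determination
    return {"mae": mae, "rmse": rmse, "r2": r2}

# Made-up actual and scored values, for illustration only
actual = [100, 200, 300, 400]
predicted = [110, 190, 310, 390]
metrics = evaluate(actual, predicted)
```

Lower MAE and RMSE mean the scored values sit closer to the real order amounts, and a coefficient of determination close to 1 means the line explains most of the variation.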

On the bottom panel, click “SAVE AS” and rename the experiment to “Sample Linear Regression Experiment”. After saving the experiment, click Run to execute it.

After successful execution, click the right input connector of the Score Model and choose “Set as Publish Input”.

Now click the output of the Evaluate Model and choose “Set as Publish Output”.

This lets the published model take an input and return the predicted value for it.

The order dates in the Sales Header table run from 1st July 2005 to 31st July 2008, so the DiffDate column (which acts as a row ID in this case) takes the values 1 to 1127. To get the predicted value for 1st Aug 2008, we therefore have to input 1128.
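The day numbers can be double-checked with a few lines of Python, using the same anchor date (30th June 2005) as the database query. The function name `diff_date` is just my label for the calculation.

```python
from datetime import date

BASE_DATE = date(2005, 6, 30)  # anchor date used in the database query

def diff_date(d):
    """Number of days between the anchor date and d."""
    return (d - BASE_DATE).days

first = diff_date(date(2005, 7, 1))    # first order date
last = diff_date(date(2008, 7, 31))    # last order date
target = diff_date(date(2008, 8, 1))   # the date we want to predict
```

The range includes 29th February 2008, which is why three years plus one month comes to 1127 days rather than 1126.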

Now click Run on the bottom panel.

After successful execution, click “Publish Web Service”.
When asked "Would you like to publish the web service 'Sample Linear Regression Experiment'?", click “Yes”.
After the web service is published, the Azure ML application moves to the Web Services page.

Click Test.
Enter the differential date (DIFFDATE) as 1128 and OrderAmt as 0, then click the tick option. OrderAmt is the value being predicted, so the 0 is only a placeholder.
After execution, click the execution status details at the right corner, then click Details.
You can now see the result: for the DiffDate value 1128, the model returns the predicted order amount.

This post is for a beginner to run a simple linear regression model with a single input and use it to predict an output.

Thanks to one and all for going through my post.

Comments

  • Anonymous
    September 15, 2014
    Great info!! Thanks Kartheek