Share via


Classification Algorithms parameters in Azure ML

Introduction

In Azure machine learning studio when we create a classification ML experiment and when we click on visualize option of evaluate model there are many parameters  metrics information along with charts is displayed to check the accuracy of algorithm. Different types of metrics are available in evaluate model like ROC Graph, Precision, Recall, F1 Score, Lift, TP, TN, FP, FN, AUC, Accuracy of algorithm etc. but to understand each metrics is important to work with machine learning experiments.

In general True positive, true negative, false positive, false negative are pioneer parameters for any algorithms means correctly identified and rejected results.

When we see confusion matrix below: we  can easily calculate accuracy of algorithm by following equations

Precision and Recall

Precision and recall typically used in document retrieval.

Precision: how many of the returned documents are correct and Recall: how many of the positives does the model return

PRECISION = a / (a + c)

RECALL = a / (a + b)

ROC

Receiver Operator Characteristic - Developed in WWII to statistically model false positive and false negative detections of radar operators, and better statistical foundations than most other measures. ROC is standard measure in medicine and biology ROC is becoming more popular in ML also.


Properties of ROC

• ROC Area:

– 1.0: perfect prediction

– 0.9: excellent prediction

– 0.8: good prediction

– 0.7: mediocre prediction

– 0.6: poor prediction

– 0.5: random prediction

– <0.5: something wrong!

ROC Slope is non-increasing each point on ROC represents different tradeoff (cost ratio) between false positives and false negatives, Slope of line tangent to curve defines the cost ratio. ROC Area represents performance averaged over all possible cost ratios If two ROC curves do not intersect, one method dominates the other,  If two ROC curves intersect, one method is better for some cost ratios, and other method is better for other cost ratios.

To calculate Lift following is equation

F1 Score

F1 Score is the harmonic mean of precision and Recall.

F1 = 2TP / (2TP + FP + FN)

Where, TP=True Positive, TN=True Negative, FP=False Positive, FN=False Negative

Threshold - Threshold is the value above which it belongs to first class and all other values to the second class. E.g. if the threshold is 0.5 then any patient scored more than or equal to 0.5 is identified as sick else healthy.

See Also

Another important place to find an extensive amount of Cortana Intelligence Suite related articles is the TechNet Wiki itself. The best entry point is Cortana Intelligence Suite Resources on the TechNet Wiki.