AveragedPerceptronBinaryClassifier Class
Machine Learning Averaged Perceptron Binary Classifier
Inheritance
- nimbusml.internal.core.linear_model._averagedperceptronbinaryclassifier.AveragedPerceptronBinaryClassifier -> AveragedPerceptronBinaryClassifier
- nimbusml.base_predictor.BasePredictor -> AveragedPerceptronBinaryClassifier
- sklearn.base.ClassifierMixin -> AveragedPerceptronBinaryClassifier
Constructor
AveragedPerceptronBinaryClassifier(normalize='Auto', caching='Auto', loss='hinge', learning_rate=1.0, decrease_learning_rate=False, l2_regularization=0.0, number_of_iterations=1, initial_weights_diameter=0.0, reset_weights_after_x_examples=None, lazy_update=True, recency_gain=0.0, recency_gain_multiplicative=False, averaged=True, averaged_tolerance=0.01, initial_weights=None, shuffle=True, feature=None, label=None, **params)
Parameters
Name | Description |
---|---|
feature | See Columns. |
label | See Columns. |
normalize | Specifies the type of automatic normalization used. Normalization rescales disparate data ranges to a standard scale. Feature scaling ensures that the distances between data points are proportional and enables various optimization methods, such as gradient descent, to converge much faster. If normalization is performed, a |
caching | Whether the trainer should cache the input training data. |
loss | The default is Hinge. Other choices are Exp, Log, and SmoothedHinge. For more information, see the documentation page about losses, Loss. |
learning_rate | Determines the size of the step taken in the direction of the gradient in each step of the learning process. This determines how fast or slow the learner converges on the optimal solution. If the step size is too big, you might overshoot the optimal solution. If the step size is too small, training takes longer to converge to the best solution. |
decrease_learning_rate | Whether to decrease the learning rate. |
l2_regularization | L2 regularization weight. |
number_of_iterations | Number of iterations. |
initial_weights_diameter | Sets the initial weights diameter, which specifies the range from which values are drawn for the initial weights. These weights are initialized randomly from within this range. For example, if the diameter is specified to be |
reset_weights_after_x_examples | Number of examples after which weights will be reset to the current average. |
lazy_update | Instead of updating averaged weights on every example, only update when loss is nonzero. |
recency_gain | Extra weight given to more recent updates. |
recency_gain_multiplicative | Whether the recency gain is multiplicative (vs. additive). |
averaged | Whether to do averaging. |
averaged_tolerance | The inexactness tolerance for averaging. |
initial_weights | Initial weights and bias, comma-separated. |
shuffle | Whether to shuffle for each training iteration. |
params | Additional arguments sent to the compute engine. |
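The step-size trade-off described for learning_rate can be illustrated with a minimal sketch. This is not NimbusML code: it runs plain gradient descent on the toy objective x**2, and the function name gradient_descent and the three rates are assumptions chosen only for illustration.

```python
def gradient_descent(learning_rate, start=1.0, steps=20):
    """Minimize the toy objective x**2 (optimum at x = 0); hypothetical helper."""
    x = start
    for _ in range(steps):
        # The gradient of x**2 is 2*x; step against it, scaled by the rate.
        x -= learning_rate * 2.0 * x
    return x


too_small = gradient_descent(0.01)  # still far from 0 after 20 steps
moderate = gradient_descent(0.4)    # very close to 0 after 20 steps
too_large = gradient_descent(1.1)   # each step overshoots, so |x| grows

print(too_small, moderate, too_large)
```

With the small rate the iterate shrinks by only 2% per step, while with the large rate each step jumps past the optimum and lands farther away than it started.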
Examples
###############################################################################
# AveragedPerceptronBinaryClassifier
from nimbusml import Pipeline, FileDataStream
from nimbusml.datasets import get_dataset
from nimbusml.linear_model import AveragedPerceptronBinaryClassifier
# data input (as a FileDataStream)
path = get_dataset('infert').as_filepath()
data = FileDataStream.read_csv(path)
print(data.head())
#    age  case education  induced  parity  ...  row_num  spontaneous  ...
# 0   26     1    0-5yrs        1       6  ...        1            2  ...
# 1   42     1    0-5yrs        1       1  ...        2            0  ...
# 2   39     1    0-5yrs        2       6  ...        3            0  ...
# 3   34     1    0-5yrs        2       4  ...        4            0  ...
# 4   35     1   6-11yrs        1       3  ...        5            1  ...
# define the training pipeline
pipeline = Pipeline([AveragedPerceptronBinaryClassifier(
    feature=['age', 'parity', 'spontaneous'], label='case')])
# train, predict, and evaluate
metrics, predictions = pipeline.fit(data).test(data, output_scores=True)
# print predictions
print(predictions.head())
#    PredictedLabel     Score
# 0               0 -0.285667
# 1               0 -1.304729
# 2               0 -2.651955
# 3               0 -2.111450
# 4               0 -0.660658
# print evaluation metrics
print(metrics)
#         AUC  Accuracy  Positive precision  Positive recall  ...
# 0  0.705038   0.71371                 0.7         0.253012  ...
Remarks
Perceptron is a classification algorithm that makes its predictions based on a linear function. That is, for an instance with feature values f_0, f_1, ..., f_D-1, the prediction is given by the sign of the sum sigma[0, D-1] (w_i * f_i), where w_0, w_1, ..., w_D-1 are the weights computed by the algorithm.
Perceptron is an online algorithm, i.e., it processes the instances in the training set one at a time. The weights are initialized to 0 or to some random values. Then, for each example in the training set, the value of sigma[0, D-1] (w_i * f_i) is computed. If this value has the same sign as the label of the current example, the weights remain the same. If they have opposite signs, the weight vector is updated by either subtracting or adding (if the label is negative or positive, respectively) the feature vector of the current example, multiplied by a factor 0 < a <= 1, called the learning rate. In a generalization of this algorithm, the weights are updated by adding the feature vector multiplied by the learning rate and by the gradient of some loss function (in the specific case described above, the loss is hinge loss, whose gradient is 1 when it is non-zero).
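The online update described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the NimbusML implementation: the names dot and train_perceptron and the toy data are hypothetical, labels are assumed to be encoded as -1 or +1, a constant last feature stands in for a bias term, and a score of exactly zero is treated as a mistake so that training can start from all-zero weights.

```python
def dot(w, f):
    """Linear score: the sum over i of w_i * f_i."""
    return sum(w_i * f_i for w_i, f_i in zip(w, f))


def train_perceptron(examples, learning_rate=1.0, iterations=10):
    """Online perceptron over (features, label) pairs with labels in {-1, +1}."""
    dim = len(examples[0][0])
    w = [0.0] * dim  # weights start at zero
    for _ in range(iterations):
        for f, y in examples:
            # Mistake: score and label disagree in sign (zero counts as a mistake).
            if y * dot(w, f) <= 0:
                # Add the feature vector for a positive label, subtract it
                # for a negative label, scaled by the learning rate.
                w = [w_i + learning_rate * y * f_i for w_i, f_i in zip(w, f)]
    return w


# Toy, linearly separable data; the last feature is a constant 1 (bias).
toy = [([2.0, 1.0, 1.0], 1), ([1.0, 3.0, 1.0], 1),
       ([-1.0, -2.0, 1.0], -1), ([-2.0, -1.0, 1.0], -1)]
weights = train_perceptron(toy)
predictions = [1 if dot(weights, f) > 0 else -1 for f, _ in toy]
print(weights, predictions)
```

On this toy set a single update on the first example is enough to classify all four points correctly, so the weights stop changing after the first pass.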
In Averaged Perceptron (also known as voted perceptron), each weight vector is stored together with a weight that counts the number of iterations it survived (this is equivalent to storing the weight vector after every iteration, regardless of whether it was updated or not). The prediction is then calculated by taking the weighted average of all the sums sigma[0, D-1] (w_i * f_i) of the different weight vectors.
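The averaging scheme can be sketched as follows. This is a simplified illustration under stated assumptions, not the NimbusML trainer: the names dot, train_averaged_perceptron, and averaged_score are hypothetical, labels are assumed to be -1 or +1, a zero score counts as a mistake, and options such as lazy_update, recency_gain, and averaged_tolerance are ignored.

```python
def dot(w, f):
    """Linear score: the sum over i of w_i * f_i."""
    return sum(w_i * f_i for w_i, f_i in zip(w, f))


def train_averaged_perceptron(examples, learning_rate=1.0, iterations=2):
    """Return (weight_vector, survival_count) pairs, one per distinct vector."""
    dim = len(examples[0][0])
    w = [0.0] * dim
    survived = 0  # number of examples after which w was the current vector
    history = []
    for _ in range(iterations):
        for f, y in examples:
            if y * dot(w, f) <= 0:
                # Retire the outgoing vector with its survival count.
                if survived > 0:
                    history.append((w, survived))
                w = [w_i + learning_rate * y * f_i for w_i, f_i in zip(w, f)]
                survived = 1
            else:
                survived += 1
    history.append((w, survived))
    return history


def averaged_score(history, f):
    """Weighted average of the linear scores of all stored weight vectors."""
    total = sum(count for _, count in history)
    return sum(count * dot(w, f) for w, count in history) / total


# Toy data; the last feature is a constant 1 (bias).
toy = [([1.0, 0.0, 1.0], 1), ([0.0, 1.0, 1.0], -1), ([2.0, 1.0, 1.0], 1)]
history = train_averaged_perceptron(toy)
predictions = [1 if averaged_score(history, f) > 0 else -1 for f, _ in toy]
print(history, predictions)
```

Storing each vector once with a count, rather than one copy per example, gives the same weighted average while keeping the history short when updates are rare.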
Reference
Wikipedia entry for Perceptron
Large Margin Classification Using the Perceptron Algorithm
Discriminative Training Methods for Hidden Markov Models
Methods
decision_function | Returns score values |
get_params | Get the parameters for this operator. |
predict_proba | Returns probabilities |
decision_function
Returns score values
decision_function(X, **params)
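As a rough illustration of how score values relate to predicted labels, the sketch below thresholds the five scores printed in the Examples section at zero, following the sign rule from the Remarks. The variable names are hypothetical, and the mapping of non-positive scores to label 0 is an assumption inferred from that example output rather than documented behavior.

```python
# Scores copied from the example output in the Examples section.
scores = [-0.285667, -1.304729, -2.651955, -2.111450, -0.660658]

# Assumed rule: a positive score maps to label 1, anything else to label 0.
predicted_labels = [1 if s > 0 else 0 for s in scores]
print(predicted_labels)
```

All five scores are negative, so every label comes out as 0, which agrees with the PredictedLabel column shown in the example.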
get_params
Get the parameters for this operator.
get_params(deep=False)
Parameters
Name | Description |
---|---|
deep | Default value: False |
predict_proba
Returns probabilities
predict_proba(X, **params)