Loading, Saving and Serving Models

Persisting Models

Trainers, transforms and pipelines can be persisted in two ways: with Python’s built-in pickle module, or with the load_model() and save_model() methods of nimbusml.Pipeline.

The advantage of pickle is that all attribute values of an object are preserved and can be inspected after deserialization. However, models trained by external tools such as an ML.NET C# application cannot be loaded with pickle; use the load_model() method instead. Similarly, the save_model() method saves the model in a format that external applications can use.

Using Pickle

Below is an example using pickle.

Example
import pickle
from nimbusml import Pipeline, FileDataStream
from nimbusml.linear_model import AveragedPerceptronBinaryClassifier
from nimbusml.datasets import get_dataset

data_file = get_dataset('infert').as_filepath()

ds = FileDataStream.read_csv(data_file)
ds.schema.rename('case', 'case2') # column name case is not allowed in C#
# Train a model and score
pipeline = Pipeline([AveragedPerceptronBinaryClassifier(
    feature=['age', 'parity', 'spontaneous'], label='case2')])

metrics, scores = pipeline.fit(ds).test(ds, output_scores=True)
print(metrics)

# Serialize the fitted pipeline with pickle, restore it, and evaluate.
# Note that 'evaltype' must be specified explicitly
s = pickle.dumps(pipeline)
pipe2 = pickle.loads(s)
metrics2, scores2 = pipe2.test(ds, evaltype='binary', output_scores=True)
print(metrics2)

Output:

Automatically adding a MinMax normalization transform, use 'norm=Warn' or 'norm=No' to turn this behavior off.
Training calibrator.
Elapsed time: 00:00:00.5800875
        AUC  Accuracy  Positive precision  Positive recall  Negative precision  Negative recall  Log-loss  Log-loss reduction  Test-set entropy (prior Log-Loss/instance)  F1 Score     AUPRC
0  0.705038   0.71371                 0.7         0.253012            0.715596         0.945455  0.814956            0.113826                                    0.919634  0.371681  0.572031
        AUC  Accuracy  Positive precision  Positive recall  Negative precision  Negative recall  Log-loss  Log-loss reduction  Test-set entropy (prior Log-Loss/instance)  F1 Score     AUPRC
0  0.705038   0.71371                 0.7         0.253012            0.715596         0.945455  0.814956            0.113826                                    0.919634  0.371681  0.572031
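
The example above round-trips the pipeline in memory with pickle.dumps() and pickle.loads(). To keep a pickled object between sessions, it can instead be written to a file with pickle.dump() and read back with pickle.load(). The following is a minimal sketch using only the standard library. The helper names save_pickle and load_pickle, the file name, and the dictionary standing in for a fitted pipeline are illustrative assumptions, not part of nimbusml.

```python
import os
import pickle
import tempfile


def save_pickle(obj, path):
    """Serialize obj to path using the highest pickle protocol available."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle(path):
    """Deserialize and return the object stored at path."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a fitted pipeline (hypothetical); any picklable object works.
stand_in = {"features": ["age", "parity", "spontaneous"], "label": "case2"}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "pipeline.pkl")
    save_pickle(stand_in, model_path)
    restored = load_pickle(model_path)

print(restored == stand_in)
```

With a real pipeline, the fitted Pipeline object would be passed in place of the dictionary. Only unpickle files from trusted sources, since unpickling can execute arbitrary code.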

Using load_model() and save_model()

Below is an example of using load_model() and save_model(). The model can also originate from external tools such as an ML.NET C# application or the Maml.exe command-line tool. When a model is loaded this way, the ‘evaltype’ argument must be specified explicitly.

Example
from nimbusml import Pipeline, FileDataStream
from nimbusml.linear_model import AveragedPerceptronBinaryClassifier
from nimbusml.datasets import get_dataset

data_file = get_dataset('infert').as_filepath()
ds = FileDataStream.read_csv(data_file)
ds.schema.rename('case', 'case2') # column name case is not allowed in C#

# Train a model and score
pipeline = Pipeline([AveragedPerceptronBinaryClassifier(
    feature=['age', 'parity', 'spontaneous'], label='case2')])

metrics, scores = pipeline.fit(ds).test(ds, output_scores=True)
pipeline.save_model("mymodeluci.zip")
print(metrics)

# Load model from file and evaluate. Note that 'evaltype'
# must be specified explicitly
pipeline2 = Pipeline()
pipeline2.load_model("mymodeluci.zip")
metrics2, scores2 = pipeline2.test(ds, y='case2', evaltype='binary')
print(metrics2)

Output:

Automatically adding a MinMax normalization transform, use 'norm=Warn' or 'norm=No' to turn this behavior off.
Training calibrator.
Elapsed time: 00:00:00.1367380
        AUC  Accuracy  Positive precision  Positive recall  Negative precision  Negative recall  Log-loss  Log-loss reduction  Test-set entropy (prior Log-Loss/instance)  F1 Score     AUPRC
0  0.705038   0.71371                 0.7         0.253012            0.715596         0.945455  0.814956            0.113826                                    0.919634  0.371681  0.572031
        AUC  Accuracy  Positive precision  Positive recall  Negative precision  Negative recall  Log-loss  Log-loss reduction  Test-set entropy (prior Log-Loss/instance)  F1 Score     AUPRC
0  0.705038   0.71371                 0.7         0.253012            0.715596         0.945455  0.814956            0.113826                                    0.919634  0.371681  0.572031
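
In both examples the restored model reports the same metrics as the original. One way to check this programmatically is to compare the two metric rows within a tolerance. The sketch below uses plain dictionaries holding a few values copied from the output above; the helper name metrics_match and the tolerance are assumptions. Assuming the returned metrics are pandas DataFrames, as the printed output suggests, each row could first be converted with something like metrics.iloc[0].to_dict().

```python
import math


def metrics_match(first, second, rel_tol=1e-6):
    """Return True if both mappings have the same keys and close values."""
    if first.keys() != second.keys():
        return False
    return all(
        math.isclose(first[name], second[name], rel_tol=rel_tol)
        for name in first
    )


# A few values copied from the printed output above.
original = {"AUC": 0.705038, "Accuracy": 0.71371, "F1 Score": 0.371681}
reloaded = {"AUC": 0.705038, "Accuracy": 0.71371, "F1 Score": 0.371681}

print(metrics_match(original, reloaded))
```

Comparing with a tolerance rather than strict equality avoids false mismatches caused by floating-point rounding.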

Scoring in ML.NET

The saved model (‘mymodeluci.zip’) can be used for scoring in ML.NET using the following code:

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

public static void Score()
{
    var modelPath = "mymodeluci.zip";
    var mlContext = new MLContext();
    var loadedModel = mlContext.Model.Load(modelPath, out DataViewSchema inputSchema);

    var example = new List<InfertData>()
    {
        new InfertData()
        {
            age = 26,
            parity = 6,
            spontaneous = 2
        }
    };
    // load data into IDataView
    var loadedData = mlContext.Data.LoadFromEnumerable(example);

    var predictionDataView = loadedModel.Transform(loadedData);
    // convert IDataView predictions to IEnumerable
    var prediction = mlContext.Data
        .CreateEnumerable<InfertPrediction>(predictionDataView,
        reuseRowObject: false).ToList();

    foreach (var p in prediction)
    {
        Console.WriteLine($"PredictedLabel: {p.PredictedLabel}, " +
        $"Probability: {p.Probability}, Score: {p.Score}");
    }
}

public class InfertData
{
    public int age { get; set; }

    public int parity { get; set; }

    public int spontaneous { get; set; }
}

public class InfertPrediction
{
    public bool PredictedLabel { get; set; }

    public float Probability { get; set; }

    public float Score { get; set; }
}