Tutorial: Create a Windows Machine Learning Desktop application (C++)

Article
07/11/2023

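The Windows ML APIs make it easy to interact with machine learning models from C++ desktop (Win32) applications. With three steps (load, bind, and evaluate), your application can take advantage of machine learning.

Load -> Bind -> Evaluate

We'll create a simplified version of the SqueezeNet Object Detection sample, which is available on GitHub. If you'd like to see what it looks like when finished, you can download the complete sample.

We'll use C++/WinRT to access the WinML APIs. For more information, see C++/WinRT.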

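In this tutorial, you'll learn how to:

Load a machine learning model
Load an image as a VideoFrame
Bind the model's inputs and outputs
Evaluate the model and print meaningful results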

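Prerequisites

Visual Studio 2019 (or Visual Studio 2017, version 15.7.4 or later)
Windows 10, version 1809 or later
Windows SDK, build 17763 or later
Visual Studio extension for C++/WinRT
1. In Visual Studio, select Tools > Extensions and Updates.
2. Select Online in the left pane, and search for "WinRT" using the search box on the right.
3. Select C++/WinRT, click Download, and close Visual Studio.
4. Follow the installation instructions, then re-open Visual Studio.
Windows-Machine-Learning GitHub repo (you can either download it as a ZIP file or clone it to your machine)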

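Create the project

First, we'll create the project in Visual Studio:

Select File > New > Project to open the New Project window.
In the left pane, select Installed > Visual C++ > Windows Desktop, and in the middle, select Windows Console Application (C++/WinRT).
Give your project a name and location, then click OK.
In the New Universal Windows Platform Project window, set both the Target and Minimum Versions to build 17763 or later, and click OK.
Make sure the dropdown menus in the top toolbar are set to Debug and either x64 or x86, depending on your computer's architecture.
Press Ctrl+F5 to run the program without debugging. A terminal should open and display some "Hello world" text. Press any key to close it.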

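Load the model

Next, we'll load the ONNX model into our program using LearningModel.LoadFromFilePath:

In pch.h (in the Header Files folder), add the following include statements (these give us access to all the APIs we'll need):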

#include <winrt/Windows.AI.MachineLearning.h>
#include <winrt/Windows.Foundation.Collections.h>
#include <winrt/Windows.Graphics.Imaging.h>
#include <winrt/Windows.Media.h>
#include <winrt/Windows.Storage.h>

#include <string>
#include <fstream>

#include <Windows.h>

In main.cpp (in the Source Files folder), add the following using statements:

using namespace Windows::AI::MachineLearning;
using namespace Windows::Foundation::Collections;
using namespace Windows::Graphics::Imaging;
using namespace Windows::Media;
using namespace Windows::Storage;

using namespace std;

Add the following variable declarations after the using statements:

// Global variables
hstring modelPath;
string deviceName = "default";
hstring imagePath;
LearningModel model = nullptr;
LearningModelDeviceKind deviceKind = LearningModelDeviceKind::Default;
LearningModelSession session = nullptr;
LearningModelBinding binding = nullptr;
VideoFrame imageFrame = nullptr;
string labelsFilePath;
vector<string> labels;

Add the following forward declarations after the global variables:

// Forward declarations
void LoadModel();
VideoFrame LoadImageFile(hstring filePath);
void BindModel();
void EvaluateModel();
void PrintResults(IVectorView<float> results);
void LoadLabels();

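In main.cpp, remove the "Hello world" code (everything in the main function after init_apartment).
Find the SqueezeNet.onnx file in your local clone of the Windows-Machine-Learning repo. It should be located in \Windows-Machine-Learning\SharedContent\models.
Copy the file path and assign it to the modelPath variable that we defined at the top. Remember to prefix the string with an L to make it a wide character string so that it works properly with hstring, and to escape every backslash (\) with an extra backslash. For example: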
```
hstring modelPath = L"C:\\Repos\\Windows-Machine-Learning\\SharedContent\\models\\SqueezeNet.onnx";
```
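
As an alternative to doubling every backslash, standard C++ raw string literals can hold a Windows path as written. The sketch below compares the two spellings; the function names are hypothetical and exist only for illustration, and the path is the example path used above. A raw wide literal can be assigned to an hstring in the same way as the escaped one.

```cpp
#include <string>

// Hypothetical helper names, for illustration only.

// Escaped form: every backslash in the path is doubled.
std::wstring EscapedModelPath()
{
    return L"C:\\Repos\\Windows-Machine-Learning\\SharedContent\\models\\SqueezeNet.onnx";
}

// Raw form: LR"( ... )" takes the characters between the parentheses literally,
// so the path can be pasted without any extra backslashes.
std::wstring RawModelPath()
{
    return LR"(C:\Repos\Windows-Machine-Learning\SharedContent\models\SqueezeNet.onnx)";
}
```

Both functions return the same string, so which form to use is a matter of readability.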

First, we'll implement the LoadModel method. Add the following method after the main method. This method loads the model and prints how long loading took:

void LoadModel()
{
    // load the model
    printf("Loading modelfile '%ws' on the '%s' device\n", modelPath.c_str(), deviceName.c_str());
    DWORD ticks = GetTickCount();
    model = LearningModel::LoadFromFilePath(modelPath);
    ticks = GetTickCount() - ticks;
    // DWORD is an unsigned long, so print it with %lu
    printf("model file loaded in %lu ticks\n", ticks);
}

Finally, call this method from the main method:
```
LoadModel();
```
Run your program without debugging. You should see that your model loads successfully!
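
The "ticks" printed by the tutorial come from GetTickCount, which counts milliseconds and typically has a resolution of about 10 to 16 milliseconds. If finer timing is wanted, one option is the standard std::chrono::steady_clock. The sketch below is a portable alternative rather than part of the sample; MeasureMilliseconds is a hypothetical helper name.

```cpp
#include <chrono>

// Hypothetical helper: runs `work` once and returns the elapsed
// wall-clock time in milliseconds as a floating-point value.
template <typename Work>
double MeasureMilliseconds(Work&& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

In this tutorial it could wrap the load call, for example `MeasureMilliseconds([] { model = LearningModel::LoadFromFilePath(modelPath); })`, with the result printed using `%f`.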

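Load the image

Next, we'll load the image file into our program:

Add the following method. This method loads the image from the given path and creates a VideoFrame from it: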

VideoFrame LoadImageFile(hstring filePath)
{
    printf("Loading the image...\n");
    DWORD ticks = GetTickCount();
    VideoFrame inputImage = nullptr;

    try
    {
        // open the file
        StorageFile file = StorageFile::GetFileFromPathAsync(filePath).get();
        // get a stream on it
        auto stream = file.OpenAsync(FileAccessMode::Read).get();
        // Create the decoder from the stream
        BitmapDecoder decoder = BitmapDecoder::CreateAsync(stream).get();
        // get the bitmap
        SoftwareBitmap softwareBitmap = decoder.GetSoftwareBitmapAsync().get();
        // load a videoframe from it
        inputImage = VideoFrame::CreateWithSoftwareBitmap(softwareBitmap);
    }
    catch (...)
    {
        printf("failed to load the image file, make sure you are using fully qualified paths\r\n");
        exit(EXIT_FAILURE);
    }

    ticks = GetTickCount() - ticks;
    printf("image file loaded in %lu ticks\n", ticks);
    // all done
    return inputImage;
}

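Add a call to this method in the main method:
```
imageFrame = LoadImageFile(imagePath);
```
Find the media folder in your local clone of the Windows-Machine-Learning repo. It should be located at \Windows-Machine-Learning\SharedContent\media.
Choose one of the images in that folder, and assign its file path to the imagePath variable that we defined at the top. Remember to prefix it with an L to make it a wide character string, and to escape every backslash with another backslash. For example:
```
hstring imagePath = L"C:\\Repos\\Windows-Machine-Learning\\SharedContent\\media\\kitten_224.png";
```
Run the program without debugging. You should see the image load successfully!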

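Bind the input and output

Next, we'll create a session based on the model and bind the input and output from the session using LearningModelBinding.Bind. For more information on binding, see Bind a model.

Implement the BindModel method. This creates a session based on the model and device, and a binding based on that session. We then bind the input and output to variables we've created, using their names. We know ahead of time that the input feature is named "data_0" and the output feature is named "softmaxout_1". You can see these properties for any model by opening it in Netron, an online model visualization tool.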

void BindModel()
{
    printf("Binding the model...\n");
    DWORD ticks = GetTickCount();

    // now create a session and binding
    session = LearningModelSession{ model, LearningModelDevice(deviceKind) };
    binding = LearningModelBinding{ session };
    // bind the input image
    binding.Bind(L"data_0", ImageFeatureValue::CreateFromVideoFrame(imageFrame));
    // bind the output
    vector<int64_t> shape({ 1, 1000, 1, 1 });
    binding.Bind(L"softmaxout_1", TensorFloat::Create(shape));

    ticks = GetTickCount() - ticks;
    printf("Model bound in %lu ticks\n", ticks);
}

Add a call to BindModel from the main method:
```
BindModel();
```
Run the program without debugging. The model's input and output should be bound successfully. We're almost there!

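Evaluate the model

We're now at the last step in the diagram at the beginning of this tutorial: Evaluate. We'll evaluate the model using LearningModelSession.Evaluate:

Implement the EvaluateModel method. This method takes our session and evaluates it using our binding and a correlation ID. The correlation ID is something we could later use to match a particular evaluation call to its output results. Again, we know ahead of time that the output's name is "softmaxout_1".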

void EvaluateModel()
{
    // now run the model
    printf("Running the model...\n");
    DWORD ticks = GetTickCount();

    auto results = session.Evaluate(binding, L"RunId");

    ticks = GetTickCount() - ticks;
    printf("model run took %lu ticks\n", ticks);

    // get the output
    auto resultTensor = results.Outputs().Lookup(L"softmaxout_1").as<TensorFloat>();
    auto resultVector = resultTensor.GetAsVectorView();
    PrintResults(resultVector);
}

Now let's implement PrintResults. This method finds the three highest probabilities for what object could be in the image, and prints them:

void PrintResults(IVectorView<float> results)
{
    // load the labels
    LoadLabels();
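    // Find the top 3 probabilities, kept in descending order
    vector<float> topProbabilities(3);
    vector<int> topProbabilityLabelIndexes(3);
    // SqueezeNet returns a list of 1000 options, with probabilities for each, loop through all
    for (uint32_t i = 0; i < results.Size(); i++)
    {
        float probability = results.GetAt(i);
        // is it one of the top 3?
        for (int j = 0; j < 3; j++)
        {
            if (probability > topProbabilities[j])
            {
                // shift the lower-ranked entries down one slot so they are not lost
                for (int k = 2; k > j; k--)
                {
                    topProbabilities[k] = topProbabilities[k - 1];
                    topProbabilityLabelIndexes[k] = topProbabilityLabelIndexes[k - 1];
                }
                topProbabilityLabelIndexes[j] = i;
                topProbabilities[j] = probability;
                break;
            }
        }
    }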
    // Display the result
    for (int i = 0; i < 3; i++)
    {
        printf("%s with confidence of %f\n", labels[topProbabilityLabelIndexes[i]].c_str(), topProbabilities[i]);
    }
}
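
The same top-N selection can also be written with the standard library. The sketch below assumes the probabilities have already been copied out of the IVectorView into a std::vector<float>; TopKIndexes is a hypothetical helper name, not part of the sample or of Windows ML.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: returns the indexes of the k largest scores, highest first.
// If k exceeds the number of scores, all indexes are returned in sorted order.
std::vector<std::size_t> TopKIndexes(const std::vector<float>& scores, std::size_t k)
{
    // Start with the indexes 0, 1, 2, ... so each score keeps its label index.
    std::vector<std::size_t> indexes(scores.size());
    std::iota(indexes.begin(), indexes.end(), std::size_t{ 0 });

    k = std::min(k, indexes.size());

    // Order only the first k positions, comparing indexes by their score.
    std::partial_sort(indexes.begin(),
        indexes.begin() + static_cast<std::ptrdiff_t>(k),
        indexes.end(),
        [&scores](std::size_t a, std::size_t b) { return scores[a] > scores[b]; });

    indexes.resize(k);
    return indexes;
}
```

Each returned index can then be used to look up both the probability and the matching entry in labels. Sorting indexes rather than values keeps the link between a score and its label.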

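We also need to implement LoadLabels. This method opens the labels file, which contains all of the different objects that the model can recognize, and parses it: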

void LoadLabels()
{
    // Parse labels from labels file.  We know the file's entries are already sorted in order.
    ifstream labelFile{ labelsFilePath, ifstream::in };
    if (labelFile.fail())
    {
        printf("failed to load the %s file.  Make sure it exists in the same folder as the app\r\n", labelsFilePath.c_str());
        exit(EXIT_FAILURE);
    }

    std::string s;
    while (std::getline(labelFile, s, ','))
    {
        int labelValue = atoi(s.c_str());
        if (labelValue >= labels.size())
        {
            labels.resize(labelValue + 1);
        }
        std::getline(labelFile, s);
        labels[labelValue] = s;
    }
}
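
To make the parsing easier to follow in isolation, here is a sketch that reads the same kind of data from any input stream. It assumes, as the code above does, that each line has the form "index,label" and that a label may itself contain commas. ParseLabels is a hypothetical helper name; like the original code, it trusts the index values in the file.

```cpp
#include <cctype>
#include <cstddef>
#include <exception>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parses "index,label" lines from an input stream.
// Lines that do not start with a numeric index followed by a comma are skipped.
std::vector<std::string> ParseLabels(std::istream& input)
{
    std::vector<std::string> parsed;
    std::string line;
    while (std::getline(input, line))
    {
        // Split at the first comma only, so commas inside the label are kept.
        const std::size_t comma = line.find(',');
        if (comma == std::string::npos || comma == 0 ||
            !std::isdigit(static_cast<unsigned char>(line[0])))
        {
            continue;
        }

        std::size_t index = 0;
        try
        {
            index = std::stoul(line.substr(0, comma));
        }
        catch (const std::exception&)
        {
            continue; // the index was too large to represent
        }

        if (index >= parsed.size())
        {
            parsed.resize(index + 1);
        }
        parsed[index] = line.substr(comma + 1);
    }
    return parsed;
}
```

Reading from a std::istream rather than a file path means the same function works with an std::ifstream in the application and with an std::istringstream when experimenting.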

Find the Labels.txt file in your local clone of the Windows-Machine-Learning repo. It should be in \Windows-Machine-Learning\Samples\SqueezeNetObjectDetection\Desktop\cpp.
Assign this file path to the labelsFilePath variable that we defined at the top. Make sure to escape every backslash with another backslash. For example:
```
string labelsFilePath = "C:\\Repos\\Windows-Machine-Learning\\Samples\\SqueezeNetObjectDetection\\Desktop\\cpp\\Labels.txt";
```
Add a call to EvaluateModel in the main method:
```
EvaluateModel();
```

Run the program without debugging. It should now correctly recognize what's in the image! Here is an example of what it might output:

Loading modelfile 'C:\Repos\Windows-Machine-Learning\SharedContent\models\SqueezeNet.onnx' on the 'default' device
model file loaded in 250 ticks
Loading the image...
image file loaded in 78 ticks
Binding the model...
Model bound in 15 ticks
Running the model...
model run took 16 ticks
tabby, tabby cat with confidence of 0.931461
Egyptian cat with confidence of 0.065307
Persian cat with confidence of 0.000193

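Next steps

Hooray, you've got object detection working in a C++ desktop application! Next, you can try using command-line arguments to supply the model and image files rather than hardcoding them, similar to what the sample on GitHub does. You could also try running the evaluation on a different device, such as the GPU, to see how the performance differs.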
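
One possible shape for the command-line handling is sketched below in standard C++. The option names (-model, -image, -device) and the Options struct are hypothetical and chosen only for illustration; they are not taken from the GitHub sample.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical options structure, for illustration only.
struct Options
{
    std::string modelPath;
    std::string imagePath;
    std::string deviceName = "default";
    bool valid = true;
};

// Parses "-model <path> -image <path> [-device <name>]".
// Marks the result invalid on an unknown option, a missing value,
// or when the model or image path was not supplied.
Options ParseOptions(const std::vector<std::string>& args)
{
    Options options;
    for (std::size_t i = 0; i < args.size(); ++i)
    {
        const bool hasValue = i + 1 < args.size();
        if (args[i] == "-model" && hasValue)
        {
            options.modelPath = args[++i];
        }
        else if (args[i] == "-image" && hasValue)
        {
            options.imagePath = args[++i];
        }
        else if (args[i] == "-device" && hasValue)
        {
            options.deviceName = args[++i];
        }
        else
        {
            options.valid = false;
        }
    }
    if (options.modelPath.empty() || options.imagePath.empty())
    {
        options.valid = false;
    }
    return options;
}
```

From `main(int argc, char* argv[])`, the arguments could be passed as `std::vector<std::string>(argv + 1, argv + argc)`. The narrow strings would then need converting before being assigned to the hstring globals, for example with winrt::to_hstring.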

Play around with the other samples on GitHub and extend them however you like!

See also

Note

Use the following resources for help with Windows ML:

To ask or answer technical questions about Windows ML, use the windows-machine-learning tag on Stack Overflow.
To report a bug, file an issue on GitHub.
