WebNN API Tutorial

Article
05/22/2024

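For an introduction to WebNN, including information about operating system support, model support, and more, see the WebNN Overview.

This tutorial shows you how to use the WebNN API to build an image classification system on the web that is hardware accelerated by the on-device GPU. You'll use the MobileNetV2 model, an open-source image classification model available on Hugging Face.

If you want to view and run the final code for this tutorial, you can find it in our WebNN Developer Preview GitHub repository.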

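Note

The WebNN API is a W3C Candidate Recommendation and is in the early stages of a developer preview. Some functionality is limited. We maintain a list of current support and implementation status.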

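Requirements and setup:

Setting up Windows

Make sure you have the correct versions of Edge, Windows, and hardware drivers, as detailed in the WebNN requirements section.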

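Setting up Edge

Download and install Microsoft Edge Dev.
Launch the Edge build you installed, then navigate to about:flags in the address bar.
Search for "WebNN API", click the dropdown, and set it to [Enabled].
Restart Edge when prompted.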

Image: WebNN enabled in Edge Beta
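
After restarting, it can help to confirm from script that the flag took effect. The sketch below assumes the browser exposes WebNN through navigator.ml with a createContext method, as the W3C WebNN specification describes. The helper name hasWebNN is hypothetical and is not part of this tutorial's code.

```javascript
// Hypothetical helper: reports whether a navigator-like object exposes WebNN.
// Assumption: WebNN is surfaced as `navigator.ml.createContext`, per the W3C spec.
function hasWebNN(nav) {
  return Boolean(nav && nav.ml && typeof nav.ml.createContext === "function");
}

// In a page you would pass the real navigator:
//   console.log(hasWebNN(navigator));
// The calls below use stand-in objects so the sketch also runs outside a browser.
const withWebNN = { ml: { createContext: async () => ({}) } };
const withoutWebNN = {};
console.log(hasWebNN(withWebNN));
console.log(hasWebNN(withoutWebNN));
```

If the check reports false in Edge, revisit about:flags and the driver requirements before continuing.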

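Setting up the developer environment

Download and install Visual Studio Code (VS Code).
Launch VS Code.
From within VS Code, download and install the Live Server extension for VS Code.
Select File --> Open Folder, and create an empty folder in your desired location.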

Step 1: Initialize the web app

To begin, create a new index.html page. Add the following boilerplate code to your new page:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>My Website</title>
  </head>
  <body>
    <main>
        <h1>Welcome to My Website</h1>
    </main>
  </body>
</html>

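Verify that the boilerplate code and developer setup work by selecting the [Go Live] button at the bottom right of VS Code. This should launch a local server in Edge that serves the boilerplate page.
Now, create a new file called main.js. This will contain the JavaScript code for your app.
Next, create a subfolder named images in the root directory. Download and save any image in this folder. For this demo, we'll use the default name image.jpg.
Download the MobileNet model from the ONNX Model Zoo. In this tutorial, you'll use the mobilenetv2-10.onnx file. Save this model to the root folder of your web app; its file name must match the modelPath you set in main.js in Step 4.
Finally, download the image classes file, imagenetClasses.js, and save it in the root folder of your web app. It provides 1000 common image classifications for your model.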
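
The post-processing code in Step 5 looks up each class by its numeric index and reads a two-item entry: an identifier and a name that may contain underscores. A minimal sketch of that assumed layout follows. The three entries are the first three ImageNet-1k classes; the real file is expected to define imagenetClasses with all 1000. The names sampleClasses and classNameAt are hypothetical.

```javascript
// Assumed layout of imagenetClasses.js: index -> [WordNet id, class name].
// Only three entries are shown here; the real file is expected to hold 1000.
const sampleClasses = {
  0: ["n01440764", "tench"],
  1: ["n01443537", "goldfish"],
  2: ["n01484850", "great_white_shark"],
};

// Hypothetical helper mirroring how Step 5 reads an entry:
// take the id as-is and replace underscores in the name with spaces.
function classNameAt(classes, index) {
  const [id, rawName] = classes[index];
  return { id, name: rawName.replace(/_/g, " ") };
}

console.log(classNameAt(sampleClasses, 2));
```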

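Step 2: Add UI elements and parent function

Inside the body of the <main> HTML tag that you added in the previous step, replace the existing code with the following elements. These create a button and display a default image.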

<h1>Image Classification Demo!</h1> 
<div><img src="./images/image.jpg"></div> 
<button onclick="classifyImage('./images/image.jpg')"  type="button">Click Me to Classify Image!</button> 
<h1 id="outputText"> This image displayed is ... </h1>

Now, you'll add ONNX Runtime Web to your page, which is the JavaScript library you'll use to access the WebNN API. Inside the <head> HTML tag, add the following JavaScript source links.

<script src="./main.js"></script> 
<script src="imagenetClasses.js"></script>
<script src="https://cdn.jsdelivr.net/npm/onnxruntime-web@1.18.0-dev.20240311-5479124834/dist/ort.webgpu.min.js"></script>

Open your main.js file, and add the following code snippet.

async function classifyImage(pathToImage){ 
  var imageTensor = await getImageTensorFromPath(pathToImage); // Convert image to a tensor
  var predictions = await runModel(imageTensor); // Run inference on the tensor
  console.log(predictions); // Print predictions to console
  document.getElementById("outputText").innerHTML += predictions[0].name; // Display prediction in HTML
}
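
Note that classifyImage appends to the heading with +=, so repeated clicks keep adding labels. One way to avoid that is to build the whole message in a small pure helper and assign it, sketched here under the assumption that predictions is the array of { name, probability } objects returned by runModel. The helper name describeTopPrediction is hypothetical.

```javascript
// Hypothetical helper: turns the prediction list into the full heading text.
// Assumption: each prediction has a `name` string and a `probability` in [0, 1].
function describeTopPrediction(predictions) {
  if (!Array.isArray(predictions) || predictions.length === 0) {
    return "This image displayed is ... (no prediction)";
  }
  const top = predictions[0];
  const percent = (top.probability * 100).toFixed(1);
  return `This image displayed is ... ${top.name} (${percent}%)`;
}

// Usage in the page (assignment replaces the text rather than appending):
//   document.getElementById("outputText").textContent = describeTopPrediction(predictions);
const example = [{ name: "tabby", probability: 0.75 }];
console.log(describeTopPrediction(example));
```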

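Step 3: Pre-process data

The function you just added calls getImageTensorFromPath, another function you have to implement. You'll add it below, along with another async function it calls to retrieve the image itself. Both rely on the Jimp library, whose script tag you'll add in Step 5.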

  async function getImageTensorFromPath(path, width = 224, height = 224) {
    var image = await loadImagefromPath(path, width, height); // 1. load the image
    var imageTensor = imageDataToTensor(image); // 2. convert to tensor
    return imageTensor; // 3. return the tensor
  } 

  async function loadImagefromPath(path, resizedWidth, resizedHeight) {
    var imageData = await Jimp.read(path).then(imageBuffer => { // Use Jimp to load the image and resize it.
      return imageBuffer.resize(resizedWidth, resizedHeight);
    });

    return imageData.bitmap;
  }

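You also need to add the imageDataToTensor function referenced above, which converts the loaded image into the tensor format that the ONNX model expects. This is a more involved function, though it may look familiar if you've worked with similar image classification apps before. For an extended explanation, you can view this ONNX tutorial.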

  function imageDataToTensor(image) {
    var imageBufferData = image.data;
    let pixelCount = image.width * image.height;
    const float32Data = new Float32Array(3 * pixelCount); // Allocate enough space for red/green/blue channels.

    // Loop through the image buffer, extracting the (R, G, B) channels, rearranging from
    // packed channels to planar channels, and converting to floating point.
    for (let i = 0; i < pixelCount; i++) {
      float32Data[pixelCount * 0 + i] = imageBufferData[i * 4 + 0] / 255.0; // Red
      float32Data[pixelCount * 1 + i] = imageBufferData[i * 4 + 1] / 255.0; // Green
      float32Data[pixelCount * 2 + i] = imageBufferData[i * 4 + 2] / 255.0; // Blue
      // Skip the unused alpha channel: imageBufferData[i * 4 + 3].
    }
    let dimensions = [1, 3, image.height, image.width];
    const inputTensor = new ort.Tensor("float32", float32Data, dimensions);
    return inputTensor;
  }
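
To make the layout change concrete, here is the same packed-to-planar conversion applied to a tiny 2x1 image, without the ONNX Runtime dependency. The function name rgbaToPlanarFloat32 and the sample pixel values are illustrative assumptions; the indexing mirrors imageDataToTensor above.

```javascript
// Illustrative helper: converts packed RGBA bytes (R,G,B,A,R,G,B,A,...) into
// planar floats (all R, then all G, then all B), scaled to the range [0, 1].
function rgbaToPlanarFloat32(rgba, width, height) {
  const pixelCount = width * height;
  const planar = new Float32Array(3 * pixelCount);
  for (let i = 0; i < pixelCount; i++) {
    planar[pixelCount * 0 + i] = rgba[i * 4 + 0] / 255.0; // Red plane
    planar[pixelCount * 1 + i] = rgba[i * 4 + 1] / 255.0; // Green plane
    planar[pixelCount * 2 + i] = rgba[i * 4 + 2] / 255.0; // Blue plane
    // The alpha byte at rgba[i * 4 + 3] is skipped.
  }
  return planar;
}

// Two pixels: pure red, then pure blue.
const rgba = new Uint8Array([255, 0, 0, 255, 0, 0, 255, 255]);
const planar = rgbaToPlanarFloat32(rgba, 2, 1);
console.log(Array.from(planar)); // [1, 0, 0, 0, 0, 1]
```

The red plane comes first ([1, 0]), then green ([0, 0]), then blue ([0, 1]), which is the channel-first order implied by the [1, 3, height, width] dimensions.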

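Step 4: Call WebNN

You've now added all the functions needed to retrieve your image and convert it to a tensor. Now, using the ONNX Runtime Web library that you loaded above, you'll run your model. Note that to use WebNN here, you simply specify executionProvider = "webnn"; ONNX Runtime's support makes it very straightforward to enable WebNN.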

  async function runModel(preprocessedData) { 
    // Set up environment.
    ort.env.wasm.numThreads = 1; 
    ort.env.wasm.simd = true; 
    ort.env.wasm.proxy = true; 
    ort.env.logLevel = "verbose";  
    ort.env.debug = true; 

    // Configure WebNN.
    const executionProvider = "webnn"; // Other options: webgpu 
    const modelPath = "./mobilenetv2-10.onnx"; // Must match the model file saved in Step 1.
    const options = {
      executionProviders: [{ name: executionProvider, deviceType: "gpu", powerPreference: "default" }],
      freeDimensionOverrides: {"batch": 1, "channels": 3, "height": 224, "width": 224}
    };
    // Declare the session locally instead of creating an implicit global.
    const modelSession = await ort.InferenceSession.create(modelPath, options); 

    // Create feeds with the input name from model export and the preprocessed data. 
    const feeds = {}; 
    feeds[modelSession.inputNames[0]] = preprocessedData; 
    // Run the session inference.
    const outputData = await modelSession.run(feeds); 
    // Get output results with the output name from the model export. 
    const output = outputData[modelSession.outputNames[0]]; 
    // Get the softmax of the output data. The softmax transforms values to be between 0 and 1.
    var outputSoftmax = softmax(Array.prototype.slice.call(output.data)); 
    // Get the top 5 results.
    var results = imagenetClassesTopK(outputSoftmax, 5);

    return results; 
  }
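
Because executionProviders is a list, one option is to name a fallback after WebNN. The sketch below assumes that ONNX Runtime Web tries the entries in order and that "wasm" (its CPU WebAssembly provider) is available in the build you load; neither point is confirmed by this tutorial. buildSessionOptions is a hypothetical helper.

```javascript
// Hypothetical helper: builds session options that prefer WebNN and then
// fall back to another provider. Assumption: entries are tried in order.
function buildSessionOptions(deviceType = "gpu", fallback = "wasm") {
  return {
    executionProviders: [
      { name: "webnn", deviceType, powerPreference: "default" },
      fallback,
    ],
    freeDimensionOverrides: { batch: 1, channels: 3, height: 224, width: 224 },
  };
}

// In runModel you could then write:
//   const modelSession = await ort.InferenceSession.create(modelPath, buildSessionOptions());
const sessionOptions = buildSessionOptions();
console.log(JSON.stringify(sessionOptions.executionProviders));
```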

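Step 5: Post-process data

Finally, you'll add a softmax function, then add your final function to return the most likely image classification. The softmax transforms your values to lie between 0 and 1 and sum to 1, which is the probability form needed for this final classification.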
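
Written out for raw model outputs $z_1, \dots, z_K$, the softmax in its numerically stable form subtracts the largest value $m$ before exponentiating, which is what the softmax function in main.js does. The symbols $z$, $K$, and $m$ are notation introduced here for illustration.

```latex
\operatorname{softmax}(z)_i = \frac{e^{z_i - m}}{\sum_{j=1}^{K} e^{z_j - m}},
\qquad m = \max_{1 \le j \le K} z_j
```

Subtracting $m$ leaves the result unchanged, because the factor $e^{-m}$ cancels between numerator and denominator, but it keeps every exponent at or below zero so the exponentials cannot overflow.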

First, add the following source scripts for the helper libraries Jimp and Lodash to the <head> tag of your index.html.

<script src="https://cdnjs.cloudflare.com/ajax/libs/jimp/0.22.12/jimp.min.js" integrity="sha512-8xrUum7qKj8xbiUrOzDEJL5uLjpSIMxVevAM5pvBroaxJnxJGFsKaohQPmlzQP8rEoAxrAujWttTnx3AMgGIww==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
<script src="https://cdn.jsdelivr.net/npm/lodash@4.17.21/lodash.min.js"></script>

Now, add the following functions to main.js.

// The softmax transforms values to be between 0 and 1.
function softmax(resultArray) {
  // Get the largest value in the array.
  const largestNumber = Math.max(...resultArray);
  // Apply the exponential function to each result item subtracted by the largest number, using reduction to get the
  // previous result number and the current number to sum all the exponentials results.
  const sumOfExp = resultArray 
    .map(resultItem => Math.exp(resultItem - largestNumber)) 
    .reduce((prevNumber, currentNumber) => prevNumber + currentNumber);

  // Normalize the resultArray by dividing by the sum of all exponentials.
  // This normalization ensures that the sum of the components of the output vector is 1.
  return resultArray.map((resultValue, index) => {
    return Math.exp(resultValue - largestNumber) / sumOfExp
  });
}

function imagenetClassesTopK(classProbabilities, k = 5) { 
  const probs = _.isTypedArray(classProbabilities)
    ? Array.prototype.slice.call(classProbabilities)
    : classProbabilities;

  const sorted = _.reverse(
    _.sortBy(
      probs.map((prob, index) => [prob, index]),
      probIndex => probIndex[0]
    )
  );

  const topK = _.take(sorted, k).map(probIndex => {
    const iClass = imagenetClasses[probIndex[1]]
    return {
      id: iClass[0],
      index: parseInt(probIndex[1].toString(), 10),
      name: iClass[1].replace(/_/g, " "),
      probability: probIndex[0]
    }
  });
  return topK;
}
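
As a quick sanity check of the two functions, the sketch below runs the same post-processing on three made-up scores. It restates softmax compactly and swaps Lodash for the built-in Array sort, so it is a variation on the code above rather than the tutorial's implementation. topKIndices and the sample logits are hypothetical.

```javascript
// Compact restatement of the softmax above (shift by the max, exponentiate, normalize).
function softmax(values) {
  const max = Math.max(...values);
  const exps = values.map(v => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / sum);
}

// Hypothetical Lodash-free variant: pair each probability with its index,
// sort in descending order of probability, and keep the first k entries.
function topKIndices(probabilities, k) {
  return probabilities
    .map((probability, index) => ({ index, probability }))
    .sort((a, b) => b.probability - a.probability)
    .slice(0, k);
}

// Made-up scores; without the max shift, Math.exp(1000) would overflow to Infinity.
const logits = [1000, 1002, 1001];
const probs = softmax(logits);
const top2 = topKIndices(probs, 2);
console.log(top2.map(r => r.index)); // [1, 2]
```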

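You've now added all the scripting needed to run image classification with WebNN in your basic web app. Using the Live Server extension for VS Code, you can now launch your basic web page in-app to see the classification results for yourself.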
