Hello Benedikt Schmitt,
Welcome to the Microsoft Q&A and thank you for posting your questions here.
I understand that you are having trouble with the COCO file structure when importing labels into Azure ML.
Try the steps below to resolve the issue:
OPTION 1:
Step 1:
Register your data asset correctly in Azure ML by ensuring your images are stored in a blob container (e.g., `my-container/images/*.jpg`).
When registering the dataset in Azure ML:
- Set the datastore to point to your blob container.
- Set the path to the folder containing your images (e.g., `images/`).
- This defines the "root" directory for your images in Azure ML.
Step 2:
In CVAT, export the annotations as COCO. By default, CVAT might write absolute paths or paths relative to its own system, so modify each `file_name` in the COCO file to match the path relative to your Azure ML data asset's root. For example, if your data asset is registered at `images/` and your image is at `images/folder1/img.jpg`, the `file_name` should be `folder1/img.jpg`.
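The rewrite described in this step can be scripted. The following is only a minimal sketch, not an official tool: the function name `make_relative` and the sample paths are hypothetical, and it assumes the asset root folder (here `images`) appears as a path segment in each exported `file_name`.

```python
import json
from pathlib import PurePosixPath


def make_relative(coco: dict, asset_root: str) -> dict:
    """Rewrite each image's file_name to be relative to asset_root.

    asset_root is the folder the data asset points to, e.g. "images".
    Entries that do not contain that folder are left unchanged.
    """
    root_parts = PurePosixPath(asset_root.strip("/")).parts
    if not root_parts:
        return coco  # nothing to strip
    n = len(root_parts)
    for image in coco.get("images", []):
        # Normalise Windows separators, then split into path segments
        parts = PurePosixPath(image["file_name"].replace("\\", "/")).parts
        # Stop early enough that at least one segment remains after the root
        for i in range(len(parts) - n):
            if parts[i:i + n] == root_parts:
                image["file_name"] = "/".join(parts[i + n:])
                break
    return coco


if __name__ == "__main__":
    # Hypothetical export with an absolute path
    coco = {"images": [{"id": 1, "file_name": "/data/images/folder1/img.jpg"}]}
    print(json.dumps(make_relative(coco, "images")["images"]))
```

To apply it to a real export, load the file with `json.load`, call the function, and write the result back with `json.dump`.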
Step 3:
This is an example of a COCO file for Azure ML:
```json
{
  "images": [
    {
      "id": 1,
      "file_name": "folder1/image1.jpg",
      "width": 640,
      "height": 480
    }
  ],
  "annotations": [
    {
      "id": 1,
      "image_id": 1,
      "category_id": 1,
      "bbox": [100, 50, 200, 300],
      "area": 60000,
      "segmentation": [],
      "iscrowd": 0
    }
  ],
  "categories": [
    {
      "id": 1,
      "name": "cat"
    }
  ]
}
```
In this example, `file_name` is relative to the Azure ML data asset root, `bbox` is `[x, y, width, height]`, and `area` is a required field equal to width * height (200 * 300 = 60000 here); compute it if it is missing.
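A quick structural check before importing can catch broken references between the three sections. This is a rough sketch under my own assumptions: `check_coco` is a hypothetical helper, and the fields it checks are simply the ones shown in the example above, not an official list of what Azure ML validates.

```python
def check_coco(coco: dict) -> list:
    """Return a list of human-readable problems found in a COCO dict."""
    problems = []
    images = coco.get("images", [])
    image_ids = {img.get("id") for img in images}
    category_ids = {cat.get("id") for cat in coco.get("categories", [])}

    # Every image needs the fields shown in the example
    for img in images:
        for key in ("id", "file_name", "width", "height"):
            if key not in img:
                problems.append(f"image {img.get('id')}: missing '{key}'")

    # Every annotation must point at an existing image and category
    for ann in coco.get("annotations", []):
        label = f"annotation {ann.get('id')}"
        if ann.get("image_id") not in image_ids:
            problems.append(f"{label}: image_id not found in 'images'")
        if ann.get("category_id") not in category_ids:
            problems.append(f"{label}: category_id not found in 'categories'")
        bbox = ann.get("bbox")
        if not (isinstance(bbox, list) and len(bbox) == 4):
            problems.append(f"{label}: bbox must be [x, y, width, height]")
        if "area" not in ann:
            problems.append(f"{label}: missing 'area'")
    return problems


if __name__ == "__main__":
    sample = {
        "images": [{"id": 1, "file_name": "folder1/image1.jpg", "width": 640, "height": 480}],
        "annotations": [{"id": 1, "image_id": 2, "category_id": 1, "bbox": [100, 50, 200, 300]}],
        "categories": [{"id": 1, "name": "cat"}],
    }
    for problem in check_coco(sample):
        print(problem)
```

An empty list means none of these particular problems were found; it does not guarantee that the import will succeed.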
Step 4:
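Before importing, verify that paths in the COCO file match the data asset:
```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
# Assumption: the azureml-fsspec package is installed (pip install azureml-fsspec)
from azureml.fsspec import AzureMachineLearningFileSystem

ml_client = MLClient.from_config(DefaultAzureCredential())
dataset = ml_client.data.get(name="your-data-asset-name", version="1")

# dataset.path is a single URI string for the asset root, not a list of files
print(dataset.path)

# List the entries under the asset root (assumes dataset.path is an azureml:// URI)
fs = AzureMachineLearningFileSystem(dataset.path)
files = fs.ls()
print(files)  # Check that the folders and files you expect (e.g. "folder1") appear here
```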
Step 5:
Use the Azure ML labeling UI to import the COCO file. Ensure you select the correct data asset during import.
Step 6:
For a "File not found" error: use the SDK code above to confirm that each `file_name` in the COCO JSON exists among the files in the data asset.
For "Empty Labels After Import": validate that the `area` field is populated (Azure ML requires it). If it is missing, compute it as `area = bbox[2] * bbox[3]`.
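If many annotations lack `area`, it can be filled in with a few lines of Python. This is a small sketch, with `fill_missing_area` as a hypothetical helper name; it applies the `bbox[2] * bbox[3]` rule from this step and leaves existing values untouched.

```python
def fill_missing_area(coco: dict) -> int:
    """Set area = bbox width * bbox height where area is absent.

    Returns the number of annotations that were updated.
    """
    fixed = 0
    for ann in coco.get("annotations", []):
        bbox = ann.get("bbox", [])
        if ann.get("area") is None and len(bbox) == 4:
            # COCO bbox layout is [x, y, width, height]
            ann["area"] = bbox[2] * bbox[3]
            fixed += 1
    return fixed


if __name__ == "__main__":
    coco = {"annotations": [{"id": 1, "bbox": [100, 50, 200, 300]}]}
    print(fill_missing_area(coco), coco["annotations"][0]["area"])
```

Note that for polygon annotations the bounding-box area is only an approximation of the true segmentation area.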
OPTION 2:
Step 1: Ensure that your Azure ML data asset is correctly registered and points to the right storage location. You can use the following code to list the contents of your registered dataset:
```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
# Assumption: the azureml-fsspec package is installed (pip install azureml-fsspec)
from azureml.fsspec import AzureMachineLearningFileSystem

ml_client = MLClient.from_config(DefaultAzureCredential())

# Replace with your actual data asset name
dataset_name = "your-data-asset-name"

# Retrieve dataset details; "latest" is a label, not a version number
dataset = ml_client.data.get(name=dataset_name, label="latest")

# dataset.path is one URI string, so list the entries under it
# (assumes it is an azureml:// URI that azureml-fsspec can open)
fs = AzureMachineLearningFileSystem(dataset.path)
file_paths = fs.ls()
print(file_paths)
```
This will help confirm whether the dataset contains the expected image file paths.
Step 2: Azure ML requires that the `file_name` field in the COCO file match the relative path within the registered dataset. A properly formatted COCO JSON should look like this:
```json
{
  "images": [
    {
      "id": 1,
      "file_name": "images/folder1/image1.jpg",
      "width": 640,
      "height": 480
    }
  ],
  "annotations": [
    {
      "id": 1,
      "image_id": 1,
      "category_id": 1,
      "bbox": [100, 50, 200, 300],
      "area": 60000,
      "segmentation": [],
      "iscrowd": 0
    }
  ],
  "categories": [
    {
      "id": 1,
      "name": "cat"
    }
  ]
}
```
Ensure that `file_name` matches your dataset structure.
Step 3: Use the following code snippet to check if a specific file path exists in the dataset:
```python
search_file = "images/folder1/image1.jpg"  # Adjust based on your dataset
exists = any(search_file in path for path in file_paths)
if exists:
    print(f"File {search_file} exists in dataset.")
else:
    print(f"File {search_file} NOT found. Check dataset structure.")
```
This helps verify if your COCO file paths match those in Azure ML.
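To check every image in the COCO file at once rather than a single path, the same idea can be wrapped in a function. This is a sketch only: `find_missing` is a hypothetical helper, and it assumes `file_paths` is a flat list of path strings like the one produced in Step 1.

```python
def find_missing(coco: dict, file_paths: list) -> list:
    """Return the COCO file_name values that match no entry in file_paths."""
    normalised = [p.replace("\\", "/") for p in file_paths]
    missing = []
    for image in coco.get("images", []):
        name = image["file_name"].replace("\\", "/").lstrip("/")
        # Match the whole path or a suffix that starts on a folder boundary
        if not any(p == name or p.endswith("/" + name) for p in normalised):
            missing.append(image["file_name"])
    return missing


if __name__ == "__main__":
    # Hypothetical listing and COCO content
    listing = ["my-container/images/folder1/image1.jpg"]
    coco = {"images": [
        {"id": 1, "file_name": "images/folder1/image1.jpg"},
        {"id": 2, "file_name": "images/folder1/image2.jpg"},
    ]}
    print(find_missing(coco, listing))
```

Matching on a folder boundary avoids false positives such as `image1.jpg` matching `old_image1.jpg`, which a plain substring test would accept.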
Step 4: Once the COCO file is structured correctly, re-import the labels and validate:
- Go to the Azure ML labeling UI.
- Select the correct dataset (verify that images are loading properly).
- Import the COCO file and check if labels appear correctly.
The official documentation on image labeling and COCO dataset structure for Azure ML is here: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-use-image-labeling
NOTE: If the link returns a 404 error, select the relevant topic from the list of articles to navigate to it.
I hope this is helpful! Do not hesitate to let me know if you have any other questions.
Please don't forget to close the thread by upvoting this reply and accepting it as an answer if it is helpful.