Data schemas to train computer vision models with automated machine learning (v1)

Article
09/01/2024

Important

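Some of the Azure CLI commands in this article use the azure-cli-ml, or v1, extension for Azure Machine Learning. Support for the v1 extension ends on September 30, 2025. You can install and use the v1 extension until that date.

We recommend that you transition to the ml, or v2, extension before September 30, 2025. For more information on the v2 extension, see Azure ML CLI extension and Python SDK v2.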

Important

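This feature is currently in public preview. This preview version is provided without a service-level agreement. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.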

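Learn how to format your JSONL files for data consumption in automated ML experiments for computer vision tasks during training and inference.

Data schema for training

Azure Machine Learning AutoML for Images requires input image data to be prepared in JSONL (JSON Lines) format. This section describes the input data formats, or schemas, for image classification multi-class, image classification multi-label, object detection, and instance segmentation. It also provides a sample of a final training or validation JSON Lines file for each task.

Image classification (binary/multi-class)

Input data format/schema in each JSON Line: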

{
   "image_url":"AmlDatastore://data_directory/../Image_name.image_format",
   "image_details":{
      "format":"image_format",
      "width":"image_width",
      "height":"image_height"
   },
   "label":"class_name"
}

Key	Description	Example
`image_url`	Image location in the Azure Machine Learning datastore `Required, String`	`"AmlDatastore://data_directory/Image_01.jpg"`
`image_details`	Image details `Optional, Dictionary`	`"image_details":{"format": "jpg", "width": "400px", "height": "258px"}`
`format`	Image type (all image formats available in the Pillow library are supported) `Optional, String from {"jpg", "jpeg", "png", "jpe", "jfif","bmp", "tif", "tiff"}`	`"jpg" or "jpeg" or "png" or "jpe" or "jfif" or "bmp" or "tif" or "tiff"`
`width`	Width of the image `Optional, String or Positive Integer`	`"400px" or 400`
`height`	Height of the image `Optional, String or Positive Integer`	`"200px" or 200`
`label`	Class/label of the image `Required, String`	`"cat"`

Example of a JSONL file for multi-class image classification:

{"image_url": "AmlDatastore://image_data/Image_01.jpg", "image_details":{"format": "jpg", "width": "400px", "height": "258px"}, "label": "can"}
{"image_url": "AmlDatastore://image_data/Image_02.jpg", "image_details": {"format": "jpg", "width": "397px", "height": "296px"}, "label": "milk_bottle"}
.
.
.
{"image_url": "AmlDatastore://image_data/Image_n.jpg", "image_details": {"format": "jpg", "width": "1024px", "height": "768px"}, "label": "water_bottle"}

Image example for image classification multi-class.
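
As a minimal sketch, a file like the one above could be produced in Python from a list of (path, label) pairs. The helper name `make_classification_line` and the sample pairs are hypothetical and only for illustration; only the `image_url` and `label` keys come from the schema, and the optional `image_details` key is left out.

```python
import json

def make_classification_line(datastore_path, label):
    """Build one JSON Line for multi-class classification (required keys only)."""
    record = {
        "image_url": f"AmlDatastore://{datastore_path}",
        "label": label,
    }
    return json.dumps(record)

# Hypothetical (path, label) pairs used only for illustration.
samples = [
    ("image_data/Image_01.jpg", "can"),
    ("image_data/Image_02.jpg", "milk_bottle"),
]
lines = [make_classification_line(path, label) for path, label in samples]
jsonl_text = "\n".join(lines)  # one JSON object per line
print(jsonl_text)
```

Each record is serialized with `json.dumps` on its own line, which avoids hand-written JSON mistakes such as trailing commas.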

Image classification multi-label

The following is the input data format/schema in each JSON Line for image classification multi-label.

{
   "image_url":"AmlDatastore://data_directory/../Image_name.image_format",
   "image_details":{
      "format":"image_format",
      "width":"image_width",
      "height":"image_height"
   },
   "label":[
      "class_name_1",
      "class_name_2",
      "class_name_3",
      "...",
      "class_name_n"
   ]
}

Key	Description	Example
`image_url`	Image location in the Azure Machine Learning datastore `Required, String`	`"AmlDatastore://data_directory/Image_01.jpg"`
`image_details`	Image details `Optional, Dictionary`	`"image_details":{"format": "jpg", "width": "400px", "height": "258px"}`
`format`	Image type (all image formats available in the Pillow library are supported) `Optional, String from {"jpg", "jpeg", "png", "jpe", "jfif", "bmp", "tif", "tiff"}`	`"jpg" or "jpeg" or "png" or "jpe" or "jfif" or "bmp" or "tif" or "tiff"`
`width`	Width of the image `Optional, String or Positive Integer`	`"400px" or 400`
`height`	Height of the image `Optional, String or Positive Integer`	`"200px" or 200`
`label`	List of classes/labels in the image `Required, List of Strings`	`["cat","dog"]`

Example of a JSONL file for image classification multi-label:

{"image_url": "AmlDatastore://image_data/Image_01.jpg", "image_details":{"format": "jpg", "width": "400px", "height": "258px"}, "label": ["can"]}
{"image_url": "AmlDatastore://image_data/Image_02.jpg", "image_details": {"format": "jpg", "width": "397px", "height": "296px"}, "label": ["can","milk_bottle"]}
.
.
.
{"image_url": "AmlDatastore://image_data/Image_n.jpg", "image_details": {"format": "jpg", "width": "1024px", "height": "768px"}, "label": ["carton","milk_bottle","water_bottle"]}

Image example for image classification multi-label.
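
A common mistake in multi-label files is writing `label` as a single string instead of a list. The following sketch checks one JSON Line against the two required keys in the table above. The function name `validate_multilabel_line` and its messages are hypothetical, and the check is deliberately minimal; it isn't the validation that automated ML itself performs.

```python
import json

def validate_multilabel_line(line):
    """Return a list of problems found in one multi-label JSON Line (empty if none)."""
    problems = []
    record = json.loads(line)
    if not isinstance(record.get("image_url"), str):
        problems.append("image_url must be a string")
    labels = record.get("label")
    if not isinstance(labels, list) or not all(isinstance(x, str) for x in labels):
        problems.append("label must be a list of strings")
    return problems

good = '{"image_url": "AmlDatastore://image_data/Image_01.jpg", "label": ["can"]}'
bad = '{"image_url": "AmlDatastore://image_data/Image_02.jpg", "label": "can"}'
print(validate_multilabel_line(good))
print(validate_multilabel_line(bad))
```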

Object detection

The following is the input data format/schema in each JSON Line for object detection.

{
   "image_url":"AmlDatastore://data_directory/../Image_name.image_format",
   "image_details":{
      "format":"image_format",
      "width":"image_width",
      "height":"image_height"
   },
   "label":[
      {
         "label":"class_name_1",
         "topX":"xmin/width",
         "topY":"ymin/height",
         "bottomX":"xmax/width",
         "bottomY":"ymax/height",
         "isCrowd":"isCrowd"
      },
      {
         "label":"class_name_2",
         "topX":"xmin/width",
         "topY":"ymin/height",
         "bottomX":"xmax/width",
         "bottomY":"ymax/height",
         "isCrowd":"isCrowd"
      },
      "..."
   ]
}

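Here,

xmin = x coordinate of the top-left corner of the bounding box
ymin = y coordinate of the top-left corner of the bounding box
xmax = x coordinate of the bottom-right corner of the bounding box
ymax = y coordinate of the bottom-right corner of the bounding box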

Key	Description	Example
`image_url`	Image location in the Azure Machine Learning datastore `Required, String`	`"AmlDatastore://data_directory/Image_01.jpg"`
`image_details`	Image details `Optional, Dictionary`	`"image_details":{"format": "jpg", "width": "400px", "height": "258px"}`
`format`	Image type (all image formats available in the Pillow library are supported; for YOLO, however, only image formats allowed by OpenCV are supported) `Optional, String from {"jpg", "jpeg", "png", "jpe", "jfif", "bmp", "tif", "tiff"}`	`"jpg" or "jpeg" or "png" or "jpe" or "jfif" or "bmp" or "tif" or "tiff"`
`width`	Width of the image `Optional, String or Positive Integer`	`"499px" or 499`
`height`	Height of the image `Optional, String or Positive Integer`	`"665px" or 665`
`label` (outer key)	List of bounding boxes, where each box is a dictionary of `label, topX, topY, bottomX, bottomY, isCrowd` holding its top-left and bottom-right coordinates `Required, List of dictionaries`	`[{"label": "cat", "topX": 0.260, "topY": 0.406, "bottomX": 0.735, "bottomY": 0.701, "isCrowd": 0}]`
`label` (inner key)	Class/label of the object in the bounding box `Required, String`	`"cat"`
`topX`	Ratio of the x coordinate of the top-left corner of the bounding box to the width of the image `Required, Float in the range [0,1]`	`0.260`
`topY`	Ratio of the y coordinate of the top-left corner of the bounding box to the height of the image `Required, Float in the range [0,1]`	`0.406`
`bottomX`	Ratio of the x coordinate of the bottom-right corner of the bounding box to the width of the image `Required, Float in the range [0,1]`	`0.735`
`bottomY`	Ratio of the y coordinate of the bottom-right corner of the bounding box to the height of the image `Required, Float in the range [0,1]`	`0.701`
`isCrowd`	Indicates whether the bounding box is around a crowd of objects. If this special flag is set, this bounding box is skipped when calculating the metric. `Optional, Bool`	`0`

Example of a JSONL file for object detection:

{"image_url": "AmlDatastore://image_data/Image_01.jpg", "image_details": {"format": "jpg", "width": "499px", "height": "666px"}, "label": [{"label": "can", "topX": 0.260, "topY": 0.406, "bottomX": 0.735, "bottomY": 0.701, "isCrowd": 0}]}
{"image_url": "AmlDatastore://image_data/Image_02.jpg", "image_details": {"format": "jpg", "width": "499px", "height": "666px"}, "label": [{"label": "carton", "topX": 0.172, "topY": 0.153, "bottomX": 0.432, "bottomY": 0.659, "isCrowd": 0}, {"label": "milk_bottle", "topX": 0.300, "topY": 0.566, "bottomX": 0.891, "bottomY": 0.735, "isCrowd": 0}]}
.
.
.
{"image_url": "AmlDatastore://image_data/Image_n.jpg", "image_details": {"format": "jpg", "width": "499px", "height": "666px"}, "label": [{"label": "carton", "topX": 0.0180, "topY": 0.297, "bottomX": 0.380, "bottomY": 0.836, "isCrowd": 0}, {"label": "milk_bottle", "topX": 0.454, "topY": 0.348, "bottomX": 0.613, "bottomY": 0.683, "isCrowd": 0}, {"label": "water_bottle", "topX": 0.667, "topY": 0.279, "bottomX": 0.841, "bottomY": 0.615, "isCrowd": 0}]}

Image example for object detection.
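
Annotation tools often store bounding boxes in pixels, while the schema expects ratios (`xmin/width`, `ymin/height`, and so on). The following sketch shows one way to do that conversion and emit a JSON Line. The helper names `normalize_box` and `make_detection_line`, the 400x800 image size, and the pixel values are hypothetical; setting `isCrowd` to 0 for every box is also an assumption.

```python
import json

def normalize_box(xmin, ymin, xmax, ymax, width, height):
    """Convert pixel corner coordinates to ratios in the range [0, 1]."""
    return {
        "topX": xmin / width,
        "topY": ymin / height,
        "bottomX": xmax / width,
        "bottomY": ymax / height,
    }

def make_detection_line(image_url, width, height, boxes):
    """Build one JSON Line; boxes is a list of (label, xmin, ymin, xmax, ymax)."""
    labels = []
    for label, xmin, ymin, xmax, ymax in boxes:
        entry = {"label": label}
        entry.update(normalize_box(xmin, ymin, xmax, ymax, width, height))
        entry["isCrowd"] = 0  # assumption: no crowd annotations in this sample
        labels.append(entry)
    return json.dumps({"image_url": image_url, "label": labels})

# Hypothetical annotation: a 400x800 image with one box given in pixels.
line = make_detection_line(
    "AmlDatastore://image_data/Image_01.jpg", 400, 800, [("can", 100, 200, 300, 400)]
)
print(line)
```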

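Instance segmentation

For instance segmentation, automated ML supports only polygons as input and output, not masks.

The following is the input data format/schema in each JSON Line for instance segmentation.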

{
   "image_url":"AmlDatastore://data_directory/../Image_name.image_format",
   "image_details":{
      "format":"image_format",
      "width":"image_width",
      "height":"image_height"
   },
   "label":[
      {
         "label":"class_name",
         "isCrowd":"isCrowd",
         "polygon":[["x1", "y1", "x2", "y2", "x3", "y3", "...", "xn", "yn"]]
      }
   ]
}

Key	Description	Example
`image_url`	Image location in the Azure Machine Learning datastore `Required, String`	`"AmlDatastore://data_directory/Image_01.jpg"`
`image_details`	Image details `Optional, Dictionary`	`"image_details":{"format": "jpg", "width": "400px", "height": "258px"}`
`format`	Image type `Optional, String from {"jpg", "jpeg", "png", "jpe", "jfif", "bmp", "tif", "tiff" }`	`"jpg" or "jpeg" or "png" or "jpe" or "jfif" or "bmp" or "tif" or "tiff"`
`width`	Width of the image `Optional, String or Positive Integer`	`"499px" or 499`
`height`	Height of the image `Optional, String or Positive Integer`	`"665px" or 665`
`label` (outer key)	List of masks, where each mask is a dictionary of `label, isCrowd, polygon coordinates` `Required, List of dictionaries`	`[{"label": "can", "isCrowd": 0, "polygon": [[0.577, 0.689, 0.562, 0.681, 0.559, 0.686]]}]`
`label` (inner key)	Class/label of the object in the mask `Required, String`	`"cat"`
`isCrowd`	Indicates whether the mask is around a crowd of objects. `Optional, Bool`	`0`
`polygon`	Polygon coordinates for the object `Required, List of list for multiple segments of the same instance. Float values in the range [0,1]`	`[[0.577, 0.689, 0.567, 0.689, 0.559, 0.686]]`

Example of a JSONL file for instance segmentation:

{"image_url": "AmlDatastore://image_data/Image_01.jpg", "image_details": {"format": "jpg", "width": "499px", "height": "666px"}, "label": [{"label": "can", "isCrowd": 0, "polygon": [[0.577, 0.689, 0.567, 0.689, 0.559, 0.686, 0.380, 0.593, 0.304, 0.555, 0.294, 0.545, 0.290, 0.534, 0.274, 0.512, 0.2705, 0.496, 0.270, 0.478, 0.284, 0.453, 0.308, 0.432, 0.326, 0.423, 0.356, 0.415, 0.418, 0.417, 0.635, 0.493, 0.683, 0.507, 0.701, 0.518, 0.709, 0.528, 0.713, 0.545, 0.719, 0.554, 0.719, 0.579, 0.713, 0.597, 0.697, 0.621, 0.695, 0.629, 0.631, 0.678, 0.619, 0.683, 0.595, 0.683, 0.577, 0.689]]}]}
{"image_url": "AmlDatastore://image_data/Image_02.jpg", "image_details": {"format": "jpg", "width": "499px", "height": "666px"}, "label": [{"label": "carton", "isCrowd": 0, "polygon": [[0.240, 0.65, 0.234, 0.654, 0.230, 0.647, 0.210, 0.512, 0.202, 0.403, 0.182, 0.267, 0.184, 0.243, 0.180, 0.166, 0.186, 0.159, 0.198, 0.156, 0.396, 0.162, 0.408, 0.169, 0.406, 0.217, 0.414, 0.249, 0.422, 0.262, 0.422, 0.569, 0.342, 0.569, 0.334, 0.572, 0.320, 0.585, 0.308, 0.624, 0.306, 0.648, 0.240, 0.657]]}, {"label": "milk_bottle",  "isCrowd": 0, "polygon": [[0.675, 0.732, 0.635, 0.731, 0.621, 0.725, 0.573, 0.717, 0.516, 0.717, 0.505, 0.720, 0.462, 0.722, 0.438, 0.719, 0.396, 0.719, 0.358, 0.714, 0.334, 0.714, 0.322, 0.711, 0.312, 0.701, 0.306, 0.687, 0.304, 0.663, 0.308, 0.630, 0.320, 0.596, 0.32, 0.588, 0.326, 0.579]]}]}
.
.
.
{"image_url": "AmlDatastore://image_data/Image_n.jpg", "image_details": {"format": "jpg", "width": "499px", "height": "666px"}, "label": [{"label": "water_bottle", "isCrowd": 0, "polygon": [[0.334, 0.626, 0.304, 0.621, 0.254, 0.603, 0.164, 0.605, 0.158, 0.602, 0.146, 0.602, 0.142, 0.608, 0.094, 0.612, 0.084, 0.599, 0.080, 0.585, 0.080, 0.539, 0.082, 0.536, 0.092, 0.533, 0.126, 0.530, 0.132, 0.533, 0.144, 0.533, 0.162, 0.525, 0.172, 0.525, 0.186, 0.521, 0.196, 0.521 ]]}, {"label": "milk_bottle", "isCrowd": 0, "polygon": [[0.392, 0.773, 0.380, 0.732, 0.379, 0.767, 0.367, 0.755, 0.362, 0.735, 0.362, 0.714, 0.352, 0.644, 0.352, 0.611, 0.362, 0.597, 0.40, 0.593, 0.444,  0.494, 0.588, 0.515, 0.585, 0.621, 0.588, 0.671, 0.582, 0.713, 0.572, 0.753 ]]}]}

Image example for instance segmentation.
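
The `polygon` value is a flat list `[x1, y1, x2, y2, ...]` of floats in the range [0,1], wrapped in an outer list with one entry per segment. The sketch below builds such a value from pixel points. It assumes that x values are divided by the image width and y values by the image height, by analogy with the object detection schema; the helper name `normalize_polygon` and the sample points are hypothetical.

```python
def normalize_polygon(points, width, height):
    """Flatten [(x, y), ...] pixel points into [x1/w, y1/h, x2/w, y2/h, ...]."""
    flat = []
    for x, y in points:
        flat.extend([x / width, y / height])
    return flat

# Hypothetical triangle in a 400x800 image, given in pixel coordinates.
segment = normalize_polygon([(100, 200), (300, 200), (300, 400)], width=400, height=800)

# One mask entry; the outer list holds one flat list per segment of the instance.
label_entry = {"label": "can", "isCrowd": 0, "polygon": [segment]}
print(label_entry)
```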

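Data format for inference

This section documents the input data format required to make predictions with a deployed model. Any of the image formats mentioned earlier is accepted with the content type application/octet-stream.

Input format

The following is the input format needed to generate predictions on any task using a task-specific model endpoint. After you deploy the model, you can use the following code snippet to get predictions for all tasks.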

import requests

# scoring_uri and key come from your deployed endpoint; define them before running
# input image for inference
sample_image = './test_image.jpg'
# load image data
with open(sample_image, 'rb') as f:
    data = f.read()
# set the content type
headers = {'Content-Type': 'application/octet-stream'}
# if authentication is enabled, set the authorization header
headers['Authorization'] = f'Bearer {key}'
# make the request and display the response
response = requests.post(scoring_uri, data, headers=headers)
print(response.text)

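Output format

Predictions made on model endpoints follow a different structure depending on the task type. This section explores the output data formats for multi-class and multi-label image classification, object detection, and instance segmentation tasks.

Image classification

The endpoint for image classification returns all the labels in the dataset and their probability scores for the input image in the following format.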

{
   "filename":"/tmp/tmppjr4et28",
   "probs":[
      2.098e-06,
      4.783e-08,
      0.999,
      8.637e-06
   ],
   "labels":[
      "can",
      "carton",
      "milk_bottle",
      "water_bottle"
   ]
}
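
For a multi-class model, the predicted class is usually taken as the label with the highest probability. A minimal sketch, assuming the response body has already been received as the JSON text shown above (the variable names are hypothetical):

```python
import json

# Sample response body copied from the format above.
response_text = (
    '{"filename": "/tmp/tmppjr4et28", '
    '"probs": [2.098e-06, 4.783e-08, 0.999, 8.637e-06], '
    '"labels": ["can", "carton", "milk_bottle", "water_bottle"]}'
)
result = json.loads(response_text)

# probs and labels are parallel lists, so the index of the largest
# probability also identifies the predicted label.
best_index = max(range(len(result["probs"])), key=result["probs"].__getitem__)
best_label = result["labels"][best_index]
best_prob = result["probs"][best_index]
print(best_label, best_prob)
```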

Image classification multi-label

For image classification multi-label, the model endpoint returns labels and their probabilities.

{
   "filename":"/tmp/tmpsdzxlmlm",
   "probs":[
      0.997,
      0.960,
      0.982,
      0.025
   ],
   "labels":[
      "can",
      "carton",
      "milk_bottle",
      "water_bottle"
   ]
}
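
Because several labels can apply to one image, multi-label output is typically reduced to a label set by applying a probability threshold. The sketch below uses the sample response above; the threshold of 0.5 is an assumption chosen for illustration, not a value prescribed by the service.

```python
# Sample response copied from the format above.
result = {
    "filename": "/tmp/tmpsdzxlmlm",
    "probs": [0.997, 0.960, 0.982, 0.025],
    "labels": ["can", "carton", "milk_bottle", "water_bottle"],
}

threshold = 0.5  # assumption: tune this for your own precision/recall needs

# Keep every label whose probability reaches the threshold.
predicted = [
    label
    for label, prob in zip(result["labels"], result["probs"])
    if prob >= threshold
]
print(predicted)
```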

Object detection

The object detection model returns multiple boxes with their scaled top-left and bottom-right coordinates, along with the box label and confidence score.

{
   "filename":"/tmp/tmpdkg2wkdy",
   "boxes":[
      {
         "box":{
            "topX":0.224,
            "topY":0.285,
            "bottomX":0.399,
            "bottomY":0.620
         },
         "label":"milk_bottle",
         "score":0.937
      },
      {
         "box":{
            "topX":0.664,
            "topY":0.484,
            "bottomX":0.959,
            "bottomY":0.812
         },
         "label":"can",
         "score":0.891
      },
      {
         "box":{
            "topX":0.423,
            "topY":0.253,
            "bottomX":0.632,
            "bottomY":0.725
         },
         "label":"water_bottle",
         "score":0.876
      }
   ]
}
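
To draw the returned boxes on the original image, the scaled coordinates have to be converted back to pixels. The following sketch assumes the output coordinates are ratios of the image width and height, as in the training schema, and assumes a 499x666 image; the helper name `box_to_pixels` is hypothetical.

```python
def box_to_pixels(box, width, height):
    """Scale a normalized box dictionary to integer pixel coordinates."""
    return (
        round(box["topX"] * width),
        round(box["topY"] * height),
        round(box["bottomX"] * width),
        round(box["bottomY"] * height),
    )

# First box from the sample response above; the image size is an assumption.
prediction = {
    "box": {"topX": 0.224, "topY": 0.285, "bottomX": 0.399, "bottomY": 0.620},
    "label": "milk_bottle",
    "score": 0.937,
}
pixels = box_to_pixels(prediction["box"], width=499, height=666)
print(prediction["label"], prediction["score"], pixels)
```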

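Instance segmentation

In instance segmentation, the output consists of multiple boxes with their scaled top-left and bottom-right coordinates, labels, confidence scores, and polygons (not masks). Here, the polygon values are in the same format discussed in the schema section.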

{
   "filename":"/tmp/tmpi8604s0h",
   "boxes":[
      {
         "box":{
            "topX":0.679,
            "topY":0.491,
            "bottomX":0.926,
            "bottomY":0.810
         },
         "label":"can",
         "score":0.992,
         "polygon":[
            [
               0.82, 0.811, 0.771, 0.810, 0.758, 0.805, 0.741, 0.797, 0.735, 0.791, 0.718, 0.785, 0.715, 0.778, 0.706, 0.775, 0.696, 0.758, 0.695, 0.717, 0.698, 0.567, 0.705, 0.552, 0.706, 0.540, 0.725, 0.520, 0.735, 0.505, 0.745, 0.502, 0.755, 0.493
            ]
         ]
      },
      {
         "box":{
            "topX":0.220,
            "topY":0.298,
            "bottomX":0.397,
            "bottomY":0.601
         },
         "label":"milk_bottle",
         "score":0.989,
         "polygon":[
            [
               0.365, 0.602, 0.273, 0.602, 0.26, 0.595, 0.263, 0.588, 0.251, 0.546, 0.248, 0.501, 0.25, 0.485, 0.246, 0.478, 0.245, 0.463, 0.233, 0.442, 0.231, 0.43, 0.226, 0.423, 0.226, 0.408, 0.234, 0.385, 0.241, 0.371, 0.238, 0.345, 0.234, 0.335, 0.233, 0.325, 0.24, 0.305, 0.586, 0.38, 0.592, 0.375, 0.598, 0.365
            ]
         ]
      },
      {
         "box":{
            "topX":0.433,
            "topY":0.280,
            "bottomX":0.621,
            "bottomY":0.679
         },
         "label":"water_bottle",
         "score":0.988,
         "polygon":[
            [
               0.576, 0.680, 0.501, 0.680, 0.475, 0.675, 0.460, 0.625, 0.445, 0.630, 0.443, 0.572, 0.440, 0.560, 0.435, 0.515, 0.431, 0.501, 0.431, 0.433, 0.433, 0.426, 0.445, 0.417, 0.456, 0.407, 0.465, 0.381, 0.468, 0.327, 0.471, 0.318
            ]
         ]
      }
   ]
}
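
Each returned polygon is a flat list of alternating x and y values. The sketch below regroups such a list into (x, y) pixel points, for example to pass to a drawing routine. It assumes the values are ratios of the image width and height; the helper name `polygon_to_points`, the sample values, and the 400x800 image size are hypothetical.

```python
def polygon_to_points(flat, width, height):
    """Turn [x1, y1, x2, y2, ...] ratios into a list of (x, y) pixel tuples."""
    if len(flat) % 2 != 0:
        raise ValueError("polygon must contain an even number of values")
    xs, ys = flat[0::2], flat[1::2]  # even positions are x, odd positions are y
    return [(x * width, y * height) for x, y in zip(xs, ys)]

# Hypothetical two-point fragment of a polygon in a 400x800 image.
points = polygon_to_points([0.5, 0.25, 0.75, 0.5], width=400, height=800)
print(points)
```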

Next steps

Learn how to prepare data for training computer vision models with automated ML.
Set up computer vision tasks in AutoML
Tutorial: Train an object detection model (preview) with AutoML and Python.
