Microsoft.MachineLearningServices workspaces/datasets 2020-05-01-preview

12/09/2024

Bicep resource definition

The workspaces/datasets resource type can be deployed with operations that target:

Resource groups - See resource group deployment commands

For a list of changed properties in each API version, see change log.

Resource format

To create a Microsoft.MachineLearningServices/workspaces/datasets resource, add the following Bicep to your template.

resource symbolicname 'Microsoft.MachineLearningServices/workspaces/datasets@2020-05-01-preview' = {
  parent: resourceSymbolicName
  datasetType: 'string'
  name: 'string'
  parameters: {
    header: 'string'
    includePath: bool
    partitionFormat: 'string'
    path: {
      dataPath: {
        datastoreName: 'string'
        relativePath: 'string'
      }
      httpUrl: 'string'
    }
    query: {
      datastoreName: 'string'
      query: 'string'
    }
    separator: 'string'
    sourceType: 'string'
  }
  registration: {
    description: 'string'
    name: 'string'
    tags: {
      {customized property}: 'string'
    }
  }
  skipValidation: bool
  timeSeries: {
    coarseGrainTimestamp: 'string'
    fineGrainTimestamp: 'string'
  }
}
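
As an illustration of the format above, here is a minimal sketch of a tabular dataset read from delimited files in a datastore. The workspace name, datastore name, relative path, dataset names, and tag are hypothetical placeholders, and the sketch assumes the workspace already exists and can be referenced at the same API version. Only properties and enum values listed on this page are used.

```bicep
// Assumption: a workspace named 'my-workspace' already exists in the target resource group.
resource workspace 'Microsoft.MachineLearningServices/workspaces@2020-05-01-preview' existing = {
  name: 'my-workspace'
}

// Tabular dataset built from CSV files in a datastore (all names are hypothetical).
resource salesDataset 'Microsoft.MachineLearningServices/workspaces/datasets@2020-05-01-preview' = {
  parent: workspace
  name: 'sales-dataset'
  datasetType: 'tabular'
  parameters: {
    sourceType: 'delimited_files'
    separator: ','
    header: 'all_files_have_same_headers'
    path: {
      dataPath: {
        datastoreName: 'workspaceblobstore' // hypothetical datastore name
        relativePath: 'data/sales.csv'      // hypothetical path within the datastore
      }
    }
  }
  registration: {
    name: 'sales-dataset'
    description: 'Sales records loaded from CSV files'
    tags: {
      source: 'example'
    }
  }
  // Keep validation on so that registration fails early if the data cannot be loaded.
  skipValidation: false
}
```

Because the dataset is declared outside the workspace resource, the parent property ties it to the workspace's symbolic name.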

Property values

DatasetCreateRequestParameters

Name	Description	Value
header	Header type.	'all_files_have_same_headers' 'combine_all_files_headers' 'no_headers' 'only_first_file_has_headers'
includePath	Boolean to keep path information as a column in the dataset. Defaults to False. This is useful when you read multiple files and want to know which file a particular record originated from, or when you want to keep useful information from the file path.	bool
partitionFormat	The partition information of each path is extracted into columns based on the specified format. The format part '{column_name}' creates a string column, and '{column_name:yyyy/MM/dd/HH/mm/ss}' creates a datetime column, where 'yyyy', 'MM', 'dd', 'HH', 'mm', and 'ss' extract the year, month, day, hour, minute, and second for the datetime type. The format should start from the position of the first partition key and continue to the end of the file path. For example, given the path '../USA/2019/01/01/data.parquet', where the partition is by country/region and time, partitionFormat='/{CountryOrRegion}/{PartitionDate:yyyy/MM/dd}/data.parquet' creates a string column 'CountryOrRegion' with the value 'USA' and a datetime column 'PartitionDate' with the value '2019-01-01'.	string
path		DatasetCreateRequestParametersPath
query		DatasetCreateRequestParametersQuery
separator	The separator used to split columns for 'delimited_files' sourceType.	string
sourceType	Data source type.	'delimited_files' 'json_lines_files' 'parquet_files'
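
The partitioning properties in the table above might be combined as in the following sketch. It is a fragment meant to be assigned to the parameters property of a dataset resource. The variable name, datastore name, and folder are hypothetical, and the sketch assumes files laid out as <country or region>/<yyyy>/<MM>/<dd>/data.parquet beneath that folder.

```bicep
// Hypothetical parameters object for partitioned Parquet files.
var partitionedParameters = {
  sourceType: 'parquet_files'
  // Keep the originating file path as a column in the dataset.
  includePath: true
  // Creates a string column 'CountryOrRegion' and a datetime column 'PartitionDate'.
  // The format runs from the first partition key to the end of the file path.
  partitionFormat: '/{CountryOrRegion}/{PartitionDate:yyyy/MM/dd}/data.parquet'
  path: {
    dataPath: {
      datastoreName: 'workspaceblobstore' // hypothetical datastore name
      relativePath: 'partitioned'         // hypothetical folder within the datastore
    }
  }
}
```

Plain braces need no escaping in a Bicep string, because interpolation is only triggered by the ${...} form.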

DatasetCreateRequestParametersPath

Name	Description	Value
dataPath		DatasetCreateRequestParametersPathDataPath
httpUrl	The HTTP URL.	string

DatasetCreateRequestParametersPathDataPath

Name	Description	Value
datastoreName	The datastore name.	string
relativePath	Path within the datastore.	string

DatasetCreateRequestParametersQuery

Name	Description	Value
datastoreName	The SQL/PostgreSQL/MySQL datastore name.	string
query	SQL query.	string
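
For a dataset backed by a SQL, PostgreSQL, or MySQL datastore, the query object takes the place of path. The fragment below is a sketch only: the variable name, datastore name, and query text are hypothetical, and it assumes that sourceType can be omitted for query-based datasets, since this page lists only file-based source types and does not mark sourceType as required.

```bicep
// Hypothetical parameters object for a tabular dataset defined by a SQL query.
var sqlParameters = {
  query: {
    datastoreName: 'my_sql_datastore' // hypothetical SQL datastore registered in the workspace
    query: 'SELECT * FROM dbo.Sales'  // hypothetical query text
  }
}
```

A dataset that uses this fragment would set datasetType to 'tabular'.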

DatasetCreateRequestRegistration

Name	Description	Value
description	The description for the dataset.	string
name	The name of the dataset.	string
tags	Tags associated with the dataset.	DatasetCreateRequestRegistrationTags

DatasetCreateRequestRegistrationTags

Name	Description	Value

DatasetCreateRequestTimeSeries

Name	Description	Value
coarseGrainTimestamp	Column name to be used as CoarseGrainTimestamp. Can only be used if 'fineGrainTimestamp' is specified, and cannot be the same as 'fineGrainTimestamp'.	string
fineGrainTimestamp	Column name to be used as FineGrainTimestamp.	string
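
The constraint between the two timestamp properties can be illustrated with a short fragment for the timeSeries property. The variable name and both column names are hypothetical. The coarse-grain column is set only because the fine-grain column is also set, and the two names differ.

```bicep
// Hypothetical time series settings for a tabular dataset.
var timeSeriesSettings = {
  fineGrainTimestamp: 'EventTime'   // hypothetical column name
  coarseGrainTimestamp: 'EventDate' // hypothetical column name; must differ from fineGrainTimestamp
}
```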

Microsoft.MachineLearningServices/workspaces/datasets

Name	Description	Value
datasetType	Specifies dataset type.	'file' 'tabular' (required)
name	The resource name	string (required)
parameters		DatasetCreateRequestParameters (required)
parent	In Bicep, you can specify the parent resource for a child resource. You only need to add this property when the child resource is declared outside of the parent resource. For more information, see Child resource outside parent resource.	Symbolic name for resource of type: workspaces
registration		DatasetCreateRequestRegistration (required)
skipValidation	Skip validation that ensures data can be loaded from the dataset before registration.	bool
timeSeries		DatasetCreateRequestTimeSeries

ARM template resource definition

The workspaces/datasets resource type can be deployed with operations that target:

Resource groups - See resource group deployment commands

For a list of changed properties in each API version, see change log.

Resource format

To create a Microsoft.MachineLearningServices/workspaces/datasets resource, add the following JSON to your template.

{
  "type": "Microsoft.MachineLearningServices/workspaces/datasets",
  "apiVersion": "2020-05-01-preview",
  "name": "string",
  "datasetType": "string",
  "parameters": {
    "header": "string",
    "includePath": "bool",
    "partitionFormat": "string",
    "path": {
      "dataPath": {
        "datastoreName": "string",
        "relativePath": "string"
      },
      "httpUrl": "string"
    },
    "query": {
      "datastoreName": "string",
      "query": "string"
    },
    "separator": "string",
    "sourceType": "string"
  },
  "registration": {
    "description": "string",
    "name": "string",
    "tags": {
      "{customized property}": "string"
    }
  },
  "skipValidation": "bool",
  "timeSeries": {
    "coarseGrainTimestamp": "string",
    "fineGrainTimestamp": "string"
  }
}

Property values

DatasetCreateRequestParameters

Name	Description	Value
header	Header type.	'all_files_have_same_headers' 'combine_all_files_headers' 'no_headers' 'only_first_file_has_headers'
includePath	Boolean to keep path information as a column in the dataset. Defaults to False. This is useful when you read multiple files and want to know which file a particular record originated from, or when you want to keep useful information from the file path.	bool
partitionFormat	The partition information of each path is extracted into columns based on the specified format. The format part '{column_name}' creates a string column, and '{column_name:yyyy/MM/dd/HH/mm/ss}' creates a datetime column, where 'yyyy', 'MM', 'dd', 'HH', 'mm', and 'ss' extract the year, month, day, hour, minute, and second for the datetime type. The format should start from the position of the first partition key and continue to the end of the file path. For example, given the path '../USA/2019/01/01/data.parquet', where the partition is by country/region and time, partitionFormat='/{CountryOrRegion}/{PartitionDate:yyyy/MM/dd}/data.parquet' creates a string column 'CountryOrRegion' with the value 'USA' and a datetime column 'PartitionDate' with the value '2019-01-01'.	string
path		DatasetCreateRequestParametersPath
query		DatasetCreateRequestParametersQuery
separator	The separator used to split columns for 'delimited_files' sourceType.	string
sourceType	Data source type.	'delimited_files' 'json_lines_files' 'parquet_files'

DatasetCreateRequestParametersPath

Name	Description	Value
dataPath		DatasetCreateRequestParametersPathDataPath
httpUrl	The HTTP URL.	string

DatasetCreateRequestParametersPathDataPath

Name	Description	Value
datastoreName	The datastore name.	string
relativePath	Path within the datastore.	string

DatasetCreateRequestParametersQuery

Name	Description	Value
datastoreName	The SQL/PostgreSQL/MySQL datastore name.	string
query	SQL query.	string

DatasetCreateRequestRegistration

Name	Description	Value
description	The description for the dataset.	string
name	The name of the dataset.	string
tags	Tags associated with the dataset.	DatasetCreateRequestRegistrationTags

DatasetCreateRequestRegistrationTags

Name	Description	Value

DatasetCreateRequestTimeSeries

Name	Description	Value
coarseGrainTimestamp	Column name to be used as CoarseGrainTimestamp. Can only be used if 'fineGrainTimestamp' is specified, and cannot be the same as 'fineGrainTimestamp'.	string
fineGrainTimestamp	Column name to be used as FineGrainTimestamp.	string

Microsoft.MachineLearningServices/workspaces/datasets

Name	Description	Value
apiVersion	The api version	'2020-05-01-preview'
datasetType	Specifies dataset type.	'file' 'tabular' (required)
name	The resource name	string (required)
parameters		DatasetCreateRequestParameters (required)
registration		DatasetCreateRequestRegistration (required)
skipValidation	Skip validation that ensures data can be loaded from the dataset before registration.	bool
timeSeries		DatasetCreateRequestTimeSeries
type	The resource type	'Microsoft.MachineLearningServices/workspaces/datasets'

Quickstart templates

The following quickstart templates deploy this resource type.

Template	Description
Create AML workspace with multiple Datasets & Datastores	This template creates an Azure Machine Learning workspace with multiple datasets and datastores.
Create File Dataset from Relative Path in Datastore	This template creates a file dataset from a relative path in a datastore in an Azure Machine Learning workspace.
Create File Dataset in AML workspace from Web URL	This template creates a file dataset from a web URL in an Azure Machine Learning workspace.
Create Tabular Dataset from Relative Path in Datastore	This template creates a tabular dataset from a relative path in a datastore in an Azure Machine Learning workspace.
Create Tabular Dataset from SQL/PostgreSQL/MySQL Datastore	This template creates a tabular dataset from a SQL query against a SQL/PostgreSQL/MySQL datastore in an Azure Machine Learning workspace.
Create Tabular Dataset in AML workspace from Web URL	This template creates a tabular dataset from a web URL in an Azure Machine Learning workspace.

Terraform (AzAPI provider) resource definition

The workspaces/datasets resource type can be deployed with operations that target:

Resource groups

For a list of changed properties in each API version, see change log.

Resource format

To create a Microsoft.MachineLearningServices/workspaces/datasets resource, add the following Terraform to your template.

resource "azapi_resource" "symbolicname" {
  type = "Microsoft.MachineLearningServices/workspaces/datasets@2020-05-01-preview"
  name = "string"
  parent_id = "string"
  datasetType = "string"
  parameters = {
    header = "string"
    includePath = bool
    partitionFormat = "string"
    path = {
      dataPath = {
        datastoreName = "string"
        relativePath = "string"
      }
      httpUrl = "string"
    }
    query = {
      datastoreName = "string"
      query = "string"
    }
    separator = "string"
    sourceType = "string"
  }
  registration = {
    description = "string"
    name = "string"
    tags = {
      {customized property} = "string"
    }
  }
  skipValidation = bool
  timeSeries = {
    coarseGrainTimestamp = "string"
    fineGrainTimestamp = "string"
  }
}

Property values

DatasetCreateRequestParameters

Name	Description	Value
header	Header type.	'all_files_have_same_headers' 'combine_all_files_headers' 'no_headers' 'only_first_file_has_headers'
includePath	Boolean to keep path information as a column in the dataset. Defaults to False. This is useful when you read multiple files and want to know which file a particular record originated from, or when you want to keep useful information from the file path.	bool
partitionFormat	The partition information of each path is extracted into columns based on the specified format. The format part '{column_name}' creates a string column, and '{column_name:yyyy/MM/dd/HH/mm/ss}' creates a datetime column, where 'yyyy', 'MM', 'dd', 'HH', 'mm', and 'ss' extract the year, month, day, hour, minute, and second for the datetime type. The format should start from the position of the first partition key and continue to the end of the file path. For example, given the path '../USA/2019/01/01/data.parquet', where the partition is by country/region and time, partitionFormat='/{CountryOrRegion}/{PartitionDate:yyyy/MM/dd}/data.parquet' creates a string column 'CountryOrRegion' with the value 'USA' and a datetime column 'PartitionDate' with the value '2019-01-01'.	string
path		DatasetCreateRequestParametersPath
query		DatasetCreateRequestParametersQuery
separator	The separator used to split columns for 'delimited_files' sourceType.	string
sourceType	Data source type.	'delimited_files' 'json_lines_files' 'parquet_files'

DatasetCreateRequestParametersPath

Name	Description	Value
dataPath		DatasetCreateRequestParametersPathDataPath
httpUrl	The HTTP URL.	string

DatasetCreateRequestParametersPathDataPath

Name	Description	Value
datastoreName	The datastore name.	string
relativePath	Path within the datastore.	string

DatasetCreateRequestParametersQuery

Name	Description	Value
datastoreName	The SQL/PostgreSQL/MySQL datastore name.	string
query	SQL query.	string

DatasetCreateRequestRegistration

Name	Description	Value
description	The description for the dataset.	string
name	The name of the dataset.	string
tags	Tags associated with the dataset.	DatasetCreateRequestRegistrationTags

DatasetCreateRequestRegistrationTags

Name	Description	Value

DatasetCreateRequestTimeSeries

Name	Description	Value
coarseGrainTimestamp	Column name to be used as CoarseGrainTimestamp. Can only be used if 'fineGrainTimestamp' is specified, and cannot be the same as 'fineGrainTimestamp'.	string
fineGrainTimestamp	Column name to be used as FineGrainTimestamp.	string

Microsoft.MachineLearningServices/workspaces/datasets

Name	Description	Value
datasetType	Specifies dataset type.	'file' 'tabular' (required)
name	The resource name	string (required)
parameters		DatasetCreateRequestParameters (required)
parent_id	The ID of the resource that is the parent for this resource.	ID for resource of type: workspaces
registration		DatasetCreateRequestRegistration (required)
skipValidation	Skip validation that ensures data can be loaded from the dataset before registration.	bool
timeSeries		DatasetCreateRequestTimeSeries
type	The resource type	"Microsoft.MachineLearningServices/workspaces/datasets@2020-05-01-preview"
