Microsoft.HDInsight clusterpools/clusters 2023-11-01-preview

Bicep resource definition

The clusterpools/clusters resource type can be deployed with operations that target:

  • Resource groups

For a list of changed properties in each API version, see change log.

Resource format

To create a Microsoft.HDInsight/clusterpools/clusters resource, add the following Bicep to your template.

resource symbolicname 'Microsoft.HDInsight/clusterpools/clusters@2023-11-01-preview' = {
  parent: resourceSymbolicName
  location: 'string'
  name: 'string'
  properties: {
    clusterProfile: {
      authorizationProfile: {
        groupIds: [
          'string'
        ]
        userIds: [
          'string'
        ]
      }
      autoscaleProfile: {
        autoscaleType: 'string'
        enabled: bool
        gracefulDecommissionTimeout: int
        loadBasedConfig: {
          cooldownPeriod: int
          maxNodes: int
          minNodes: int
          pollInterval: int
          scalingRules: [
            {
              actionType: 'string'
              comparisonRule: {
                operator: 'string'
                threshold: int
              }
              evaluationCount: int
              scalingMetric: 'string'
            }
          ]
        }
        scheduleBasedConfig: {
          defaultCount: int
          schedules: [
            {
              count: int
              days: [
                'string'
              ]
              endTime: 'string'
              startTime: 'string'
            }
          ]
          timeZone: 'string'
        }
      }
      clusterAccessProfile: {
        enableInternalIngress: bool
      }
      clusterVersion: 'string'
      flinkProfile: {
        catalogOptions: {
          hive: {
            metastoreDbConnectionAuthenticationMode: 'string'
            metastoreDbConnectionPasswordSecret: 'string'
            metastoreDbConnectionURL: 'string'
            metastoreDbConnectionUserName: 'string'
          }
        }
        deploymentMode: 'string'
        historyServer: {
          cpu: int
          memory: int
        }
        jobManager: {
          cpu: int
          memory: int
        }
        jobSpec: {
          args: 'string'
          entryClass: 'string'
          jarName: 'string'
          jobJarDirectory: 'string'
          savePointName: 'string'
          upgradeMode: 'string'
        }
        numReplicas: int
        storage: {
          storagekey: 'string'
          storageUri: 'string'
        }
        taskManager: {
          cpu: int
          memory: int
        }
      }
      identityProfile: {
        msiClientId: 'string'
        msiObjectId: 'string'
        msiResourceId: 'string'
      }
      kafkaProfile: {
        diskStorage: {
          dataDiskSize: int
          dataDiskType: 'string'
        }
        enableKRaft: bool
        enablePublicEndpoints: bool
        remoteStorageUri: 'string'
      }
      llapProfile: {
        {customized property}: any(Azure.Bicep.Types.Concrete.AnyType)
      }
      logAnalyticsProfile: {
        applicationLogs: {
          stdErrorEnabled: bool
          stdOutEnabled: bool
        }
        enabled: bool
        metricsEnabled: bool
      }
      ossVersion: 'string'
      prometheusProfile: {
        enabled: bool
      }
      rangerPluginProfile: {
        enabled: bool
      }
      rangerProfile: {
        rangerAdmin: {
          admins: [
            'string'
          ]
          database: {
            host: 'string'
            name: 'string'
            passwordSecretRef: 'string'
            username: 'string'
          }
        }
        rangerAudit: {
          storageAccount: 'string'
        }
        rangerUsersync: {
          enabled: bool
          groups: [
            'string'
          ]
          mode: 'string'
          userMappingLocation: 'string'
          users: [
            'string'
          ]
        }
      }
      scriptActionProfiles: [
        {
          name: 'string'
          parameters: 'string'
          services: [
            'string'
          ]
          shouldPersist: bool
          timeoutInMinutes: int
          type: 'string'
          url: 'string'
        }
      ]
      secretsProfile: {
        keyVaultResourceId: 'string'
        secrets: [
          {
            keyVaultObjectName: 'string'
            referenceName: 'string'
            type: 'string'
            version: 'string'
          }
        ]
      }
      serviceConfigsProfiles: [
        {
          configs: [
            {
              component: 'string'
              files: [
                {
                  content: 'string'
                  encoding: 'string'
                  fileName: 'string'
                  path: 'string'
                  values: {
                    {customized property}: 'string'
                  }
                }
              ]
            }
          ]
          serviceName: 'string'
        }
      ]
      sparkProfile: {
        defaultStorageUrl: 'string'
        metastoreSpec: {
          dbConnectionAuthenticationMode: 'string'
          dbName: 'string'
          dbPasswordSecretName: 'string'
          dbServerHost: 'string'
          dbUserName: 'string'
          keyVaultId: 'string'
          thriftUrl: 'string'
        }
        userPluginsSpec: {
          plugins: [
            {
              path: 'string'
            }
          ]
        }
      }
      sshProfile: {
        count: int
      }
      stubProfile: {
        {customized property}: any(Azure.Bicep.Types.Concrete.AnyType)
      }
      trinoProfile: {
        catalogOptions: {
          hive: [
            {
              catalogName: 'string'
              metastoreDbConnectionAuthenticationMode: 'string'
              metastoreDbConnectionPasswordSecret: 'string'
              metastoreDbConnectionURL: 'string'
              metastoreDbConnectionUserName: 'string'
              metastoreWarehouseDir: 'string'
            }
          ]
        }
        coordinator: {
          debug: {
            enable: bool
            port: int
            suspend: bool
          }
          highAvailabilityEnabled: bool
        }
        userPluginsSpec: {
          plugins: [
            {
              enabled: bool
              name: 'string'
              path: 'string'
            }
          ]
        }
        userTelemetrySpec: {
          storage: {
            hivecatalogName: 'string'
            hivecatalogSchema: 'string'
            partitionRetentionInDays: int
            path: 'string'
          }
        }
        worker: {
          debug: {
            enable: bool
            port: int
            suspend: bool
          }
        }
      }
    }
    clusterType: 'string'
    computeProfile: {
      nodes: [
        {
          count: int
          type: 'string'
          vmSize: 'string'
        }
      ]
    }
  }
  tags: {
    {customized property}: 'string'
  }
}
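
For orientation, the following is a minimal sketch of a Trino cluster deployment under this schema. It assumes an existing cluster pool and uses placeholder values for the pool name, versions, AAD object ID, MSI identifiers, and VM size; substitute values appropriate to your environment.

resource clusterPool 'Microsoft.HDInsight/clusterpools@2023-11-01-preview' existing = {
  name: 'my-cluster-pool' // assumed existing cluster pool
}

resource trinoCluster 'Microsoft.HDInsight/clusterpools/clusters@2023-11-01-preview' = {
  parent: clusterPool
  name: 'my-trino-cluster'
  location: resourceGroup().location
  properties: {
    clusterType: 'Trino'
    clusterProfile: {
      clusterVersion: '1.1.1' // placeholder version
      ossVersion: '0.426.0' // placeholder three-part OSS version
      authorizationProfile: {
        userIds: [
          '00000000-0000-0000-0000-000000000000' // placeholder AAD user object ID
        ]
      }
      identityProfile: { // required for Trino, Spark, and Flink clusters
        msiResourceId: '<msi-resource-id>' // placeholder MSI resource ID
        msiClientId: '00000000-0000-0000-0000-000000000000' // placeholder GUID
        msiObjectId: '00000000-0000-0000-0000-000000000000' // placeholder GUID
      }
      trinoProfile: {}
    }
    computeProfile: {
      nodes: [
        {
          type: 'Head'
          count: 2
          vmSize: 'Standard_D8ds_v5' // placeholder SKU
        }
        {
          type: 'Worker'
          count: 3
          vmSize: 'Standard_D8ds_v5' // placeholder SKU
        }
      ]
    }
  }
}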

Property values

AuthorizationProfile

Name Description Value
groupIds AAD group Ids authorized for data plane access. string[]
userIds AAD user Ids authorized for data plane access. string[]

AutoscaleProfile

Name Description Value
autoscaleType Specifies which type of autoscale to implement: schedule-based or load-based. 'LoadBased'
'ScheduleBased'
enabled This indicates whether autoscale is enabled on the HDInsight on AKS cluster. bool (required)
gracefulDecommissionTimeout Graceful decommission timeout in seconds; the maximum time to wait for running containers and applications to complete before a DECOMMISSIONING node transitions to DECOMMISSIONED and forced shutdown takes place. The default value is 3600 seconds. A negative value (for example, -1) is treated as an infinite timeout. int
loadBasedConfig Profiles of load based Autoscale. LoadBasedConfig
scheduleBasedConfig Profiles of schedule based Autoscale. ScheduleBasedConfig

CatalogOptions

Name Description Value
hive hive catalog options. HiveCatalogOption[]

ClusterAccessProfile

Name Description Value
enableInternalIngress Whether to create cluster using private IP instead of public IP. This property must be set at create time. bool (required)

ClusterConfigFile

Name Description Value
content Free form content of the entire configuration file. string
encoding Indicates whether the content is encoded; the value is case-insensitive. Set it to Base64 if the content is base64 encoded, or set it to None (or omit it) if the content is plain text. 'Base64'
'None'
fileName Configuration file name. string (required)
path Path of the config file if content is specified. string
values List of key-value pairs, where each key is a valid service configuration name and each value is the value of the config. ClusterConfigFileValues

ClusterConfigFileValues

Name Description Value

ClusterLogAnalyticsApplicationLogs

Name Description Value
stdErrorEnabled True if stderr is enabled, otherwise false. bool
stdOutEnabled True if stdout is enabled, otherwise false. bool

ClusterLogAnalyticsProfile

Name Description Value
applicationLogs Collection of logs to be enabled or disabled for log analytics. ClusterLogAnalyticsApplicationLogs
enabled True if log analytics is enabled for the cluster, otherwise false. bool (required)
metricsEnabled True if metrics are enabled, otherwise false. bool

ClusterProfile

Name Description Value
authorizationProfile Authorization profile with details of AAD user Ids and group Ids authorized for data plane access. AuthorizationProfile (required)
autoscaleProfile The autoscale profile for the cluster, which allows the customer to create a cluster with autoscale enabled. AutoscaleProfile
clusterAccessProfile Cluster access profile. ClusterAccessProfile
clusterVersion Version with three or four parts. string

Constraints:
Pattern = ^(0|[1-9][0-9]{0,18})\.(0|[1-9][0-9]{0,18})\.(0|[1-9][0-9]{0,18})(?:\.(0|[1-9][0-9]{0,18}))?$ (required)
flinkProfile The Flink cluster profile. FlinkProfile
identityProfile This property is required for Trino, Spark, and Flink clusters but is optional for Kafka clusters. IdentityProfile
kafkaProfile The Kafka cluster profile. KafkaProfile
llapProfile LLAP cluster profile. ClusterProfileLlapProfile
logAnalyticsProfile Cluster log analytics profile to enable or disable the OMS agent for the cluster. ClusterLogAnalyticsProfile
ossVersion Version with three parts. string

Constraints:
Pattern = ^(0|[1-9][0-9]{0,18})\.(0|[1-9][0-9]{0,18})\.(0|[1-9][0-9]{0,18})$ (required)
prometheusProfile Cluster Prometheus profile. ClusterPrometheusProfile
rangerPluginProfile Cluster Ranger plugin profile. ClusterRangerPluginProfile
rangerProfile The ranger cluster profile. RangerProfile
scriptActionProfiles The script action profile list. ScriptActionProfile[]
secretsProfile The cluster secret profile. SecretsProfile
serviceConfigsProfiles The service configs profiles. ClusterServiceConfigsProfile[]
sparkProfile The spark cluster profile. SparkProfile
sshProfile Ssh profile for the cluster. SshProfile
stubProfile Stub cluster profile. ClusterProfileStubProfile
trinoProfile Trino Cluster profile. TrinoProfile

ClusterProfileLlapProfile

Name Description Value

ClusterProfileStubProfile

Name Description Value

ClusterPrometheusProfile

Name Description Value
enabled Enable Prometheus for cluster or not. bool (required)

ClusterRangerPluginProfile

Name Description Value
enabled Enable Ranger for cluster or not. bool (required)

ClusterResourceProperties

Name Description Value
clusterProfile Cluster profile. ClusterProfile (required)
clusterType The type of cluster. string

Constraints:
Pattern = ^[a-zA-Z][a-zA-Z0-9]{0,31}$ (required)
computeProfile The compute profile. ComputeProfile (required)

ClusterServiceConfig

Name Description Value
component Name of the component the config files should apply to. string (required)
files List of Config Files. ClusterConfigFile[] (required)

ClusterServiceConfigsProfile

Name Description Value
configs List of service configs. ClusterServiceConfig[] (required)
serviceName Name of the service the configurations should apply to. string (required)
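
As an illustration of how ClusterServiceConfigsProfile, ClusterServiceConfig, and ClusterConfigFile compose, the sketch below overrides one configuration value; the service, component, file, and key names are assumptions, not values defined by this schema.

serviceConfigsProfiles: [
  {
    serviceName: 'trino' // assumed service name
    configs: [
      {
        component: 'coordinator' // assumed component name
        files: [
          {
            fileName: 'config.properties' // assumed config file
            values: {
              'query.max-memory': '10GB' // assumed key/value pair
            }
          }
        ]
      }
    ]
  }
]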

ComparisonRule

Name Description Value
operator The comparison operator. 'greaterThan'
'greaterThanOrEqual'
'lessThan'
'lessThanOrEqual' (required)
threshold Threshold setting. int (required)

ComputeProfile

Name Description Value
nodes The nodes definitions. NodeProfile[] (required)

ComputeResourceDefinition

Name Description Value
cpu The required CPU. int (required)
memory The required memory in MB; container memory is set to 110 percent of this value. int (required)

DiskStorageProfile

Name Description Value
dataDiskSize Managed disk size in GB. The maximum supported disk size for Standard and Premium HDD/SSD is 32 TB, except for Premium SSD v2, which supports up to 64 TB. int (required)
dataDiskType Managed Disk Type. 'Premium_SSD_LRS'
'Premium_SSD_v2_LRS'
'Premium_SSD_ZRS'
'Standard_HDD_LRS'
'Standard_SSD_LRS'
'Standard_SSD_ZRS' (required)

FlinkCatalogOptions

Name Description Value
hive Hive Catalog Option for Flink cluster. FlinkHiveCatalogOption

FlinkHiveCatalogOption

Name Description Value
metastoreDbConnectionAuthenticationMode The authentication mode to connect to your Hive metastore database. More details: /azure/azure-sql/database/logins-create-manage?view=azuresql#authentication-and-authorization 'IdentityAuth'
'SqlAuth'
metastoreDbConnectionPasswordSecret Secret reference name from secretsProfile.secrets containing password for database connection. string
metastoreDbConnectionURL Connection string for hive metastore database. string (required)
metastoreDbConnectionUserName User name for database connection. string

FlinkJobProfile

Name Description Value
args A string property representing additional JVM arguments for the Flink job, as a space-separated list. string
entryClass A string property that specifies the entry class for the Flink job. If not specified, the entry point is auto-detected from the flink job jar package. string
jarName A string property that represents the name of the job JAR. string (required)
jobJarDirectory A string property that specifies the directory where the job JAR is located. string (required)
savePointName A string property that represents the name of the savepoint for the Flink job string
upgradeMode A string property that indicates the upgrade mode to be performed on the Flink job. It can have one of the following enum values => STATELESS_UPDATE, UPDATE, LAST_STATE_UPDATE. 'LAST_STATE_UPDATE'
'STATELESS_UPDATE'
'UPDATE' (required)

FlinkProfile

Name Description Value
catalogOptions Flink cluster catalog options. FlinkCatalogOptions
deploymentMode A string property that indicates the deployment mode of the Flink cluster. It can have one of the following enum values: Application, Session. The default value is Session. 'Application'
'Session'
historyServer History Server container/process CPU and memory requirements. ComputeResourceDefinition
jobManager Job Manager container/process CPU and memory requirements. ComputeResourceDefinition (required)
jobSpec Job specifications for Flink clusters in application deployment mode. The specification is immutable even if job properties are changed by calling the RunJob API; use the ListJob API to get the latest job information. FlinkJobProfile
numReplicas The number of task managers. int
storage The storage profile. FlinkStorageProfile (required)
taskManager Task Manager container/process CPU and memory requirements. ComputeResourceDefinition (required)
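
Drawing the FlinkProfile pieces together, a session-mode Flink profile might look like the following sketch; the CPU/memory sizes and the storage URI are placeholder assumptions.

flinkProfile: {
  deploymentMode: 'Session'
  jobManager: {
    cpu: 1
    memory: 2048 // MB
  }
  taskManager: {
    cpu: 2
    memory: 4096 // MB
  }
  numReplicas: 2 // number of task managers
  storage: {
    // placeholder URI; wasb(s) storage would also need storagekey
    storageUri: 'abfs://container@account.dfs.core.windows.net'
  }
}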

FlinkStorageProfile

Name Description Value
storagekey Storage key is only required for wasb(s) storage. string

Constraints:
Sensitive value. Pass in as a secure parameter.
storageUri Storage account uri which is used for savepoint and checkpoint state. string

Constraints:
Pattern = ^(\w{4,5})://(.*)@(.*).\b(blob|dfs)\b.*$ (required)

HiveCatalogOption

Name Description Value
catalogName Name of the Trino catalog that should use the specified Hive metastore. string

Constraints:
Min length = 1 (required)
metastoreDbConnectionAuthenticationMode The authentication mode to connect to your Hive metastore database. More details: /azure/azure-sql/database/logins-create-manage?view=azuresql#authentication-and-authorization 'IdentityAuth'
'SqlAuth'
metastoreDbConnectionPasswordSecret Secret reference name from secretsProfile.secrets containing password for database connection. string
metastoreDbConnectionURL Connection string for hive metastore database. string (required)
metastoreDbConnectionUserName User name for database connection. string
metastoreWarehouseDir Metastore root directory URI, format: abfs[s]://<container>@<account_name>.dfs.core.windows.net/<path>. More details: /azure/storage/blobs/data-lake-storage-introduction-abfs-uri string (required)

IdentityProfile

Name Description Value
msiClientId ClientId of the MSI. string

Constraints:
Pattern = ^[{(]?[0-9A-Fa-f]{8}[-]?(?:[0-9A-Fa-f]{4}[-]?){3}[0-9A-Fa-f]{12}[)}]?$ (required)
msiObjectId ObjectId of the MSI. string

Constraints:
Pattern = ^[{(]?[0-9A-Fa-f]{8}[-]?(?:[0-9A-Fa-f]{4}[-]?){3}[0-9A-Fa-f]{12}[)}]?$ (required)
msiResourceId ResourceId of the MSI. string (required)

KafkaProfile

Name Description Value
diskStorage Kafka disk storage profile. DiskStorageProfile (required)
enableKRaft Expose Kafka cluster in KRaft mode. bool
enablePublicEndpoints Expose worker nodes as public endpoints. bool
remoteStorageUri Fully qualified path of Azure Storage container used for Tiered Storage. string

Constraints:
Pattern = ^(https?|abfss?):\/\/[^/]+(?:\/|$)
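
For example, a Kafka profile with KRaft enabled and private worker endpoints could be sketched as follows; the disk size is an illustrative value.

kafkaProfile: {
  diskStorage: {
    dataDiskSize: 1024 // GB, illustrative
    dataDiskType: 'Premium_SSD_LRS'
  }
  enableKRaft: true
  enablePublicEndpoints: false
}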

LoadBasedConfig

Name Description Value
cooldownPeriod The cooldown period, in seconds, that must elapse between a scaling activity started by a rule and the start of the next scaling activity, regardless of the rule that triggers it. The default value is 300 seconds. int
maxNodes The maximum number of nodes for load-based scaling; scaling operates between the minimum and maximum node counts. int (required)
minNodes The minimum number of nodes for load-based scaling; scaling operates between the minimum and maximum node counts. int (required)
pollInterval The poll interval, in seconds, after which scaling metrics are polled to trigger a scaling operation. int
scalingRules The scaling rules. ScalingRule[] (required)
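
Combining LoadBasedConfig with the ScalingRule and ComparisonRule types described elsewhere in this article, a load-based autoscale profile might be sketched as follows; the metric name, thresholds, and node counts are illustrative assumptions.

autoscaleProfile: {
  enabled: true
  autoscaleType: 'LoadBased'
  loadBasedConfig: {
    minNodes: 3
    maxNodes: 10
    cooldownPeriod: 300 // seconds between scaling activities
    pollInterval: 60 // seconds between metric polls
    scalingRules: [
      {
        actionType: 'scaleup'
        scalingMetric: 'cpu' // illustrative metric name
        evaluationCount: 3 // condition must hold for 3 evaluations
        comparisonRule: {
          operator: 'greaterThan'
          threshold: 80
        }
      }
      {
        actionType: 'scaledown'
        scalingMetric: 'cpu'
        evaluationCount: 3
        comparisonRule: {
          operator: 'lessThan'
          threshold: 20
        }
      }
    ]
  }
}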

Microsoft.HDInsight/clusterpools/clusters

Name Description Value
location The geo-location where the resource lives string (required)
name The resource name string (required)
parent In Bicep, you can specify the parent resource for a child resource. You only need to add this property when the child resource is declared outside of the parent resource.

For more information, see Child resource outside parent resource.
Symbolic name for resource of type: clusterpools
properties Gets or sets the properties that define cluster-specific settings. ClusterResourceProperties
tags Resource tags. Dictionary of tag names and values. See Tags in templates

NodeProfile

Name Description Value
count The number of virtual machines. int

Constraints:
Min value = 1 (required)
type The node type. string

Constraints:
Pattern = ^(head|Head|HEAD|worker|Worker|WORKER)$ (required)
vmSize The virtual machine SKU. string

Constraints:
Pattern = ^[a-zA-Z0-9_\-]{0,256}$ (required)

RangerAdminSpec

Name Description Value
admins List of usernames that should be marked as ranger admins. These usernames should match the user principal name (UPN) of the respective AAD users. string[] (required)
database RangerAdminSpecDatabase (required)

RangerAdminSpecDatabase

Name Description Value
host The database URL string (required)
name The database name string (required)
passwordSecretRef Reference for the database password string
username The name of the database user string

RangerAuditSpec

Name Description Value
storageAccount Azure storage location of the blobs. MSI should have read/write access to this Storage account. string

Constraints:
Min length = 1
Pattern = ^(https)|(abfss)://.*$

RangerProfile

Name Description Value
rangerAdmin Specification for the Ranger Admin service. RangerAdminSpec (required)
rangerAudit Properties required to describe audit log storage. RangerAuditSpec
rangerUsersync Specification for the Ranger Usersync service RangerUsersyncSpec (required)

RangerUsersyncSpec

Name Description Value
enabled Denotes whether usersync service should be enabled bool
groups List of groups that should be synced. These group names should match the object id of the respective AAD groups. string[]
mode User & groups can be synced automatically or via a static list that's refreshed. 'automatic'
'static'
userMappingLocation Azure storage location of a mapping file that lists user & group associations. string

Constraints:
Min length = 1
Pattern = ^(https)|(abfss)://.*$
users List of user names that should be synced. These usernames should match the User principal name of the respective AAD users. string[]
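
Taken together, RangerProfile, RangerAdminSpec, and RangerUsersyncSpec might be combined as in the sketch below; the UPN, database host and names, and secret reference are placeholder assumptions.

rangerProfile: {
  rangerAdmin: {
    admins: [
      'admin@contoso.com' // placeholder UPN of an AAD user
    ]
    database: {
      host: '<server>.database.windows.net' // placeholder database URL
      name: 'rangerdb' // placeholder database name
      username: 'rangeradmin' // placeholder database user
      passwordSecretRef: 'ranger-db-password' // placeholder password reference
    }
  }
  rangerUsersync: {
    enabled: true
    mode: 'automatic'
  }
}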

ScalingRule

Name Description Value
actionType The action type. 'scaledown'
'scaleup' (required)
comparisonRule The comparison rule. ComparisonRule (required)
evaluationCount The evaluation count for a scaling condition; the number of times a trigger condition must be met before a scaling activity is triggered. int (required)
scalingMetric Metrics name for individual workloads. For example: cpu string (required)

Schedule

Name Description Value
count The node count anticipated at the end of the scaling operation for the current schedule configuration, as an integer. int (required)
days The days on which the schedule applies for the autoscale operation. String array containing any of:
'Friday'
'Monday'
'Saturday'
'Sunday'
'Thursday'
'Tuesday'
'Wednesday' (required)
endTime The end time of the current schedule configuration, in HH:MM format (for example, 10:30). string

Constraints:
Pattern = ^([0-1]?[0-9]|2[0-3]):[0-5][0-9]$ (required)
startTime The start time of the current schedule configuration, in HH:MM format (for example, 10:30). string

Constraints:
Pattern = ^([0-1]?[0-9]|2[0-3]):[0-5][0-9]$ (required)

ScheduleBasedConfig

Name Description Value
defaultCount The default node count of the current schedule configuration; the number of nodes that apply when a specified scaling operation (scale up or scale down) is executed. int (required)
schedules The schedules for which schedule-based autoscale is enabled; multiple rules can be set within the schedule across days and times (start/end). Schedule[] (required)
timeZone The time zone in which the schedule is set for the schedule-based autoscale configuration. string (required)
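
A schedule-based counterpart to the load-based example above might be sketched like this; the time zone, days, times, and node counts are illustrative.

autoscaleProfile: {
  enabled: true
  autoscaleType: 'ScheduleBased'
  gracefulDecommissionTimeout: 3600
  scheduleBasedConfig: {
    timeZone: 'UTC'
    defaultCount: 3 // node count outside the scheduled windows
    schedules: [
      {
        days: [
          'Monday'
          'Tuesday'
          'Wednesday'
          'Thursday'
          'Friday'
        ]
        startTime: '09:00'
        endTime: '18:00'
        count: 10 // node count during the window
      }
    ]
  }
}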

ScriptActionProfile

Name Description Value
name Script name. string (required)
parameters Additional parameters for the script action, as a space-separated list of arguments required for script execution. string
services List of services to apply the script action to. string[] (required)
shouldPersist Specifies whether the script should persist on the cluster. bool
timeoutInMinutes Timeout duration for the script action in minutes. int
type Type of the script action. The supported type is bash scripts. string (required)
url Url of the script file. string

Constraints:
Pattern = ^(https)|(http)://.*$ (required)
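
As a sketch, a bash script action applied to an assumed service might look like the following; the script name, URL, and service list are placeholders.

scriptActionProfiles: [
  {
    name: 'install-dependencies' // placeholder script name
    type: 'bash' // the supported type is bash scripts
    url: 'https://example.com/scripts/install.sh' // placeholder script URL
    services: [
      'trino' // placeholder service name
    ]
    shouldPersist: true
    timeoutInMinutes: 10
  }
]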

SecretReference

Name Description Value
keyVaultObjectName Object identifier name of the secret in key vault. string

Constraints:
Pattern = ^[a-zA-Z][a-zA-Z0-9-]{1,126}$ (required)
referenceName Reference name of the secret to be used in service configs. string (required)
type Type of key vault object: secret, key or certificate. 'Certificate'
'Key'
'Secret' (required)
version Version of the secret in key vault. string

SecretsProfile

Name Description Value
keyVaultResourceId Name of the user Key Vault where all the cluster specific user secrets are stored. string (required)
secrets Properties of Key Vault secret. SecretReference[]
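
For example, a secrets profile that exposes one Key Vault secret under a reference name usable by service configs might be sketched as follows; the vault resource ID and secret names are placeholders.

secretsProfile: {
  keyVaultResourceId: '<key-vault-resource-id>' // placeholder Key Vault resource ID
  secrets: [
    {
      referenceName: 'hms-db-password' // name referenced from service configs
      keyVaultObjectName: 'hms-db-password' // secret name in the vault
      type: 'Secret'
    }
  ]
}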

SparkMetastoreSpec

Name Description Value
dbConnectionAuthenticationMode The authentication mode to connect to your Hive metastore database. More details: /azure/azure-sql/database/logins-create-manage?view=azuresql#authentication-and-authorization 'IdentityAuth'
'SqlAuth'
dbName The database name. string (required)
dbPasswordSecretName The secret name which contains the database user password. string
dbServerHost The database server host. string (required)
dbUserName The database user name. string
keyVaultId The key vault resource id. string
thriftUrl The thrift url. string

SparkProfile

Name Description Value
defaultStorageUrl The default storage URL. string
metastoreSpec The metastore specification for Spark cluster. SparkMetastoreSpec
userPluginsSpec Spark user plugins spec SparkUserPlugins

SparkUserPlugin

Name Description Value
path Fully qualified path to the folder containing the plugins. string

Constraints:
Min length = 1
Pattern = ^(https)|(abfss)://.*$ (required)

SparkUserPlugins

Name Description Value
plugins Spark user plugins. SparkUserPlugin[]

SshProfile

Name Description Value
count Number of ssh pods per cluster. int

Constraints:
Min value = 0
Max value = 5 (required)

TrackedResourceTags

Name Description Value

TrinoCoordinator

Name Description Value
debug Trino debug configuration. TrinoDebugConfig
highAvailabilityEnabled Whether coordinator high availability is enabled; uses multiple coordinator replicas with automatic failover, one per head node. Default: true. bool

TrinoDebugConfig

Name Description Value
enable Whether debug is enabled. bool
port The debug port. int
suspend Whether to suspend execution for debug. bool

TrinoProfile

Name Description Value
catalogOptions Trino cluster catalog options. CatalogOptions
coordinator Trino Coordinator. TrinoCoordinator
userPluginsSpec Trino user plugins spec TrinoUserPlugins
userTelemetrySpec User telemetry TrinoUserTelemetry
worker Trino worker. TrinoWorker
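
Bringing the Trino types together, a profile that attaches one Hive metastore catalog might be sketched as follows; the catalog name, connection URL, and warehouse directory are placeholder assumptions.

trinoProfile: {
  catalogOptions: {
    hive: [
      {
        catalogName: 'myhive' // placeholder catalog name
        metastoreDbConnectionAuthenticationMode: 'IdentityAuth'
        metastoreDbConnectionURL: 'jdbc:sqlserver://<server>.database.windows.net;database=<db>' // placeholder
        metastoreWarehouseDir: 'abfs://container@account.dfs.core.windows.net/warehouse' // placeholder
      }
    ]
  }
  coordinator: {
    highAvailabilityEnabled: true
  }
}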

TrinoTelemetryConfig

Name Description Value
hivecatalogName Hive catalog name used to mount external tables on the logs written by Trino; if not specified, these tables are not created. string

Constraints:
Min length = 1
hivecatalogSchema Schema of the above catalog to use for mounting query logs as external tables; if not specified, tables are mounted under the schema trinologs. string
partitionRetentionInDays Retention period for query log table partitions; this doesn't have any effect on the actual data. int
path Azure storage location of the blobs. string

Constraints:
Min length = 1

TrinoUserPlugin

Name Description Value
enabled Denotes whether the plugin is active or not. bool
name This field maps to the subdirectory in the Trino plugins location that will contain all the plugins under path. string

Constraints:
Min length = 1
path Fully qualified path to the folder containing the plugins. string

Constraints:
Min length = 1
Pattern = ^(https)|(abfss)://.*$

TrinoUserPlugins

Name Description Value
plugins Trino user plugins. TrinoUserPlugin[]

TrinoUserTelemetry

Name Description Value
storage Trino user telemetry definition. TrinoTelemetryConfig

TrinoWorker

Name Description Value
debug Trino debug configuration. TrinoDebugConfig

ARM template resource definition

The clusterpools/clusters resource type can be deployed with operations that target:

  • Resource groups

For a list of changed properties in each API version, see change log.

Resource format

To create a Microsoft.HDInsight/clusterpools/clusters resource, add the following JSON to your template.

{
  "type": "Microsoft.HDInsight/clusterpools/clusters",
  "apiVersion": "2023-11-01-preview",
  "name": "string",
  "location": "string",
  "properties": {
    "clusterProfile": {
      "authorizationProfile": {
        "groupIds": [ "string" ],
        "userIds": [ "string" ]
      },
      "autoscaleProfile": {
        "autoscaleType": "string",
        "enabled": "bool",
        "gracefulDecommissionTimeout": "int",
        "loadBasedConfig": {
          "cooldownPeriod": "int",
          "maxNodes": "int",
          "minNodes": "int",
          "pollInterval": "int",
          "scalingRules": [
            {
              "actionType": "string",
              "comparisonRule": {
                "operator": "string",
                "threshold": "int"
              },
              "evaluationCount": "int",
              "scalingMetric": "string"
            }
          ]
        },
        "scheduleBasedConfig": {
          "defaultCount": "int",
          "schedules": [
            {
              "count": "int",
              "days": [ "string" ],
              "endTime": "string",
              "startTime": "string"
            }
          ],
          "timeZone": "string"
        }
      },
      "clusterAccessProfile": {
        "enableInternalIngress": "bool"
      },
      "clusterVersion": "string",
      "flinkProfile": {
        "catalogOptions": {
          "hive": {
            "metastoreDbConnectionAuthenticationMode": "string",
            "metastoreDbConnectionPasswordSecret": "string",
            "metastoreDbConnectionURL": "string",
            "metastoreDbConnectionUserName": "string"
          }
        },
        "deploymentMode": "string",
        "historyServer": {
          "cpu": "int",
          "memory": "int"
        },
        "jobManager": {
          "cpu": "int",
          "memory": "int"
        },
        "jobSpec": {
          "args": "string",
          "entryClass": "string",
          "jarName": "string",
          "jobJarDirectory": "string",
          "savePointName": "string",
          "upgradeMode": "string"
        },
        "numReplicas": "int",
        "storage": {
          "storagekey": "string",
          "storageUri": "string"
        },
        "taskManager": {
          "cpu": "int",
          "memory": "int"
        }
      },
      "identityProfile": {
        "msiClientId": "string",
        "msiObjectId": "string",
        "msiResourceId": "string"
      },
      "kafkaProfile": {
        "diskStorage": {
          "dataDiskSize": "int",
          "dataDiskType": "string"
        },
        "enableKRaft": "bool",
        "enablePublicEndpoints": "bool",
        "remoteStorageUri": "string"
      },
      "llapProfile": {
        "{customized property}": {}
      },
      "logAnalyticsProfile": {
        "applicationLogs": {
          "stdErrorEnabled": "bool",
          "stdOutEnabled": "bool"
        },
        "enabled": "bool",
        "metricsEnabled": "bool"
      },
      "ossVersion": "string",
      "prometheusProfile": {
        "enabled": "bool"
      },
      "rangerPluginProfile": {
        "enabled": "bool"
      },
      "rangerProfile": {
        "rangerAdmin": {
          "admins": [ "string" ],
          "database": {
            "host": "string",
            "name": "string",
            "passwordSecretRef": "string",
            "username": "string"
          }
        },
        "rangerAudit": {
          "storageAccount": "string"
        },
        "rangerUsersync": {
          "enabled": "bool",
          "groups": [ "string" ],
          "mode": "string",
          "userMappingLocation": "string",
          "users": [ "string" ]
        }
      },
      "scriptActionProfiles": [
        {
          "name": "string",
          "parameters": "string",
          "services": [ "string" ],
          "shouldPersist": "bool",
          "timeoutInMinutes": "int",
          "type": "string",
          "url": "string"
        }
      ],
      "secretsProfile": {
        "keyVaultResourceId": "string",
        "secrets": [
          {
            "keyVaultObjectName": "string",
            "referenceName": "string",
            "type": "string",
            "version": "string"
          }
        ]
      },
      "serviceConfigsProfiles": [
        {
          "configs": [
            {
              "component": "string",
              "files": [
                {
                  "content": "string",
                  "encoding": "string",
                  "fileName": "string",
                  "path": "string",
                  "values": {
                    "{customized property}": "string"
                  }
                }
              ]
            }
          ],
          "serviceName": "string"
        }
      ],
      "sparkProfile": {
        "defaultStorageUrl": "string",
        "metastoreSpec": {
          "dbConnectionAuthenticationMode": "string",
          "dbName": "string",
          "dbPasswordSecretName": "string",
          "dbServerHost": "string",
          "dbUserName": "string",
          "keyVaultId": "string",
          "thriftUrl": "string"
        },
        "userPluginsSpec": {
          "plugins": [
            {
              "path": "string"
            }
          ]
        }
      },
      "sshProfile": {
        "count": "int"
      },
      "stubProfile": {
        "{customized property}": {}
      },
      "trinoProfile": {
        "catalogOptions": {
          "hive": [
            {
              "catalogName": "string",
              "metastoreDbConnectionAuthenticationMode": "string",
              "metastoreDbConnectionPasswordSecret": "string",
              "metastoreDbConnectionURL": "string",
              "metastoreDbConnectionUserName": "string",
              "metastoreWarehouseDir": "string"
            }
          ]
        },
        "coordinator": {
          "debug": {
            "enable": "bool",
            "port": "int",
            "suspend": "bool"
          },
          "highAvailabilityEnabled": "bool"
        },
        "userPluginsSpec": {
          "plugins": [
            {
              "enabled": "bool",
              "name": "string",
              "path": "string"
            }
          ]
        },
        "userTelemetrySpec": {
          "storage": {
            "hivecatalogName": "string",
            "hivecatalogSchema": "string",
            "partitionRetentionInDays": "int",
            "path": "string"
          }
        },
        "worker": {
          "debug": {
            "enable": "bool",
            "port": "int",
            "suspend": "bool"
          }
        }
      }
    },
    "clusterType": "string",
    "computeProfile": {
      "nodes": [
        {
          "count": "int",
          "type": "string",
          "vmSize": "string"
        }
      ]
    }
  },
  "tags": {
    "{customized property}": "string"
  }
}

Property values

AuthorizationProfile

Name Description Value
groupIds AAD group Ids authorized for data plane access. string[]
userIds AAD user Ids authorized for data plane access. string[]

AutoscaleProfile

Name Description Value
autoscaleType Specifies which type of autoscale to implement: schedule-based or load-based. 'LoadBased'
'ScheduleBased'
enabled This indicates whether autoscale is enabled on the HDInsight on AKS cluster. bool (required)
gracefulDecommissionTimeout Graceful decommission timeout in seconds; the maximum time to wait for running containers and applications to complete before a DECOMMISSIONING node transitions to DECOMMISSIONED and forced shutdown takes place. The default value is 3600 seconds. A negative value (for example, -1) is treated as an infinite timeout. int
loadBasedConfig Profiles of load based Autoscale. LoadBasedConfig
scheduleBasedConfig Profiles of schedule based Autoscale. ScheduleBasedConfig

CatalogOptions

Name Description Value
hive hive catalog options. HiveCatalogOption[]

ClusterAccessProfile

Name Description Value
enableInternalIngress Whether to create cluster using private IP instead of public IP. This property must be set at create time. bool (required)

ClusterConfigFile

Name Description Value
content Free form content of the entire configuration file. string
encoding Indicates whether the content is encoded; the value is case-insensitive. Set it to Base64 if the content is base64 encoded, or set it to None (or omit it) if the content is plain text. 'Base64'
'None'
fileName Configuration file name. string (required)
path Path of the config file if content is specified. string
values List of key-value pairs, where each key is a valid service configuration name and each value is the value of the config. ClusterConfigFileValues

ClusterConfigFileValues

Name Description Value

ClusterLogAnalyticsApplicationLogs

Name Description Value
stdErrorEnabled True if stderr is enabled, otherwise false. bool
stdOutEnabled True if stdout is enabled, otherwise false. bool

ClusterLogAnalyticsProfile

Name Description Value
applicationLogs Collection of logs to be enabled or disabled for log analytics. ClusterLogAnalyticsApplicationLogs
enabled True if log analytics is enabled for the cluster, otherwise false. bool (required)
metricsEnabled True if metrics are enabled, otherwise false. bool

ClusterProfile

Name Description Value
authorizationProfile Authorization profile with details of AAD user Ids and group Ids authorized for data plane access. AuthorizationProfile (required)
autoscaleProfile The autoscale profile for the cluster, which allows the customer to create a cluster with autoscale enabled. AutoscaleProfile
clusterAccessProfile Cluster access profile. ClusterAccessProfile
clusterVersion Version with three or four parts. string

Constraints:
Pattern = ^(0|[1-9][0-9]{0,18})\.(0|[1-9][0-9]{0,18})\.(0|[1-9][0-9]{0,18})(?:\.(0|[1-9][0-9]{0,18}))?$ (required)
flinkProfile The Flink cluster profile. FlinkProfile
identityProfile This property is required for Trino, Spark, and Flink clusters but is optional for Kafka clusters. IdentityProfile
kafkaProfile The Kafka cluster profile. KafkaProfile
llapProfile LLAP cluster profile. ClusterProfileLlapProfile
logAnalyticsProfile Cluster log analytics profile to enable or disable the OMS agent for the cluster. ClusterLogAnalyticsProfile
ossVersion Version with three parts. string

Constraints:
Pattern = ^(0|[1-9][0-9]{0,18})\.(0|[1-9][0-9]{0,18})\.(0|[1-9][0-9]{0,18})$ (required)
prometheusProfile Cluster Prometheus profile. ClusterPrometheusProfile
rangerPluginProfile Cluster Ranger plugin profile. ClusterRangerPluginProfile
rangerProfile The ranger cluster profile. RangerProfile
scriptActionProfiles The script action profile list. ScriptActionProfile[]
secretsProfile The cluster secret profile. SecretsProfile
serviceConfigsProfiles The service configs profiles. ClusterServiceConfigsProfile[]
sparkProfile The spark cluster profile. SparkProfile
sshProfile Ssh profile for the cluster. SshProfile
stubProfile Stub cluster profile. ClusterProfileStubProfile
trinoProfile Trino Cluster profile. TrinoProfile

ClusterProfileLlapProfile

Name Description Value

ClusterProfileStubProfile

Name Description Value

ClusterPrometheusProfile

Name Description Value
enabled Enable Prometheus for cluster or not. bool (required)

ClusterRangerPluginProfile

Name Description Value
enabled Enable Ranger for cluster or not. bool (required)

ClusterResourceProperties

Name Description Value
clusterProfile Cluster profile. ClusterProfile (required)
clusterType The type of cluster. string

Constraints:
Pattern = ^[a-zA-Z][a-zA-Z0-9]{0,31}$ (required)
computeProfile The compute profile. ComputeProfile (required)

ClusterServiceConfig

Name Description Value
component Name of the component the config files should apply to. string (required)
files List of Config Files. ClusterConfigFile[] (required)

ClusterServiceConfigsProfile

Name Description Value
configs List of service configs. ClusterServiceConfig[] (required)
serviceName Name of the service the configurations should apply to. string (required)

ComparisonRule

Name Description Value
operator The comparison operator. 'greaterThan'
'greaterThanOrEqual'
'lessThan'
'lessThanOrEqual' (required)
threshold Threshold setting. int (required)

ComputeProfile

Name Description Value
nodes The nodes definitions. NodeProfile[] (required)

ComputeResourceDefinition

Name Description Value
cpu The required CPU. int (required)
memory The required memory in MB; container memory is set to 110 percent of this value. int (required)

DiskStorageProfile

Name Description Value
dataDiskSize Managed disk size in GB. The maximum supported disk size for Standard and Premium HDD/SSD is 32 TB, except for Premium SSD v2, which supports up to 64 TB. int (required)
dataDiskType Managed Disk Type. 'Premium_SSD_LRS'
'Premium_SSD_v2_LRS'
'Premium_SSD_ZRS'
'Standard_HDD_LRS'
'Standard_SSD_LRS'
'Standard_SSD_ZRS' (required)

FlinkCatalogOptions

Name Description Value
hive Hive Catalog Option for Flink cluster. FlinkHiveCatalogOption

FlinkHiveCatalogOption

Name Description Value
metastoreDbConnectionAuthenticationMode The authentication mode to connect to your Hive metastore database. More details: /azure/azure-sql/database/logins-create-manage?view=azuresql#authentication-and-authorization 'IdentityAuth'
'SqlAuth'
metastoreDbConnectionPasswordSecret Secret reference name from secretsProfile.secrets containing password for database connection. string
metastoreDbConnectionURL Connection string for hive metastore database. string (required)
metastoreDbConnectionUserName User name for database connection. string

FlinkJobProfile

Name Description Value
args A string property representing additional JVM arguments for the Flink job, as a space-separated list. string
entryClass A string property that specifies the entry class for the Flink job. If not specified, the entry point is auto-detected from the flink job jar package. string
jarName A string property that represents the name of the job JAR. string (required)
jobJarDirectory A string property that specifies the directory where the job JAR is located. string (required)
savePointName A string property that represents the name of the savepoint for the Flink job string
upgradeMode A string property that indicates the upgrade mode to be performed on the Flink job. It can have one of the following enum values => STATELESS_UPDATE, UPDATE, LAST_STATE_UPDATE. 'LAST_STATE_UPDATE'
'STATELESS_UPDATE'
'UPDATE' (required)

FlinkProfile

Name Description Value
catalogOptions Flink cluster catalog options. FlinkCatalogOptions
deploymentMode A string property that indicates the deployment mode of the Flink cluster. It can have one of the following enum values: Application, Session. The default value is Session. 'Application'
'Session'
historyServer History Server container/process CPU and memory requirements. ComputeResourceDefinition
jobManager Job Manager container/process CPU and memory requirements. ComputeResourceDefinition (required)
jobSpec Job specifications for Flink clusters in application deployment mode. The specification is immutable even if job properties are changed by calling the RunJob API; use the ListJob API to get the latest job information. FlinkJobProfile
numReplicas The number of task managers. int
storage The storage profile. FlinkStorageProfile (required)
taskManager Task Manager container/process CPU and memory requirements. ComputeResourceDefinition (required)

FlinkStorageProfile

Name Description Value
storagekey Storage key is only required for wasb(s) storage. string

Constraints:
Sensitive value. Pass in as a secure parameter.
storageUri Storage account uri which is used for savepoint and checkpoint state. string

Constraints:
Pattern = ^(\w{4,5})://(.*)@(.*).\b(blob|dfs)\b.*$ (required)

HiveCatalogOption

Name Description Value
catalogName Name of the Trino catalog that should use the specified Hive metastore. string

Constraints:
Min length = 1 (required)
metastoreDbConnectionAuthenticationMode The authentication mode to connect to your Hive metastore database. More details: /azure/azure-sql/database/logins-create-manage?view=azuresql#authentication-and-authorization 'IdentityAuth'
'SqlAuth'
metastoreDbConnectionPasswordSecret Secret reference name from secretsProfile.secrets containing password for database connection. string
metastoreDbConnectionURL Connection string for hive metastore database. string (required)
metastoreDbConnectionUserName User name for database connection. string
metastoreWarehouseDir Metastore root directory URI, format: abfs[s]://<container>@<account_name>.dfs.core.windows.net/<path>. More details: /azure/storage/blobs/data-lake-storage-introduction-abfs-uri string (required)

IdentityProfile

Name Description Value
msiClientId ClientId of the MSI. string

Constraints:
Pattern = ^[{(]?[0-9A-Fa-f]{8}[-]?(?:[0-9A-Fa-f]{4}[-]?){3}[0-9A-Fa-f]{12}[)}]?$ (required)
msiObjectId ObjectId of the MSI. string

Constraints:
Pattern = ^[{(]?[0-9A-Fa-f]{8}[-]?(?:[0-9A-Fa-f]{4}[-]?){3}[0-9A-Fa-f]{12}[)}]?$ (required)
msiResourceId ResourceId of the MSI. string (required)

KafkaProfile

Name Description Value
diskStorage Kafka disk storage profile. DiskStorageProfile (required)
enableKRaft Expose Kafka cluster in KRaft mode. bool
enablePublicEndpoints Expose worker nodes as public endpoints. bool
remoteStorageUri Fully qualified path of Azure Storage container used for Tiered Storage. string

Constraints:
Pattern = ^(https?|abfss?):\/\/[^/]+(?:\/|$)

LoadBasedConfig

Name Description Value
cooldownPeriod The cooldown period, in seconds, that must elapse between a scaling activity started by a rule and the start of the next scaling activity, regardless of the rule that triggers it. The default value is 300 seconds. int
maxNodes The maximum number of nodes for load-based scaling; scaling operates between the minimum and maximum node counts. int (required)
minNodes The minimum number of nodes for load-based scaling; scaling operates between the minimum and maximum node counts. int (required)
pollInterval The poll interval, in seconds, after which scaling metrics are polled to trigger a scaling operation. int
scalingRules The scaling rules. ScalingRule[] (required)

Microsoft.HDInsight/clusterpools/clusters

Name Description Value
apiVersion The api version '2023-11-01-preview'
location The geo-location where the resource lives string (required)
name The resource name string (required)
properties Gets or sets the properties that define cluster-specific settings. ClusterResourceProperties
tags Resource tags. Dictionary of tag names and values. See Tags in templates
type The resource type 'Microsoft.HDInsight/clusterpools/clusters'

NodeProfile

Name Description Value
count The number of virtual machines. int

Constraints:
Min value = 1 (required)
type The node type. string

Constraints:
Pattern = ^(head|Head|HEAD|worker|Worker|WORKER)$ (required)
vmSize The virtual machine SKU. string

Constraints:
Pattern = ^[a-zA-Z0-9_\-]{0,256}$ (required)

RangerAdminSpec

Name Description Value
admins List of usernames that should be marked as ranger admins. These usernames should match the user principal name (UPN) of the respective AAD users. string[] (required)
database RangerAdminSpecDatabase (required)

RangerAdminSpecDatabase

Name Description Value
host The database URL string (required)
name The database name string (required)
passwordSecretRef Reference for the database password string
username The name of the database user string

RangerAuditSpec

Name Description Value
storageAccount Azure storage location of the blobs. MSI should have read/write access to this Storage account. string

Constraints:
Min length = 1
Pattern = ^(https)|(abfss)://.*$

RangerProfile

Name Description Value
rangerAdmin Specification for the Ranger Admin service. RangerAdminSpec (required)
rangerAudit Properties required to describe audit log storage. RangerAuditSpec
rangerUsersync Specification for the Ranger Usersync service RangerUsersyncSpec (required)

RangerUsersyncSpec

Name Description Value
enabled Denotes whether usersync service should be enabled bool
groups List of groups that should be synced. These group names should match the object id of the respective AAD groups. string[]
mode User & groups can be synced automatically or via a static list that's refreshed. 'automatic'
'static'
userMappingLocation Azure storage location of a mapping file that lists user & group associations. string

Constraints:
Min length = 1
Pattern = ^(https)|(abfss)://.*$
users List of user names that should be synced. These usernames should match the User principal name of the respective AAD users. string[]

ScalingRule

Name Description Value
actionType The action type. 'scaledown'
'scaleup' (required)
comparisonRule The comparison rule. ComparisonRule (required)
evaluationCount The evaluation count for a scaling condition; the number of times a trigger condition must be met before a scaling activity is triggered. int (required)
scalingMetric Metrics name for individual workloads. For example: cpu string (required)

Schedule

Name Description Value
count The node count anticipated at the end of the scaling operation for the current schedule configuration, as an integer. int (required)
days The days on which the schedule applies for the autoscale operation. String array containing any of:
'Friday'
'Monday'
'Saturday'
'Sunday'
'Thursday'
'Tuesday'
'Wednesday' (required)
endTime The end time of the current schedule configuration, in HH:MM format (for example, 10:30). string

Constraints:
Pattern = ^([0-1]?[0-9]|2[0-3]):[0-5][0-9]$ (required)
startTime The start time of the current schedule configuration, in HH:MM format (for example, 10:30). string

Constraints:
Pattern = ^([0-1]?[0-9]|2[0-3]):[0-5][0-9]$ (required)

ScheduleBasedConfig

Name Description Value
defaultCount The default node count of the current schedule configuration; the number of nodes that apply when a specified scaling operation (scale up or scale down) is executed. int (required)
schedules The schedules for which schedule-based autoscale is enabled; multiple rules can be set within the schedule across days and times (start/end). Schedule[] (required)
timeZone The time zone in which the schedule is set for the schedule-based autoscale configuration. string (required)

ScriptActionProfile

Name Description Value
name Script name. string (required)
parameters Additional parameters for the script action, as a space-separated list of arguments required for script execution. string
services List of services to apply the script action to. string[] (required)
shouldPersist Specifies whether the script should persist on the cluster. bool
timeoutInMinutes Timeout duration for the script action in minutes. int
type Type of the script action. The supported type is bash scripts. string (required)
url Url of the script file. string

Constraints:
Pattern = ^(https)|(http)://.*$ (required)

SecretReference

Name Description Value
keyVaultObjectName Object identifier name of the secret in key vault. string

Constraints:
Pattern = ^[a-zA-Z][a-zA-Z0-9-]{1,126}$ (required)
referenceName Reference name of the secret to be used in service configs. string (required)
type Type of key vault object: secret, key or certificate. 'Certificate'
'Key'
'Secret' (required)
version Version of the secret in key vault. string

SecretsProfile

Name Description Value
keyVaultResourceId Name of the user Key Vault where all the cluster specific user secrets are stored. string (required)
secrets Properties of Key Vault secret. SecretReference[]

SparkMetastoreSpec

Name Description Value
dbConnectionAuthenticationMode The authentication mode to connect to your Hive metastore database. More details: /azure/azure-sql/database/logins-create-manage?view=azuresql#authentication-and-authorization 'IdentityAuth'
'SqlAuth'
dbName The database name. string (required)
dbPasswordSecretName The secret name which contains the database user password. string
dbServerHost The database server host. string (required)
dbUserName The database user name. string
keyVaultId The key vault resource id. string
thriftUrl The thrift url. string

SparkProfile

Name Description Value
defaultStorageUrl The default storage URL. string
metastoreSpec The metastore specification for Spark cluster. SparkMetastoreSpec
userPluginsSpec Spark user plugins spec SparkUserPlugins

SparkUserPlugin

Name Description Value
path Fully qualified path to the folder containing the plugins. string

Constraints:
Min length = 1
Pattern = ^(https)|(abfss)://.*$ (required)

SparkUserPlugins

Name Description Value
plugins Spark user plugins. SparkUserPlugin[]

SshProfile

Name Description Value
count Number of ssh pods per cluster. int

Constraints:
Min value = 0
Max value = 5 (required)

TrackedResourceTags

Name Description Value

TrinoCoordinator

Name Description Value
debug Trino debug configuration. TrinoDebugConfig
highAvailabilityEnabled Whether coordinator high availability is enabled; uses multiple coordinator replicas with automatic failover, one per head node. Default: true. bool

TrinoDebugConfig

Name Description Value
enable Whether debug is enabled. bool
port The debug port. int
suspend Whether to suspend execution for debug. bool

TrinoProfile

Name Description Value
catalogOptions Trino cluster catalog options. CatalogOptions
coordinator Trino Coordinator. TrinoCoordinator
userPluginsSpec Trino user plugins spec TrinoUserPlugins
userTelemetrySpec User telemetry TrinoUserTelemetry
worker Trino worker. TrinoWorker

TrinoTelemetryConfig

Name Description Value
hivecatalogName Hive catalog name used to mount external tables on the logs written by Trino; if not specified, these tables are not created. string

Constraints:
Min length = 1
hivecatalogSchema Schema of the above catalog to use for mounting query logs as external tables; if not specified, tables are mounted under the schema trinologs. string
partitionRetentionInDays Retention period for query log table partitions; this doesn't have any effect on the actual data. int
path Azure storage location of the blobs. string

Constraints:
Min length = 1

TrinoUserPlugin

Name Description Value
enabled Denotes whether the plugin is active or not. bool
name This field maps to the subdirectory in the Trino plugins location that will contain all the plugins under path. string

Constraints:
Min length = 1
path Fully qualified path to the folder containing the plugins. string

Constraints:
Min length = 1
Pattern = ^(https)|(abfss)://.*$

TrinoUserPlugins

Name Description Value
plugins Trino user plugins. TrinoUserPlugin[]

TrinoUserTelemetry

Name Description Value
storage Trino user telemetry definition. TrinoTelemetryConfig

TrinoWorker

Name Description Value
debug Trino debug configuration. TrinoDebugConfig

Terraform (AzAPI provider) resource definition

The clusterpools/clusters resource type can be deployed with operations that target:

  • Resource groups

For a list of changed properties in each API version, see change log.

Resource format

To create a Microsoft.HDInsight/clusterpools/clusters resource, add the following Terraform to your template.

resource "azapi_resource" "symbolicname" {
  type = "Microsoft.HDInsight/clusterpools/clusters@2023-11-01-preview"
  name = "string"
  location = "string"
  tags = {
    {customized property} = "string"
  }
  body = jsonencode({
    properties = {
      clusterProfile = {
        authorizationProfile = {
          groupIds = [
            "string"
          ]
          userIds = [
            "string"
          ]
        }
        autoscaleProfile = {
          autoscaleType = "string"
          enabled = bool
          gracefulDecommissionTimeout = int
          loadBasedConfig = {
            cooldownPeriod = int
            maxNodes = int
            minNodes = int
            pollInterval = int
            scalingRules = [
              {
                actionType = "string"
                comparisonRule = {
                  operator = "string"
                  threshold = int
                }
                evaluationCount = int
                scalingMetric = "string"
              }
            ]
          }
          scheduleBasedConfig = {
            defaultCount = int
            schedules = [
              {
                count = int
                days = [
                  "string"
                ]
                endTime = "string"
                startTime = "string"
              }
            ]
            timeZone = "string"
          }
        }
        clusterAccessProfile = {
          enableInternalIngress = bool
        }
        clusterVersion = "string"
        flinkProfile = {
          catalogOptions = {
            hive = {
              metastoreDbConnectionAuthenticationMode = "string"
              metastoreDbConnectionPasswordSecret = "string"
              metastoreDbConnectionURL = "string"
              metastoreDbConnectionUserName = "string"
            }
          }
          deploymentMode = "string"
          historyServer = {
            cpu = int
            memory = int
          }
          jobManager = {
            cpu = int
            memory = int
          }
          jobSpec = {
            args = "string"
            entryClass = "string"
            jarName = "string"
            jobJarDirectory = "string"
            savePointName = "string"
            upgradeMode = "string"
          }
          numReplicas = int
          storage = {
            storagekey = "string"
            storageUri = "string"
          }
          taskManager = {
            cpu = int
            memory = int
          }
        }
        identityProfile = {
          msiClientId = "string"
          msiObjectId = "string"
          msiResourceId = "string"
        }
        kafkaProfile = {
          diskStorage = {
            dataDiskSize = int
            dataDiskType = "string"
          }
          enableKRaft = bool
          enablePublicEndpoints = bool
          remoteStorageUri = "string"
        }
        llapProfile = {
          {customized property} = ?
        }
        logAnalyticsProfile = {
          applicationLogs = {
            stdErrorEnabled = bool
            stdOutEnabled = bool
          }
          enabled = bool
          metricsEnabled = bool
        }
        ossVersion = "string"
        prometheusProfile = {
          enabled = bool
        }
        rangerPluginProfile = {
          enabled = bool
        }
        rangerProfile = {
          rangerAdmin = {
            admins = [
              "string"
            ]
            database = {
              host = "string"
              name = "string"
              passwordSecretRef = "string"
              username = "string"
            }
          }
          rangerAudit = {
            storageAccount = "string"
          }
          rangerUsersync = {
            enabled = bool
            groups = [
              "string"
            ]
            mode = "string"
            userMappingLocation = "string"
            users = [
              "string"
            ]
          }
        }
        scriptActionProfiles = [
          {
            name = "string"
            parameters = "string"
            services = [
              "string"
            ]
            shouldPersist = bool
            timeoutInMinutes = int
            type = "string"
            url = "string"
          }
        ]
        secretsProfile = {
          keyVaultResourceId = "string"
          secrets = [
            {
              keyVaultObjectName = "string"
              referenceName = "string"
              type = "string"
              version = "string"
            }
          ]
        }
        serviceConfigsProfiles = [
          {
            configs = [
              {
                component = "string"
                files = [
                  {
                    content = "string"
                    encoding = "string"
                    fileName = "string"
                    path = "string"
                    values = {
                      {customized property} = "string"
                    }
                  }
                ]
              }
            ]
            serviceName = "string"
          }
        ]
        sparkProfile = {
          defaultStorageUrl = "string"
          metastoreSpec = {
            dbConnectionAuthenticationMode = "string"
            dbName = "string"
            dbPasswordSecretName = "string"
            dbServerHost = "string"
            dbUserName = "string"
            keyVaultId = "string"
            thriftUrl = "string"
          }
          userPluginsSpec = {
            plugins = [
              {
                path = "string"
              }
            ]
          }
        }
        sshProfile = {
          count = int
        }
        stubProfile = {
          {customized property} = ?
        }
        trinoProfile = {
          catalogOptions = {
            hive = [
              {
                catalogName = "string"
                metastoreDbConnectionAuthenticationMode = "string"
                metastoreDbConnectionPasswordSecret = "string"
                metastoreDbConnectionURL = "string"
                metastoreDbConnectionUserName = "string"
                metastoreWarehouseDir = "string"
              }
            ]
          }
          coordinator = {
            debug = {
              enable = bool
              port = int
              suspend = bool
            }
            highAvailabilityEnabled = bool
          }
          userPluginsSpec = {
            plugins = [
              {
                enabled = bool
                name = "string"
                path = "string"
              }
            ]
          }
          userTelemetrySpec = {
            storage = {
              hivecatalogName = "string"
              hivecatalogSchema = "string"
              partitionRetentionInDays = int
              path = "string"
            }
          }
          worker = {
            debug = {
              enable = bool
              port = int
              suspend = bool
            }
          }
        }
      }
      clusterType = "string"
      computeProfile = {
        nodes = [
          {
            count = int
            type = "string"
            vmSize = "string"
          }
        ]
      }
    }
  })
}
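
To make the schema concrete, the following is a minimal, hypothetical deployment of a Trino cluster. Every value shown (names, GUIDs, versions, VM sizes) is a placeholder rather than a recommendation, and the parent cluster pool is assumed to exist as a separate azapi_resource named cluster_pool; consult the property tables below for the constraints each field must satisfy.

resource "azapi_resource" "trino_cluster" {
  type      = "Microsoft.HDInsight/clusterpools/clusters@2023-11-01-preview"
  name      = "trinocluster01"
  parent_id = azapi_resource.cluster_pool.id # assumed clusterpools resource
  location  = "eastus"
  body = jsonencode({
    properties = {
      clusterType = "Trino"
      computeProfile = {
        nodes = [
          { type = "Head", count = 2, vmSize = "Standard_D8ds_v5" },   # placeholder SKU
          { type = "Worker", count = 3, vmSize = "Standard_D8ds_v5" }
        ]
      }
      clusterProfile = {
        clusterVersion = "1.2.0"   # placeholder; must match the 3/4-part pattern
        ossVersion     = "0.440.0" # placeholder; must match the three-part pattern
        authorizationProfile = {
          userIds = ["00000000-0000-0000-0000-000000000000"] # AAD object IDs
        }
        identityProfile = {
          msiClientId   = "00000000-0000-0000-0000-000000000000"
          msiObjectId   = "00000000-0000-0000-0000-000000000000"
          msiResourceId = "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<msi-name>"
        }
        trinoProfile = {}
      }
    }
  })
}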

Property values

AuthorizationProfile

Name Description Value
groupIds AAD group Ids authorized for data plane access. string[]
userIds AAD user Ids authorized for data plane access. string[]

AutoscaleProfile

Name Description Value
autoscaleType Specifies which type of autoscale to implement: schedule-based or load-based. 'LoadBased'
'ScheduleBased'
enabled This indicates whether autoscale is enabled on the HDInsight on AKS cluster. bool (required)
gracefulDecommissionTimeout The graceful decommission timeout: the maximum time, in seconds, to wait for running containers and applications to complete before a DECOMMISSIONING node is forcibly transitioned to DECOMMISSIONED. The default value is 3600 seconds; a negative value (such as -1) is treated as an infinite timeout. int
loadBasedConfig Profiles of load based Autoscale. LoadBasedConfig
scheduleBasedConfig Profiles of schedule based Autoscale. ScheduleBasedConfig

CatalogOptions

Name Description Value
hive Hive catalog options. HiveCatalogOption[]

ClusterAccessProfile

Name Description Value
enableInternalIngress Whether to create the cluster using a private IP instead of a public IP. This property must be set at create time. bool (required)

ClusterConfigFile

Name Description Value
content Free form content of the entire configuration file. string
encoding This property indicates whether the content is encoded; the value is case-insensitive. Set it to base64 if the content is base64 encoded, or to none (or skip it) if the content is plain text. 'Base64'
'None'
fileName Configuration file name. string (required)
path Path of the config file if content is specified. string
values List of key-value pairs, where each key represents a valid service configuration name and each value represents the value of the config. ClusterConfigFileValues

ClusterConfigFileValues

Name Description Value

ClusterLogAnalyticsApplicationLogs

Name Description Value
stdErrorEnabled True if stderr is enabled, otherwise false. bool
stdOutEnabled True if stdout is enabled, otherwise false. bool

ClusterLogAnalyticsProfile

Name Description Value
applicationLogs Collection of logs to be enabled or disabled for log analytics. ClusterLogAnalyticsApplicationLogs
enabled True if log analytics is enabled for the cluster, otherwise false. bool (required)
metricsEnabled True if metrics are enabled, otherwise false. bool

ClusterProfile

Name Description Value
authorizationProfile Authorization profile with details of AAD user Ids and group Ids authorized for data plane access. AuthorizationProfile (required)
autoscaleProfile The autoscale profile for the cluster, which allows the customer to create a cluster with autoscale enabled. AutoscaleProfile
clusterAccessProfile Cluster access profile. ClusterAccessProfile
clusterVersion Version with three or four parts. string

Constraints:
Pattern = ^(0|[1-9][0-9]{0,18})\.(0|[1-9][0-9]{0,18})\.(0|[1-9][0-9]{0,18})(?:\.(0|[1-9][0-9]{0,18}))?$ (required)
flinkProfile The Flink cluster profile. FlinkProfile
identityProfile This property is required by Trino, Spark, and Flink clusters but is optional for Kafka clusters. IdentityProfile
kafkaProfile The Kafka cluster profile. KafkaProfile
llapProfile LLAP cluster profile. ClusterProfileLlapProfile
logAnalyticsProfile Cluster log analytics profile to enable or disable OMS agent for cluster. ClusterLogAnalyticsProfile
ossVersion Version with three parts. string

Constraints:
Pattern = ^(0|[1-9][0-9]{0,18})\.(0|[1-9][0-9]{0,18})\.(0|[1-9][0-9]{0,18})$ (required)
prometheusProfile Cluster Prometheus profile. ClusterPrometheusProfile
rangerPluginProfile Cluster Ranger plugin profile. ClusterRangerPluginProfile
rangerProfile The ranger cluster profile. RangerProfile
scriptActionProfiles The script action profile list. ScriptActionProfile[]
secretsProfile The cluster secret profile. SecretsProfile
serviceConfigsProfiles The service configs profiles. ClusterServiceConfigsProfile[]
sparkProfile The spark cluster profile. SparkProfile
sshProfile Ssh profile for the cluster. SshProfile
stubProfile Stub cluster profile. ClusterProfileStubProfile
trinoProfile Trino Cluster profile. TrinoProfile

ClusterProfileLlapProfile

Name Description Value

ClusterProfileStubProfile

Name Description Value

ClusterPrometheusProfile

Name Description Value
enabled Enable Prometheus for cluster or not. bool (required)

ClusterRangerPluginProfile

Name Description Value
enabled Enable Ranger for cluster or not. bool (required)

ClusterResourceProperties

Name Description Value
clusterProfile Cluster profile. ClusterProfile (required)
clusterType The type of cluster. string

Constraints:
Pattern = ^[a-zA-Z][a-zA-Z0-9]{0,31}$ (required)
computeProfile The compute profile. ComputeProfile (required)

ClusterServiceConfig

Name Description Value
component Name of the component the config files should apply to. string (required)
files List of Config Files. ClusterConfigFile[] (required)

ClusterServiceConfigsProfile

Name Description Value
configs List of service configs. ClusterServiceConfig[] (required)
serviceName Name of the service the configurations should apply to. string (required)
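
As an illustration of how serviceConfigsProfiles compose, the hypothetical fragment below (part of the body shown earlier) overrides one key in one config file; the service name, component name, file name, and key are placeholders, not a documented list of valid values.

serviceConfigsProfiles = [
  {
    serviceName = "trino"           # illustrative service name
    configs = [
      {
        component = "coordinator"   # illustrative component name
        files = [
          {
            fileName = "config.properties"
            values = {
              "query.max-memory" = "10GB"   # example key/value override
            }
          }
        ]
      }
    ]
  }
]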

ComparisonRule

Name Description Value
operator The comparison operator. 'greaterThan'
'greaterThanOrEqual'
'lessThan'
'lessThanOrEqual' (required)
threshold Threshold setting. int (required)

ComputeProfile

Name Description Value
nodes The nodes definitions. NodeProfile[] (required)

ComputeResourceDefinition

Name Description Value
cpu The required CPU. int (required)
memory The required memory in MB; container memory will be 110% of this value. int (required)

DiskStorageProfile

Name Description Value
dataDiskSize Managed Disk size in GB. The maximum supported disk size for Standard and Premium HDD/SSD is 32TB, except for Premium SSD v2, which supports up to 64TB. int (required)
dataDiskType Managed Disk Type. 'Premium_SSD_LRS'
'Premium_SSD_v2_LRS'
'Premium_SSD_ZRS'
'Standard_HDD_LRS'
'Standard_SSD_LRS'
'Standard_SSD_ZRS' (required)

FlinkCatalogOptions

Name Description Value
hive Hive Catalog Option for Flink cluster. FlinkHiveCatalogOption

FlinkHiveCatalogOption

Name Description Value
metastoreDbConnectionAuthenticationMode The authentication mode to connect to your Hive metastore database. More details: /azure/azure-sql/database/logins-create-manage?view=azuresql#authentication-and-authorization 'IdentityAuth'
'SqlAuth'
metastoreDbConnectionPasswordSecret Secret reference name from secretsProfile.secrets containing password for database connection. string
metastoreDbConnectionURL Connection string for hive metastore database. string (required)
metastoreDbConnectionUserName User name for database connection. string

FlinkJobProfile

Name Description Value
args A string property representing additional JVM arguments for the Flink job. It should be a space-separated value. string
entryClass A string property that specifies the entry class for the Flink job. If not specified, the entry point is auto-detected from the Flink job JAR package. string
jarName A string property that represents the name of the job JAR. string (required)
jobJarDirectory A string property that specifies the directory where the job JAR is located. string (required)
savePointName A string property that represents the name of the savepoint for the Flink job. string
upgradeMode A string property that indicates the upgrade mode to be performed on the Flink job. It can have one of the following enum values: STATELESS_UPDATE, UPDATE, LAST_STATE_UPDATE. 'LAST_STATE_UPDATE'
'STATELESS_UPDATE'
'UPDATE' (required)

FlinkProfile

Name Description Value
catalogOptions Flink cluster catalog options. FlinkCatalogOptions
deploymentMode A string property that indicates the deployment mode of the Flink cluster. It can have one of the following enum values: Application, Session. The default value is Session. 'Application'
'Session'
historyServer History Server container/process CPU and memory requirements. ComputeResourceDefinition
jobManager Job Manager container/process CPU and memory requirements. ComputeResourceDefinition (required)
jobSpec Job specifications for Flink clusters in application deployment mode. The specification is immutable even if job properties are changed by calling the RunJob API; use the ListJob API to get the latest job information. FlinkJobProfile
numReplicas The number of task managers. int
storage The storage profile. FlinkStorageProfile (required)
taskManager Task Manager container/process CPU and memory requirements. ComputeResourceDefinition (required)
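
Tying these pieces together, a hypothetical flinkProfile fragment for an Application-mode cluster running a single job might look as follows; the storage account, container names, JAR name, and sizing values are placeholders.

flinkProfile = {
  deploymentMode = "Application"
  jobManager  = { cpu = 1, memory = 2048 }   # memory in MB
  taskManager = { cpu = 2, memory = 4096 }
  numReplicas = 2                            # number of task managers
  storage = {
    # URI used for savepoint and checkpoint state; account/container are placeholders
    storageUri = "abfs://flink@mystorageaccount.dfs.core.windows.net"
  }
  jobSpec = {
    jarName         = "my-flink-job.jar"     # placeholder JAR name
    jobJarDirectory = "abfs://jars@mystorageaccount.dfs.core.windows.net/jobs"
    upgradeMode     = "STATELESS_UPDATE"
  }
}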

FlinkStorageProfile

Name Description Value
storagekey Storage key is only required for wasb(s) storage. string

Constraints:
Sensitive value. Pass in as a secure parameter.
storageUri Storage account URI which is used for savepoint and checkpoint state. string

Constraints:
Pattern = ^(\w{4,5})://(.*)@(.*).\b(blob|dfs)\b.*$ (required)

HiveCatalogOption

Name Description Value
catalogName Name of the Trino catalog which should use the specified Hive metastore. string

Constraints:
Min length = 1 (required)
metastoreDbConnectionAuthenticationMode The authentication mode to connect to your Hive metastore database. More details: /azure/azure-sql/database/logins-create-manage?view=azuresql#authentication-and-authorization 'IdentityAuth'
'SqlAuth'
metastoreDbConnectionPasswordSecret Secret reference name from secretsProfile.secrets containing password for database connection. string
metastoreDbConnectionURL Connection string for hive metastore database. string (required)
metastoreDbConnectionUserName User name for database connection. string
metastoreWarehouseDir Metastore root directory URI, format: abfs[s]://<container>@<account_name>.dfs.core.windows.net/<path>. More details: /azure/storage/blobs/data-lake-storage-introduction-abfs-uri string (required)
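
For example, a hypothetical trinoProfile fragment wiring one Trino catalog to an external Hive metastore could look like this; the server, database, account, and catalog names are placeholders.

trinoProfile = {
  catalogOptions = {
    hive = [
      {
        catalogName                             = "hive"
        metastoreDbConnectionURL                = "jdbc:sqlserver://myserver.database.windows.net;database=hivemeta" # placeholder
        metastoreDbConnectionAuthenticationMode = "IdentityAuth"
        metastoreWarehouseDir                   = "abfs://warehouse@mystorageaccount.dfs.core.windows.net/hive"
      }
    ]
  }
}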

IdentityProfile

Name Description Value
msiClientId ClientId of the MSI. string

Constraints:
Pattern = ^[{(]?[0-9A-Fa-f]{8}[-]?(?:[0-9A-Fa-f]{4}[-]?){3}[0-9A-Fa-f]{12}[)}]?$ (required)
msiObjectId ObjectId of the MSI. string

Constraints:
Pattern = ^[{(]?[0-9A-Fa-f]{8}[-]?(?:[0-9A-Fa-f]{4}[-]?){3}[0-9A-Fa-f]{12}[)}]?$ (required)
msiResourceId ResourceId of the MSI. string (required)

KafkaProfile

Name Description Value
diskStorage Kafka disk storage profile. DiskStorageProfile (required)
enableKRaft Expose Kafka cluster in KRaft mode. bool
enablePublicEndpoints Expose worker nodes as public endpoints. bool
remoteStorageUri Fully qualified path of Azure Storage container used for Tiered Storage. string

Constraints:
Pattern = ^(https?|abfss?):\/\/[^/]+(?:\/|$)
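
A hypothetical kafkaProfile fragment combining these properties, with a KRaft-mode cluster, premium disks, and tiered storage; disk size and the storage account/container are placeholders.

kafkaProfile = {
  enableKRaft           = true
  enablePublicEndpoints = false
  diskStorage = {
    dataDiskSize = 1024                 # GB per data disk
    dataDiskType = "Premium_SSD_LRS"
  }
  # Tiered Storage container; account and container names are placeholders
  remoteStorageUri = "abfss://kafka-tiered@mystorageaccount.dfs.core.windows.net/"
}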

LoadBasedConfig

Name Description Value
cooldownPeriod The cooldown period: a time period, in seconds, that determines the amount of time that must elapse between a scaling activity started by a rule and the start of the next scaling activity, regardless of the rule that triggers it. The default value is 300 seconds. int
maxNodes The maximum number of nodes for load-based scaling; scaling takes place between the minimum and maximum number of nodes. int (required)
minNodes The minimum number of nodes for load-based scaling; scaling takes place between the minimum and maximum number of nodes. int (required)
pollInterval The poll interval: the time period, in seconds, after which scaling metrics are polled to trigger a scaling operation. int
scalingRules The scaling rules. ScalingRule[] (required)
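
Putting these properties together, a hypothetical load-based autoscaleProfile fragment might scale on a CPU metric as shown below (see ScalingRule and ComparisonRule later in this section); the metric name, thresholds, and node counts are illustrative.

autoscaleProfile = {
  enabled       = true
  autoscaleType = "LoadBased"
  loadBasedConfig = {
    minNodes       = 3
    maxNodes       = 10
    cooldownPeriod = 300       # seconds between scaling activities
    pollInterval   = 60        # seconds between metric polls
    scalingRules = [
      {
        actionType      = "scaleup"
        scalingMetric   = "cpu"   # illustrative metric name
        evaluationCount = 3       # times the condition must be met before scaling
        comparisonRule  = { operator = "greaterThan", threshold = 80 }
      },
      {
        actionType      = "scaledown"
        scalingMetric   = "cpu"
        evaluationCount = 3
        comparisonRule  = { operator = "lessThan", threshold = 20 }
      }
    ]
  }
}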

Microsoft.HDInsight/clusterpools/clusters

Name Description Value
location The geo-location where the resource lives string (required)
name The resource name string (required)
parent_id The ID of the resource that is the parent for this resource. ID for resource of type: clusterpools
properties Gets or sets the properties. Define cluster specific properties. ClusterResourceProperties
tags Resource tags Dictionary of tag names and values.
type The resource type "Microsoft.HDInsight/clusterpools/clusters@2023-11-01-preview"

NodeProfile

Name Description Value
count The number of virtual machines. int

Constraints:
Min value = 1 (required)
type The node type. string

Constraints:
Pattern = ^(head|Head|HEAD|worker|Worker|WORKER)$ (required)
vmSize The virtual machine SKU. string

Constraints:
Pattern = ^[a-zA-Z0-9_\-]{0,256}$ (required)
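
A hypothetical computeProfile fragment satisfying these constraints, with two head nodes and three workers; the VM SKUs are placeholders.

computeProfile = {
  nodes = [
    { type = "Head",   count = 2, vmSize = "Standard_D8ds_v5" },  # placeholder SKU
    { type = "Worker", count = 3, vmSize = "Standard_E8ds_v5" }
  ]
}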

RangerAdminSpec

Name Description Value
admins List of usernames that should be marked as ranger admins. These usernames should match the user principal name (UPN) of the respective AAD users. string[] (required)
database The Ranger admin database settings. RangerAdminSpecDatabase (required)

RangerAdminSpecDatabase

Name Description Value
host The database URL string (required)
name The database name string (required)
passwordSecretRef Reference for the database password string
username The name of the database user string

RangerAuditSpec

Name Description Value
storageAccount Azure storage location of the blobs. MSI should have read/write access to this Storage account. string

Constraints:
Min length = 1
Pattern = ^(https)|(abfss)://.*$

RangerProfile

Name Description Value
rangerAdmin Specification for the Ranger Admin service. RangerAdminSpec (required)
rangerAudit Properties required to describe audit log storage. RangerAuditSpec
rangerUsersync Specification for the Ranger Usersync service. RangerUsersyncSpec (required)

RangerUsersyncSpec

Name Description Value
enabled Denotes whether the usersync service should be enabled. bool
groups List of groups that should be synced. These group names should match the object ID of the respective AAD groups. string[]
mode Users & groups can be synced automatically or via a static list that's refreshed. 'automatic'
'static'
userMappingLocation Azure storage location of a mapping file that lists user & group associations. string

Constraints:
Min length = 1
Pattern = ^(https)|(abfss)://.*$
users List of user names that should be synced. These usernames should match the user principal name (UPN) of the respective AAD users. string[]
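
A hypothetical rangerProfile fragment assembling the admin, audit, and usersync specifications above; the UPN, database host, secret reference, and storage location are placeholders.

rangerProfile = {
  rangerAdmin = {
    admins = ["admin@contoso.com"]   # UPNs of AAD users (placeholder)
    database = {
      host              = "myserver.database.windows.net"
      name              = "rangerdb"
      username          = "rangeradmin"
      passwordSecretRef = "ranger-db-password"   # reference to the database password
    }
  }
  rangerAudit = {
    storageAccount = "https://mystorageaccount.blob.core.windows.net/rangeraudit"
  }
  rangerUsersync = {
    enabled = true
    mode    = "automatic"
  }
}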

ScalingRule

Name Description Value
actionType The action type. 'scaledown'
'scaleup' (required)
comparisonRule The comparison rule. ComparisonRule (required)
evaluationCount The evaluation count for a scaling condition: the number of times a trigger condition must be met before a scaling activity is triggered. int (required)
scalingMetric Metrics name for individual workloads. For example: cpu string (required)

Schedule

Name Description Value
count The node count anticipated at the end of the scaling operation for the current schedule configuration. int (required)
days The days on which the schedule applies for the autoscale operation. String array containing any of:
'Friday'
'Monday'
'Saturday'
'Sunday'
'Thursday'
'Tuesday'
'Wednesday' (required)
endTime The end time of the current schedule configuration, in HH:MM format (for example, 10:30). string

Constraints:
Pattern = ^([0-1]?[0-9]|2[0-3]):[0-5][0-9]$ (required)
startTime The start time of the current schedule configuration, in HH:MM format (for example, 10:30). string

Constraints:
Pattern = ^([0-1]?[0-9]|2[0-3]):[0-5][0-9]$ (required)

ScheduleBasedConfig

Name Description Value
defaultCount The default node count of the current schedule configuration: the number of nodes used when a specified scaling operation is executed (scale up/scale down). int (required)
schedules The schedules for which schedule-based autoscale is enabled; the user can set multiple rules within the schedule across days and times (start/end). Schedule[] (required)
timeZone The time zone for which the schedule is set for the schedule-based autoscale configuration. string (required)
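
A hypothetical schedule-based autoscaleProfile fragment combining Schedule and ScheduleBasedConfig, scaling up during weekday business hours; the node counts, times, and time zone are illustrative.

autoscaleProfile = {
  enabled       = true
  autoscaleType = "ScheduleBased"
  scheduleBasedConfig = {
    timeZone     = "UTC"
    defaultCount = 3            # node count outside any schedule
    schedules = [
      {
        days      = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
        startTime = "09:00"     # HH:MM
        endTime   = "18:00"
        count     = 10          # node count while the schedule is active
      }
    ]
  }
}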

ScriptActionProfile

Name Description Value
name Script name. string (required)
parameters Additional parameters for the script action. It should be a space-separated list of arguments required for script execution. string
services List of services to apply the script action. string[] (required)
shouldPersist Specify if the script should persist on the cluster. bool
timeoutInMinutes Timeout duration for the script action in minutes. int
type Type of the script action. The supported type is bash scripts. string (required)
url URL of the script file. string

Constraints:
Pattern = ^(https)|(http)://.*$ (required)
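
A hypothetical scriptActionProfiles fragment running one bash script on a service; the script name, URL, arguments, and service list are placeholders, and the exact accepted value for type is assumed here to be "bash" based on the description above.

scriptActionProfiles = [
  {
    name             = "install-deps"             # placeholder script name
    type             = "bash"                     # assumed value; bash scripts are the supported type
    url              = "https://mystorageaccount.blob.core.windows.net/scripts/install.sh"
    parameters       = "--verbose --retries 3"    # space-separated arguments
    services         = ["trino"]                  # illustrative service list
    shouldPersist    = true
    timeoutInMinutes = 10
  }
]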

SecretReference

Name Description Value
keyVaultObjectName Object identifier name of the secret in key vault. string

Constraints:
Pattern = ^[a-zA-Z][a-zA-Z0-9-]{1,126}$ (required)
referenceName Reference name of the secret to be used in service configs. string (required)
type Type of key vault object: secret, key or certificate. 'Certificate'
'Key'
'Secret' (required)
version Version of the secret in key vault. string

SecretsProfile

Name Description Value
keyVaultResourceId Name of the user Key Vault where all the cluster-specific user secrets are stored. string (required)
secrets Properties of Key Vault secret. SecretReference[]
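
A hypothetical secretsProfile fragment surfacing one Key Vault secret to service configs; the vault resource ID, secret name, and reference name are placeholders.

secretsProfile = {
  keyVaultResourceId = "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.KeyVault/vaults/<vault-name>"
  secrets = [
    {
      keyVaultObjectName = "hive-db-password"      # secret name in Key Vault
      referenceName      = "hive-db-password-ref"  # name used by service configs
      type               = "Secret"
    }
  ]
}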

SparkMetastoreSpec

Name Description Value
dbConnectionAuthenticationMode The authentication mode to connect to your Hive metastore database. More details: /azure/azure-sql/database/logins-create-manage?view=azuresql#authentication-and-authorization 'IdentityAuth'
'SqlAuth'
dbName The database name. string (required)
dbPasswordSecretName The secret name which contains the database user password. string
dbServerHost The database server host. string (required)
dbUserName The database user name. string
keyVaultId The key vault resource id. string
thriftUrl The Thrift URL. string

SparkProfile

Name Description Value
defaultStorageUrl The default storage URL. string
metastoreSpec The metastore specification for Spark cluster. SparkMetastoreSpec
userPluginsSpec Spark user plugins spec. SparkUserPlugins

SparkUserPlugin

Name Description Value
path Fully qualified path to the folder containing the plugins. string

Constraints:
Min length = 1
Pattern = ^(https)|(abfss)://.*$ (required)

SparkUserPlugins

Name Description Value
plugins Spark user plugins. SparkUserPlugin[]

SshProfile

Name Description Value
count Number of SSH pods per cluster. int

Constraints:
Min value = 0
Max value = 5 (required)

TrackedResourceTags

Name Description Value

TrinoCoordinator

Name Description Value
debug Trino debug configuration. TrinoDebugConfig
highAvailabilityEnabled The flag that indicates whether coordinator HA is enabled; when enabled, multiple coordinator replicas are used with automatic failover, one per head node. Default: true. bool

TrinoDebugConfig

Name Description Value
enable The flag that indicates whether debugging is enabled. bool
port The debug port. int
suspend The flag that indicates whether to suspend the process for debugging. bool

TrinoProfile

Name Description Value
catalogOptions Trino cluster catalog options. CatalogOptions
coordinator Trino Coordinator. TrinoCoordinator
userPluginsSpec Trino user plugins spec. TrinoUserPlugins
userTelemetrySpec User telemetry spec. TrinoUserTelemetry
worker Trino worker. TrinoWorker

TrinoTelemetryConfig

Name Description Value
hivecatalogName Hive catalog name used to mount external tables on the logs written by Trino; if not specified, these tables are not created. string

Constraints:
Min length = 1
hivecatalogSchema Schema of the above catalog to use to mount query logs as external tables; if not specified, tables are mounted under the schema trinologs. string
partitionRetentionInDays Retention period for query log table partitions; this doesn't have any effect on the actual data. int
path Azure storage location of the blobs. string

Constraints:
Min length = 1

TrinoUserPlugin

Name Description Value
enabled Denotes whether the plugin is active or not. bool
name This field maps to the sub-directory in the Trino plugins location that will contain all the plugins under path. string

Constraints:
Min length = 1
path Fully qualified path to the folder containing the plugins. string

Constraints:
Min length = 1
Pattern = ^(https)|(abfss)://.*$
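
A hypothetical trinoProfile fragment loading custom plugins from ABFS storage, per the TrinoUserPlugin constraints above; the plugin name, container, and path are placeholders.

trinoProfile = {
  userPluginsSpec = {
    plugins = [
      {
        enabled = true
        name    = "my-udfs"   # sub-directory created under the Trino plugins location
        path    = "abfss://plugins@mystorageaccount.dfs.core.windows.net/trino/my-udfs/"
      }
    ]
  }
}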

TrinoUserPlugins

Name Description Value
plugins Trino user plugins. TrinoUserPlugin[]

TrinoUserTelemetry

Name Description Value
storage Trino user telemetry definition. TrinoTelemetryConfig

TrinoWorker

Name Description Value
debug Trino debug configuration. TrinoDebugConfig