Quickstart: Create an Azure Managed Instance for Apache Cassandra cluster using Azure CLI
Azure Managed Instance for Apache Cassandra is a fully managed service for pure open-source Apache Cassandra clusters. The service also allows configurations to be overridden, depending on the specific needs of each workload, allowing maximum flexibility and control where needed.
This quickstart demonstrates how to use the Azure command line interface (CLI) commands to create a cluster with Azure Managed Instance for Apache Cassandra. It also shows how to create a datacenter, and scale nodes up or down within the datacenter.
Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see Quickstart for Bash in Azure Cloud Shell.
If you prefer to run CLI reference commands locally, install the Azure CLI. If you're running on Windows or macOS, consider running Azure CLI in a Docker container. For more information, see How to run the Azure CLI in a Docker container.
If you're using a local installation, sign in to the Azure CLI by using the az login command. To finish the authentication process, follow the steps displayed in your terminal. For other sign-in options, see Sign in with the Azure CLI.
When you're prompted, install the Azure CLI extension on first use. For more information about extensions, see Use extensions with the Azure CLI.
Run az version to find the version and dependent libraries that are installed. To upgrade to the latest version, run az upgrade.
An Azure Virtual Network with connectivity to your self-hosted or on-premises environment. For more information on connecting on-premises environments to Azure, see the Connect an on-premises network to Azure article.
If you don't have an Azure subscription, create a free account before you begin.
Important
This article requires the Azure CLI version 2.30.0 or higher. If you are using Azure Cloud Shell, the latest version is already installed.
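As a quick way to confirm the version requirement above, the following sketch compares the installed Azure CLI version against 2.30.0. The `version_ge` helper is a hypothetical name introduced here, and it assumes a `sort` that supports `-V` (GNU coreutils); it is one possible check, not part of the Azure CLI.

```shell
# Returns success when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only query the CLI when it is actually installed.
if command -v az >/dev/null 2>&1; then
  installed="$(az version --query '"azure-cli"' --output tsv)"
  if version_ge "$installed" "2.30.0"; then
    echo "Azure CLI $installed meets the 2.30.0 minimum."
  else
    echo "Azure CLI $installed is older than 2.30.0; run 'az upgrade'."
  fi
fi
```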
Create a managed instance cluster
Sign in to the Azure portal
Set your subscription ID in Azure CLI:
az account set -s <Subscription_ID>
Next, create a Virtual Network with a dedicated subnet in your resource group:
az network vnet create -n <VNet_Name> -l eastus2 -g <Resource_Group_Name> --subnet-name <Subnet_Name>
Note
Deploying Azure Managed Instance for Apache Cassandra requires internet access. Deployment fails in environments where internet access is restricted. Make sure you are not blocking access within your virtual network (VNet) to the following Azure services, which Managed Cassandra requires to work properly:
- Azure Storage
- Azure Key Vault
- Azure Virtual Machine Scale Sets (VMSS)
- Azure Monitoring
- Microsoft Entra ID
- Azure Security
Apply the following permissions to the Virtual Network; the managed instance requires them. Use the az role assignment create command, replacing <subscriptionID>, <resourceGroupName>, and <vnetName> with the appropriate values:
az role assignment create \
  --assignee a232010e-820c-4083-83bb-3ace5fc29d0b \
  --role 4d97b98b-1d4f-4787-a291-c67834d212e7 \
  --scope /subscriptions/<subscriptionID>/resourceGroups/<resourceGroupName>/providers/Microsoft.Network/virtualNetworks/<vnetName>
Note
The assignee and role values are fixed. Enter them exactly as shown in the command; otherwise, cluster creation fails. If this command returns an error, you might not have permission to run it. Contact your Azure administrator for the required permissions.
Next, create the cluster in your newly created Virtual Network by using the az managed-cassandra cluster create command. Before you run the following command, set the value of the delegatedManagementSubnetId variable:
Note
The value of delegatedManagementSubnetId must reference a subnet in the same VNet to which the permissions were applied.
resourceGroupName='<Resource_Group_Name>'
clusterName='<Cluster_Name>'
location='eastus2'
delegatedManagementSubnetId='/subscriptions/<subscription ID>/resourceGroups/<resource group name>/providers/Microsoft.Network/virtualNetworks/<VNet name>/subnets/<subnet name>'
initialCassandraAdminPassword='myPassword'
cassandraVersion='3.11' # set to 4.0 for a Cassandra 4.0 cluster
az managed-cassandra cluster create \
  --cluster-name "$clusterName" \
  --resource-group "$resourceGroupName" \
  --location "$location" \
  --delegated-management-subnet-id "$delegatedManagementSubnetId" \
  --initial-cassandra-admin-password "$initialCassandraAdminPassword" \
  --cassandra-version "$cassandraVersion" \
  --debug
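Because the subnet resource ID is long and easy to mistype, it can help to assemble it from its parts. The following is a minimal sketch; `subnet_resource_id` is a hypothetical helper name, and the values passed to it are placeholders rather than real resources.

```shell
# Compose the resource ID of a subnet from its parts.
# Arguments: subscription ID, resource group, VNet name, subnet name.
subnet_resource_id() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Network/virtualNetworks/%s/subnets/%s\n' \
    "$1" "$2" "$3" "$4"
}

# Example with placeholder values (hypothetical names, not real resources).
delegatedManagementSubnetId="$(subnet_resource_id \
  '00000000-0000-0000-0000-000000000000' 'my-rg' 'my-vnet' 'my-subnet')"
echo "$delegatedManagementSubnetId"
```

If the subnet already exists, running az network vnet subnet show with --query id --output tsv should return the same ID without any manual assembly.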
Create a datacenter for the cluster with three virtual machines, using the following configuration:
VM size: Standard E8s v5
Data disks: 4 P30 disks attached to each deployed virtual machine.
With everything in place, use the az managed-cassandra datacenter create command:
dataCenterName='dc1'
dataCenterLocation='eastus2'
virtualMachineSKU='Standard_E8s_v5'
noOfDisksPerNode=4
az managed-cassandra datacenter create \
  --resource-group "$resourceGroupName" \
  --cluster-name "$clusterName" \
  --data-center-name "$dataCenterName" \
  --data-center-location "$dataCenterLocation" \
  --delegated-subnet-id "$delegatedManagementSubnetId" \
  --node-count 3 \
  --sku "$virtualMachineSKU" \
  --disk-capacity "$noOfDisksPerNode" \
  --availability-zone false
Note
The value for --sku can be chosen from the following available VM sizes:
- Standard_E8s_v5
- Standard_E16s_v5
- Standard_E20s_v5
- Standard_E32s_v5
By default, --availability-zone is set to false. To enable availability zones, set it to true. Availability zones help increase the availability of the service. For more details, review the full service-level agreement (SLA) details here.
Warning
Availability zones are not supported in all Azure regions. Deployments fail if you select a region where availability zones are not supported. See here for supported regions. The successful deployment of availability zones is subject to the availability of compute resources in all of the zones in the selected region. Deployments fail if the virtual machine size you choose is not available in the selected region.
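Since a deployment fails when the VM size is not accepted, a small pre-flight check against the VM sizes listed in the note above can catch typos before the datacenter create command runs. This is only a sketch: `is_supported_sku` is a hypothetical helper, and the hard-coded list mirrors this article and may not reflect what the service currently offers.

```shell
# VM sizes copied from the note in this article; the list may change over time.
supported_skus="Standard_E8s_v5 Standard_E16s_v5 Standard_E20s_v5 Standard_E32s_v5"

# Returns success when the given SKU appears in the list.
is_supported_sku() {
  for sku in $supported_skus; do
    [ "$sku" = "$1" ] && return 0
  done
  return 1
}

virtualMachineSKU='Standard_E8s_v5'
if is_supported_sku "$virtualMachineSKU"; then
  echo "$virtualMachineSKU is in the supported list."
else
  echo "$virtualMachineSKU is not in the supported list." >&2
fi
```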
Once the datacenter is created, you can run the az managed-cassandra datacenter update command to scale your cluster up or down. Change the value of the node-count parameter to the desired value:
resourceGroupName='<Resource_Group_Name>'
clusterName='<Cluster_Name>'
dataCenterName='dc1'
dataCenterLocation='eastus2'
az managed-cassandra datacenter update \
  --resource-group "$resourceGroupName" \
  --cluster-name "$clusterName" \
  --data-center-name "$dataCenterName" \
  --node-count 9
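Scaling can take a while, so a script may want to wait for the operation to finish before continuing. The sketch below polls with az managed-cassandra datacenter show. Both `wait_for_value` and `datacenter_state` are hypothetical helpers, and the query path properties.provisioningState and the value Succeeded are assumptions about the command's output, so verify them against your own output first.

```shell
# Poll a command until it prints the expected value or attempts run out.
# Arguments: expected value, max attempts, delay in seconds, then the command.
wait_for_value() {
  expected="$1"; attempts="$2"; delay="$3"
  shift 3
  i=0
  while [ "$i" -lt "$attempts" ]; do
    current="$("$@" 2>/dev/null)"
    if [ "$current" = "$expected" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Hypothetical helper: prints the datacenter provisioning state.
# Assumes the state is exposed at properties.provisioningState.
datacenter_state() {
  az managed-cassandra datacenter show \
    --resource-group "$resourceGroupName" \
    --cluster-name "$clusterName" \
    --data-center-name "$dataCenterName" \
    --query 'properties.provisioningState' \
    --output tsv
}

# Usage (uncomment once the variables are set): check every 30 seconds, up to 60 times.
# wait_for_value Succeeded 60 30 datacenter_state && echo "Scaling finished."
```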
Connect to your cluster
Azure Managed Instance for Apache Cassandra does not create nodes with public IP addresses. To connect to your new Cassandra cluster, you must create another resource inside the same virtual network. This resource can be an application, or a virtual machine (VM) with Apache's open-source query tool CQLSH installed.
You can use a Resource Manager template to deploy an Ubuntu virtual machine.
Note
Because of known issues with some versions of Python, we recommend using an Ubuntu 22.04 image, which comes with Python 3.10.12, or running CQLSH in a Python virtual environment.
Connecting from CQLSH
After the virtual machine is deployed, use SSH to connect to the machine and install CQLSH as shown in the following commands:
# Install OpenJDK 8 (JDK and JRE)
sudo apt update
sudo apt install openjdk-8-jdk openjdk-8-jre
Check which versions of Cassandra are still supported and pick the version you need. Stable versions are recommended.
Install the Cassandra libraries to get CQLSH by following the official steps in the Cassandra documentation.
Connect by using cqlsh, as described in the documentation.
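Connections to the service use SSL with certificate verification disabled, as this article describes for client drivers, so a cqlsh session might be started along the following lines. This is a sketch under several assumptions: `connect_cqlsh` is a hypothetical wrapper, the admin user name cassandra and port 9042 are assumed defaults, and the SSL_VERSION value may need adjusting or removing depending on your cqlsh version.

```shell
# cqlsh reads these environment variables for its SSL handling.
export SSL_VERSION=TLSv1_2   # assumed TLS version; adjust if your cqlsh rejects it
export SSL_VALIDATE=false    # skip certificate verification

# Hypothetical wrapper: connect to one node with the admin credentials.
# Arguments: node IP address, admin password.
connect_cqlsh() {
  cqlsh "$1" 9042 -u cassandra -p "$2" --ssl
}

# Usage (replace with a node IP and the password set at cluster creation):
# connect_cqlsh '<Node_IP>' '<Initial_Admin_Password>'
```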
Connecting from an application
As with CQLSH, connecting from an application using one of the supported Apache Cassandra client drivers requires SSL encryption to be enabled and certificate verification to be disabled. See samples for connecting to Azure Managed Instance for Apache Cassandra using Java, .NET, Node.js and Python.
Disabling certificate verification is recommended because certificate verification does not work unless you map the IP addresses of your cluster nodes to the appropriate domain. If an internal policy mandates SSL certificate verification for every application, you can satisfy it by adding entries like 10.0.1.5 host1.managedcassandra.cosmos.azure.com to your hosts file for each node. If you take this approach, you also need to add new entries whenever you scale up nodes.
For Java, we highly recommend enabling speculative execution policy where applications are sensitive to tail latency. You can find a demo illustrating how this works and how to enable the policy here.
Note
Configuring certificates, rootCA, nodes, clients, or truststores is generally unnecessary for connecting to Azure Managed Instance for Apache Cassandra. SSL encryption uses the default truststore and the client's chosen runtime password (see sample code for Java, .NET, Node.js, and Python). Certificates are trusted by default; if not, add them to the truststore.
Configuring client certificates (optional)
Configuring client certificates is optional. A client application can connect to Azure Managed Instance for Apache Cassandra as long as the above steps are followed. If preferred, you can also create and configure client certificates for authentication. In general, there are two ways of creating certificates:
Self-signed certificates: These involve private and public certificates (no CA) for each node. In this case, all public certificates are required.
Certificates signed by a CA: These can be issued by a self-signed CA or a public CA. For this setup, you need the root CA certificate (see instructions on preparing SSL certificates for production) and all intermediary certificates (if applicable).
To implement client-to-node certificate authentication or mutual Transport Layer Security (mTLS), provide the certificates via Azure CLI. The following command uploads and applies your client certificates to the truststore for your Cassandra Managed Instance cluster (no need to modify cassandra.yaml settings). Once applied, the cluster requires Cassandra to verify certificates during client connections (see require_client_auth: true in Cassandra client_encryption_options).
resourceGroupName='<Resource_Group_Name>'
clusterName='<Cluster Name>'
az managed-cassandra cluster update \
--resource-group $resourceGroupName \
--cluster-name $clusterName \
--client-certificates /usr/csuser/clouddrive/rootCert.pem /usr/csuser/clouddrive/intermediateCert.pem
Troubleshooting
If you encounter an error when applying permissions to your Virtual Network using Azure CLI, such as Cannot find user or service principal in graph database for 'e5007d2c-4b13-4a74-9b6a-605d99f03501', you can apply the same permission manually from the Azure portal. Learn how to do this here.
Note
The Azure Cosmos DB role assignment is used for deployment purposes only. Azure Managed Instance for Apache Cassandra has no backend dependencies on Azure Cosmos DB.
Clean up resources
When no longer needed, you can use the az group delete command to remove the resource group, the managed instance, and all related resources:
az group delete --name <Resource_Group_Name>
Next steps
In this quickstart, you learned how to create an Azure Managed Instance for Apache Cassandra cluster using Azure CLI. You can now start working with the cluster: