Tutorial: Migrate MongoDB to Azure Cosmos DB for MongoDB RU online using Azure Database Migration Service
APPLIES TO: MongoDB
Important
Migrations to Azure Cosmos DB for MongoDB vCore can be performed using the MongoDB migration extension for Azure Data Studio. This extension leverages the Azure Database Migration Service in the background. Note that migrations to Azure Cosmos DB for MongoDB vCore cannot be done using the Database Migration Service on the Azure portal.
This MongoDB migration guide is part of a series on MongoDB migration. The critical MongoDB migration steps are pre-migration, migration, and post-migration.
Overview of online data migration from MongoDB to Azure Cosmos DB using DMS
You can use Azure Database Migration Service to perform an online (minimal downtime) migration of databases from an on-premises or cloud instance of MongoDB to Azure Cosmos DB for MongoDB.
This tutorial demonstrates the steps associated with using Azure Database Migration Service to migrate MongoDB data to Azure Cosmos DB:
- Create an instance of Azure Database Migration Service.
- Create a migration project.
- Specify the source.
- Specify the target.
- Map to target databases.
- Run the migration.
- Monitor the migration.
- Verify data in Azure Cosmos DB.
- Complete the migration when you're ready.
In this tutorial, you migrate a dataset in MongoDB hosted in an Azure virtual machine to Azure Cosmos DB for MongoDB with minimal downtime via Azure Database Migration Service. If you don't have a MongoDB source set up already, see Install and configure MongoDB on a Windows VM in Azure.
Using Azure Database Migration Service to perform an online migration requires creating an instance based on the Premium pricing tier.
Important
For an optimal migration experience, Microsoft recommends creating an instance of Azure Database Migration Service in the same Azure region as the target database. Moving data across regions or geographies can slow down the migration process.
Tip
In Azure Database Migration Service, you can migrate your databases offline or while they are online. In an offline migration, application downtime starts when the migration starts. To limit downtime to the time it takes you to cut over to the new environment after the migration, use an online migration. We recommend that you test an offline migration to determine whether the downtime is acceptable. If the expected downtime isn't acceptable, do an online migration.
This article describes an online migration from MongoDB to Azure Cosmos DB for MongoDB. For an offline migration, see Tutorial: Migrate MongoDB to Azure Cosmos DB for MongoDB RU offline using Azure Database Migration Service.
Prerequisites
To complete this tutorial, you need to:
Complete the pre-migration steps such as estimating throughput, choosing a partition key, and the indexing policy.
Create an Azure Cosmos DB for MongoDB account and ensure Prevent rate-limiting errors for Azure Cosmos DB for MongoDB operations is enabled.
Note
DMS is currently not supported if you're migrating to an Azure Cosmos DB for MongoDB account provisioned with serverless mode.
Create a Microsoft Azure Virtual Network for Azure Database Migration Service by using the Azure Resource Manager deployment model, which provides site-to-site connectivity to your on-premises source servers by using either ExpressRoute or VPN.
During virtual network setup, if you use ExpressRoute with network peering to Microsoft, add the following service endpoints to the subnet in which the service will be provisioned:
- Target database endpoint (for example, SQL endpoint, Azure Cosmos DB endpoint, and so on)
- Storage endpoint
- Service bus endpoint
This configuration is necessary because Azure Database Migration Service lacks internet connectivity.
Ensure that your virtual network Network Security Group (NSG) rules don't block the following communication ports: 53, 443, 445, 9354, and 10000-20000. For more detail on virtual network NSG traffic filtering, see the article Filter network traffic with network security groups.
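As a rough aid for reviewing NSG rules against the port list above, the following sketch reports which required ports or ranges overlap a set of denied port specifications. The input format (strings such as "445" or "10000-15000") and the function names are assumptions made for illustration; the sketch doesn't read real NSG rules and doesn't handle wildcards such as "*".

```python
# Ports that must stay open for Azure Database Migration Service,
# taken from the list above, as inclusive (low, high) ranges.
REQUIRED_PORTS = [(53, 53), (443, 443), (445, 445), (9354, 9354), (10000, 20000)]


def parse_port_range(spec: str) -> tuple[int, int]:
    """Parse "443" or "10000-20000" into an inclusive (low, high) pair."""
    low, _, high = spec.partition("-")
    return int(low), int(high or low)


def blocked_required_ports(deny_specs: list[str]) -> list[tuple[int, int]]:
    """Return the required ranges that overlap at least one denied spec."""
    blocked = []
    for req_low, req_high in REQUIRED_PORTS:
        for spec in deny_specs:
            low, high = parse_port_range(spec)
            # Two inclusive ranges overlap when each starts before the other ends.
            if low <= req_high and req_low <= high:
                blocked.append((req_low, req_high))
                break
    return blocked
```

For example, passing the denied specs "445" and "15000-16000" flags the 445 entry and the 10000-20000 range for review.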
Open your Windows Firewall to allow Azure Database Migration Service to access the source MongoDB server, which by default is TCP port 27017.
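To confirm that the firewall change took effect, you can attempt a plain TCP connection to the MongoDB port. The following is a minimal sketch that uses only the Python standard library; the function name is illustrative, and the check is only meaningful when it runs from a machine that reaches the source over the same network path as Azure Database Migration Service. It tests TCP reachability only, not MongoDB authentication.

```python
import socket


def can_reach(host: str, port: int = 27017, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A False result usually points to a firewall rule, an NSG rule, or a MongoDB instance that isn't listening on the expected address.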
When using a firewall appliance in front of your source databases, you might need to add firewall rules to allow Azure Database Migration Service to access the source databases for migration.
Configure Azure Cosmos DB Server Side Retries for efficient migration
Customers migrating from MongoDB to Azure Cosmos DB benefit from resource governance capabilities, which guarantee the ability to fully utilize the provisioned RU/s of throughput. Azure Cosmos DB might throttle a given Database Migration Service request during migration if that request exceeds the container's provisioned RU/s; that request then needs to be retried. Database Migration Service can perform retries, but the round-trip time of the network hop between Database Migration Service and Azure Cosmos DB adds to the overall response time of that request. Improving response time for throttled requests can shorten the total time needed for migration. The Server Side Retry feature of Azure Cosmos DB allows the service to intercept throttle error codes and retry with a lower round-trip time, which shortens response times for throttled requests.
You can find the Server Side Retry capability in the Features blade of the Azure Cosmos DB portal.
If it's Disabled, we recommend that you enable it.
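To make the round-trip argument concrete, here is a back-of-the-envelope sketch. It models retry overhead as throttled requests multiplied by retries per request and by the round-trip time of each retry. The model and all numbers in it are illustrative assumptions, not measurements of Azure Database Migration Service or Azure Cosmos DB.

```python
def retry_overhead_ms(throttled_requests: int,
                      retries_per_request: int,
                      round_trip_ms: float) -> float:
    """Estimate the extra time spent on retries, in milliseconds."""
    return throttled_requests * retries_per_request * round_trip_ms


if __name__ == "__main__":
    # Hypothetical workload: 1,000 throttled requests, 2 retries each.
    # Hypothetical round trips: 20 ms for a client-side retry across the
    # network hop, 2 ms for a retry handled inside the service.
    client_side = retry_overhead_ms(1000, 2, 20.0)
    server_side = retry_overhead_ms(1000, 2, 2.0)
    print(f"client-side retries: {client_side:.0f} ms")
    print(f"server-side retries: {server_side:.0f} ms")
```

Under this simple model, the overhead scales linearly with the round-trip time of each retry, which is why retrying closer to the data reduces the time lost to throttling.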
Register the resource provider
Register the Microsoft.DataMigration resource provider before you create your first instance of the Database Migration Service.
Sign in to the Azure portal. Search for and select Subscriptions.
Select the subscription in which you want to create the instance of Azure Database Migration Service, and then select Resource providers.
Search for migration, and then select Register for Microsoft.DataMigration.
Create an instance
In the Azure portal, select + Create a resource, search for Azure Database Migration Service, and then select Azure Database Migration Service from the dropdown list.
On the Azure Database Migration Service screen, select Create.
On the Create Migration Service screen, specify a name for the service, the subscription, and a new or existing resource group.
Select the location in which you want to create the instance of Azure Database Migration Service.
Select an existing virtual network, or create a new one.
The virtual network provides Azure Database Migration Service with access to the source MongoDB instance and the target Azure Cosmos DB account.
For more information about how to create a virtual network in the Azure portal, see the article Create a virtual network using the Azure portal.
Select a SKU from the Premium pricing tier.
Note
Online migrations are supported only when using the Premium tier. For more information on costs and pricing tiers, see the pricing page.
Select Create to create the service.
Create a migration project
After the service is created, locate it within the Azure portal, open it, and then create a new migration project.
In the Azure portal, select All services, search for Azure Database Migration Service, and then select Azure Database Migration Services.
On the Azure Database Migration Services screen, search for the name of the Azure Database Migration Service instance that you created, and then select the instance.
Alternatively, you can find the Azure Database Migration Service instance from the search pane in the Azure portal.
Select + New Migration Project.
On the New migration project screen, specify a name for the project. In the Source server type text box, select MongoDB; in the Target server type text box, select Azure Cosmos DB for MongoDB; and then for Choose type of activity, select Online data migration [preview].
Select Save, and then select Create and run activity to create the project and run the migration activity.
Specify source details
On the Source details screen, specify the connection details for the source MongoDB server.
Important
Azure Database Migration Service doesn't support Azure Cosmos DB as a source.
There are three modes to connect to a source:
Standard mode, which accepts a fully qualified domain name or an IP address, Port number, and connection credentials.
Connection string mode, which accepts a MongoDB Connection string as described in the article Connection String URI Format.
Data from Azure storage, which accepts a blob container SAS URL. Select Blob contains BSON dumps if the blob container has BSON dumps produced by the MongoDB bsondump tool, and de-select it if the container contains JSON files.
If you select this option, be sure that the storage account connection string appears in the format:
https://blobnameurl/container?SASKEY
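Before pasting the value into the portal, you can sanity-check its shape. The sketch below checks for an https scheme, a host, a container segment in the path, and a `sig` query parameter (the signature field of a SAS token). The function name is an assumption for illustration, and passing this check doesn't prove that the token is valid or unexpired.

```python
from urllib.parse import parse_qs, urlsplit


def looks_like_container_sas_url(url: str) -> bool:
    """Check that a URL has the shape https://<account host>/<container>?<SAS token>."""
    parts = urlsplit(url)
    # At least one non-empty path segment is needed for the container name.
    has_container = any(segment for segment in parts.path.split("/"))
    # A SAS token carries its signature in the "sig" query parameter.
    has_signature = "sig" in parse_qs(parts.query)
    return (parts.scheme == "https"
            and bool(parts.netloc)
            and has_container
            and has_signature)
```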
Also, based on the type of dump information in Azure Storage, keep the following details in mind.
For BSON dumps, the data within the blob container must be in bsondump format, such that data files are placed into folders named after the containing databases in the format collection.bson. Metadata files (if any) should be named using the format collection.metadata.json.
For JSON dumps, the files in the blob container must be placed into folders named after the containing databases. Within each database folder, data files must be placed in a subfolder called "data" and named using the format collection.json. Metadata files (if any) must be placed in a subfolder called "metadata" and named using the same format, collection.json. The metadata files must be in the same format as produced by the MongoDB bsondump tool.
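The two folder layouts described above can be summarized in a short sketch that builds the expected blob paths for one collection. The function name and the example database and collection names are illustrative; the path patterns follow the description above.

```python
def expected_blob_paths(database: str, collection: str, bson: bool) -> dict[str, str]:
    """Return the expected data and metadata blob paths for one collection."""
    if bson:
        # BSON dumps: files sit directly in the database folder.
        return {
            "data": f"{database}/{collection}.bson",
            "metadata": f"{database}/{collection}.metadata.json",
        }
    # JSON dumps: files sit in "data" and "metadata" subfolders.
    return {
        "data": f"{database}/data/{collection}.json",
        "metadata": f"{database}/metadata/{collection}.json",
    }
```

For a hypothetical database named `inventory` with a collection named `orders`, the BSON layout expects `inventory/orders.bson`, while the JSON layout expects `inventory/data/orders.json`.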
Important
It's discouraged to use a self-signed certificate on the MongoDB server. However, if one is used, connect to the server by using connection string mode and ensure that your connection string includes the following parameter:
&sslVerifyCertificate=false
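As a minimal sketch, the following helper appends that parameter to an existing connection string. It uses plain string handling and assumes that any special characters in the credentials are already percent-encoded; the function name and the example hosts are hypothetical.

```python
PARAM = "sslVerifyCertificate=false"


def disable_certificate_verification(conn_str: str) -> str:
    """Append sslVerifyCertificate=false to a MongoDB connection string."""
    if PARAM in conn_str:
        return conn_str  # Already present; nothing to do.
    if "?" in conn_str:
        # Options already exist, so add one more with "&".
        return f"{conn_str}&{PARAM}"
    # No options yet: make sure the "/" that precedes the options exists.
    after_scheme = conn_str.split("://", 1)[-1]
    slash = "" if "/" in after_scheme else "/"
    return f"{conn_str}{slash}?{PARAM}"
```

Use this only for servers with a self-signed certificate, because disabling verification removes protection against impersonation of the server.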
You can use the IP Address for situations in which DNS name resolution isn't possible.
Select Save.
The Source server address should be the address of the primary if the source is a replica set, and the router if the source is a sharded MongoDB cluster. For a sharded MongoDB cluster, Azure Database Migration Service must be able to connect to the individual shards in the cluster, which might require opening the firewall on more machines.
Specify target details
On the Migration target details screen, specify the connection details for the target Azure Cosmos DB account, which is the pre-provisioned Azure Cosmos DB for MongoDB account to which you're migrating your MongoDB data.
Select Save.
Map to target databases
On the Map to target databases screen, map the source and the target database for migration.
If the target database contains the same database name as the source database, Azure Database Migration Service selects the target database by default.
If the string Create appears next to the database name, it indicates that Azure Database Migration Service didn't find the target database, and the service will create the database for you.
At this point in the migration, if you want to share throughput on the database, specify the throughput in RUs. In Azure Cosmos DB, you can provision throughput either at the database level or individually for each collection. Throughput in Azure Cosmos DB is measured in Request Units (RUs). Learn more about Azure Cosmos DB pricing.
Select Save.
On the Collection setting screen, expand the collections listing, and then review the list of collections that will be migrated.
Azure Database Migration Service automatically selects all the collections that exist on the source MongoDB instance but don't exist on the target Azure Cosmos DB account. If you want to remigrate collections that already include data, you need to explicitly select the collections on this screen.
You can specify the number of RUs that you want the collections to use. In most cases, a value between 500 (1000 minimum for sharded collections) and 4000 should suffice. Azure Database Migration Service suggests smart defaults based on the collection size.
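The guidance above can be expressed as a small check. This sketch only encodes the numbers stated in this step (500 to 4000, with 1000 as the lower bound for sharded collections); it isn't the logic that Azure Database Migration Service uses for its defaults, and the function name and labels are illustrative.

```python
def classify_collection_rus(rus: int, sharded: bool) -> str:
    """Classify an RU value against the guidance in this step."""
    lower = 1000 if sharded else 500
    if rus < lower:
        return "too-low"
    if rus > 4000:
        return "above-typical"
    return "typical"
```

A value labeled "above-typical" isn't wrong; it means that the collection is provisioned beyond what the guidance expects most migrations to need.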
If necessary, perform the database and collection migration in parallel by using multiple instances of Azure Database Migration Service to speed up the run.
You can also specify a shard key to take advantage of partitioning in Azure Cosmos DB for optimal scalability. Be sure to review the best practices for selecting a shard/partition key. If you don't have a partition key, you can always use _id as the shard key for better throughput.
Select Save.
On the Migration summary screen, in the Activity name text box, specify a name for the migration activity.
Run the migration
Select Run migration.
The migration activity window appears, and the Status of the activity is displayed.
Monitor the migration
On the migration activity screen, select Refresh to update the display until the Status of the migration shows as Replaying.
Note
You can select the Activity to get details of database- and collection-level migration metrics.
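If you script the monitoring step instead of selecting Refresh manually, the loop looks roughly like the following sketch. The `get_status` callable is hypothetical: it stands in for whatever mechanism you use to read the activity status, and this sketch doesn't call any Azure API. Only the Replaying status name comes from this tutorial.

```python
import time
from typing import Callable


def wait_for_status(get_status: Callable[[], str],
                    target: str = "Replaying",
                    interval_s: float = 30.0,
                    max_polls: int = 120,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll get_status until it returns target; give up after max_polls checks."""
    for _ in range(max_polls):
        if get_status() == target:
            return True
        sleep(interval_s)  # Wait before the next check.
    return False
```

The `sleep` parameter exists so that the wait can be replaced in tests; in normal use, leave it at its default.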
Verify data in Azure Cosmos DB
Make changes to your source MongoDB database.
Connect to Azure Cosmos DB to verify that the data is replicated from the source MongoDB server.
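One simple way to verify replication is to compare document counts per collection on both sides. The sketch below assumes that you have already collected the counts into dictionaries, for example with the `count_documents({})` method of a PyMongo collection; the function name and the example data are illustrative. Matching counts are a quick signal, not proof that document contents are identical.

```python
from typing import Optional


def count_mismatches(source_counts: dict[str, int],
                     target_counts: dict[str, int]) -> dict[str, tuple[int, Optional[int]]]:
    """Return collections whose target count differs from the source count.

    Each value is (source_count, target_count); target_count is None
    when the collection is missing on the target.
    """
    mismatches = {}
    for name, source_count in source_counts.items():
        target_count = target_counts.get(name)
        if target_count != source_count:
            mismatches[name] = (source_count, target_count)
    return mismatches
```

During an online migration, small differences can appear while pending changes are still being replayed, so repeat the comparison after writes to the source have stopped.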
Complete the migration
After all documents from the source are available on the Azure Cosmos DB target, select Finish from the migration activity's context menu to complete the migration.
This action will finish replaying all the pending changes and complete the migration.
Post-migration optimization
After you migrate the data stored in a MongoDB database to Azure Cosmos DB for MongoDB, you can connect to Azure Cosmos DB and manage the data. You can also perform other post-migration optimization steps, such as optimizing the indexing policy, updating the default consistency level, or configuring global distribution for your Azure Cosmos DB account. For more information, see the Post-migration optimization article.
Additional resources
Trying to do capacity planning for a migration to Azure Cosmos DB?
- If all you know is the number of vCores and servers in your existing database cluster, read about estimating request units using vCores or vCPUs.
- If you know typical request rates for your current database workload, read about estimating request units using Azure Cosmos DB capacity planner.