Explore Azure Cosmos DB - document API
1 Introduction
This article introduces the Azure Cosmos DB document API with MongoDB. It focuses on NoSQL, that is, non-relational databases. We first explore MongoDB, a document database, and then move on to Azure Cosmos DB, Microsoft's web-scale database service.
2 What is NoSQL
Databases fall broadly into two groups: SQL (relational) and NoSQL (non-relational). NoSQL is usually read as "not only SQL".
The two technologies differ in what type of data they store, how they store it, and how we use it. A relational database is structured like a phone book that stores phone numbers. A non-relational database is document based and distributed, like a set of folders that hold everything from phone numbers to images.
In a relational database, data is organized into tables, and each table has columns and rows. A non-relational data model has no table model; instead, related data can be stored together in a single document.
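To make the difference between the two models concrete, here is a minimal sketch in plain JavaScript. The sample data and the personWithPhones helper are hypothetical and only illustrate the two shapes; no real database is involved.

```javascript
// Relational shape: two "tables" linked by a key (hypothetical sample data).
const people = [{ id: 1, name: "Asha" }];
const phones = [
  { personId: 1, type: "home", number: "555-0100" },
  { personId: 1, type: "work", number: "555-0101" },
];

// Reading one person together with their phones needs a join-like lookup.
function personWithPhones(id) {
  const person = people.find((p) => p.id === id);
  const ownPhones = phones
    .filter((ph) => ph.personId === id)
    .map(({ type, number }) => ({ type, number }));
  return { ...person, phones: ownPhones };
}

// Document shape: the same information stored as a single nested document.
const personDocument = {
  id: 1,
  name: "Asha",
  phones: [
    { type: "home", number: "555-0100" },
    { type: "work", number: "555-0101" },
  ],
};

// Both shapes carry the same information once the rows are joined.
console.log(JSON.stringify(personWithPhones(1)) === JSON.stringify(personDocument));
```

With the document shape, the application reads and writes the whole record in one step, which is the property the rest of this article relies on.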
Relational databases have dominated the software industry for a long time, but non-relational databases are now taking over many workloads.
An important idea associated with NoSQL is polyglot persistence: an application can use more than one storage mechanism, each chosen to fit a particular storage need.
NoSQL databases are designed to scale out, so they can serve from thousands to millions of users.
Most NoSQL databases are open source. You can download them and set up a test environment.
NoSQL databases are schema free: there is no predefined structure that every record must follow.
Replication is easy to set up. When you have more customers on web and mobile, availability becomes a major concern. If a relational database is deployed to a single physical server and that server fails, the database becomes unavailable. NoSQL databases partition and distribute data across multiple database instances, and the data can be replicated to one or more instances for high availability. With some relational products, such as Oracle, replication typically requires additional software, whereas many NoSQL databases have it built in.
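The partitioning and replication idea can be sketched in a few lines of plain JavaScript. Everything here is an assumption made for illustration: the three in-memory Map objects stand in for database instances, and hashKey, ownersOf, put, and get are hypothetical helpers, not the algorithm of any particular product.

```javascript
// Hypothetical in-memory "instances"; a real database spreads these over servers.
const instances = [new Map(), new Map(), new Map()];
const REPLICAS = 2; // each record is written to 2 of the 3 instances

// Simple deterministic string hash (illustrative only).
function hashKey(key) {
  let h = 0;
  for (const ch of key) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h;
}

// The instances that hold a key: its home partition plus the next one(s).
function ownersOf(key) {
  const first = hashKey(key) % instances.length;
  return Array.from({ length: REPLICAS }, (_, i) => (first + i) % instances.length);
}

// Write the value to every instance that owns the key.
function put(key, value) {
  for (const i of ownersOf(key)) instances[i].set(key, value);
}

// Read from the first owner that is still up; `down` simulates failed instances.
function get(key, down = new Set()) {
  for (const i of ownersOf(key)) {
    if (!down.has(i)) return instances[i].get(key);
  }
  return undefined; // every replica of this key is unavailable
}

put("customer:42", { name: "Asha" });
const [primary] = ownersOf("customer:42");
// The record is still readable from its replica when the first owner is down.
console.log(get("customer:42", new Set([primary])));
```

The record only becomes unavailable when every instance that holds a copy is down, which is why replication improves availability compared with a single server.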
NoSQL databases have simple APIs. Relational databases can be used from Java, PHP, or Python, but only through SQL and a driver or mapping layer that translates between objects and tables. Many NoSQL databases expose an API that works directly with the objects or JSON the application already uses, so you don't need to understand the underlying architecture of the system.
In an application backed by a relational database, the relational data structure differs from the in-memory data structure of the application. With a NoSQL database, developers don't need to convert the in-memory structure into a relational structure. Storage needs also differ by application: an ERP application has different storage needs from Facebook or Twitter.
NoSQL databases support huge amounts of data. Modern web-scale applications collect millions of records from millions of customers, which makes NoSQL database technology a good fit.
3 Types of NoSQL databases
NoSQL database technology comes in several types:
- Key-value databases: The least complex option. Data is stored as an indexed key and a value, where the value is a blob. Access is by primary key, which gives great performance and scalability. Example: Azure Table storage.
- Column store databases: Store data tables by column rather than by row. Examples: HBase (on Hadoop), Cassandra.
- Document databases: Great for managing data that is document oriented but still somewhat structured. Examples: MongoDB, CouchDB, RavenDB. Documents can be in XML, JSON, or BSON format. A document is a hierarchical, tree-like data structure that consists of maps, collections, and scalar values. Compared with a key-value store, the document plays the role of the value.
- Graph databases: Hold interconnected data that can be represented as a graph; such data sets are high in complexity. Examples: ArangoDB, OrientDB.
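The difference between the first and third types in the list above can be sketched in plain JavaScript. The keyValueStore and documentStore names and the sample movies are hypothetical stand-ins, not real database clients.

```javascript
// Key-value store: lookups go through the key; the value is an opaque blob.
const keyValueStore = new Map();
keyValueStore.set(
  "movie:1",
  JSON.stringify({ Title: "Transformers", Genre: "science fiction" })
);
const blob = keyValueStore.get("movie:1"); // the store cannot query inside this string

// Document store: the database understands the fields, so it can filter on them.
const documentStore = [
  { _id: 1, Title: "Transformers", Genre: "science fiction" },
  { _id: 2, Title: "The Mummy", Genre: "Fantasy/Thriller" },
];
const scienceFiction = documentStore.filter((doc) => doc.Genre === "science fiction");

console.log(typeof blob, scienceFiction.map((doc) => doc.Title));
```

In the key-value case the application has to parse the blob itself; in the document case the query runs against the fields of the stored document.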
4 Why we should use a NoSQL database
Let's see why we might choose a NoSQL database.
Programmer productivity improves when the database technology matches the needs of the application.
When you need better performance over a large volume of data, a NoSQL database is a good option. As you gain more customers, you can reduce the latency of data processing and improve throughput, that is, the number of transactions per second.
Most NoSQL databases are open source, so you can download one and check whether it gives you the programmer productivity and performance you need.
You can adopt polyglot persistence: break your solution into separate services, use a NoSQL database for a specific service, and use a relational database for the others.
When you want a highly scalable and very responsive data layer, go for a NoSQL database.
NoSQL databases are faster for many types of operations. If you have a high volume of data, NoSQL is usually the way to go, as long as your data doesn't have loads of interconnected relationships.
5 NoSQL: Myths
- Myth: NoSQL will replace SQL. In practice, it is an alternative for specific requirements.
- Myth: NoSQL is better (or worse) than SQL. Some projects need a relational database, some need a NoSQL database, and some need both.
- Myth: The language or framework determines the database. Common stacks include LAMP (Linux, Apache, MySQL, PHP), MEAN (MongoDB, ExpressJS, Angular, Node), .NET with IIS and SQL Server, and Java with Apache and Oracle. These technology stacks are practical, and there are commercial reasons why they evolved, but don't treat them as rules. We can use MongoDB with PHP as well as with .NET projects, and we can use MySQL or SQL Server with NodeJS.
An application can talk to multiple types of databases, such as relational, document, and key-value stores, using each for what it does best.
6 MongoDB - A leading NoSQL database
MongoDB is a free, open source, cross-platform, document-oriented database based on NoSQL database technology. Data records are stored as documents that use a JSON-like syntax.
In a relational database, we need to define the exact schema: the tables, the fields, and the type of each field. In a NoSQL database we still need to plan out the structure of our database and collections, but we don't have to declare a predefined schema.
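The contrast between a declared schema and a schema-free collection can be sketched as follows. This is a simplified model in plain JavaScript; movieColumns, insertRow, and insertDocument are hypothetical names, and a real relational database enforces its schema in a far richer way.

```javascript
// Relational-style table: the list of columns is declared up front.
const movieColumns = ["Title", "Genre"];

function insertRow(table, row) {
  const unknown = Object.keys(row).filter((field) => !movieColumns.includes(field));
  if (unknown.length > 0) {
    throw new Error(`Unknown column(s): ${unknown.join(", ")}`);
  }
  table.push({ ...row });
}

// Document-style collection: each document carries its own structure.
function insertDocument(collection, doc) {
  collection.push({ ...doc });
}

const table = [];
const collection = [];
const movie = { Title: "Wonder woman", Genre: "science fiction", Director: "Patty Jenkins" };

insertDocument(collection, movie); // accepted as-is
try {
  insertRow(table, movie); // rejected: Director was never declared
} catch (err) {
  console.log(err.message);
}
```

In the relational case the table definition would have to change before the new field could be stored; in the document case the new field simply travels with the document.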
7 MongoDB is the one
Let's see when MongoDB is the right choice for your application. The data environment has changed a lot since SQL was first released.
If you are working with a high volume of location-based data, a NoSQL technology is a better fit.
Sensors and connected devices create millions of data points. It's a challenge for a relational database to check and analyse them without a time-consuming ETL process, whereas MongoDB can store and analyze such varied data within the database itself.
MongoDB can power content management systems (CMS), because it can store many different types of data, such as multimedia files, tweets, and comments.
MongoDB releases new versions quickly, and you can modify your application at little cost. With a relational database, a version upgrade is a bit more tedious.
Because MongoDB can handle unstructured data, it lets us build mobile apps quickly.
If your team has some knowledge of NoSQL databases, MongoDB is one of the easiest to adopt.
When your application gets data from different locations, or when you expect it to grow, go for MongoDB: it can easily be scaled horizontally.
8 How MongoDB is changing today's business
Modern data is vast, unstructured, and complicated, yet clients have high expectations. This led to the development of NoSQL databases.
Document databases allow data to be stored in a more logical manner.
MongoDB is widely used in the MEAN stack, but it is also compatible with .NET and Java. It's open source and cross-platform, so you can download a copy of MongoDB and test whether it fulfills your requirements.
9 Explore MongoDB
9.1 Install MongoDB
When you use the MongoDB API in Azure Cosmos DB, you don't need to install MongoDB locally. But to understand how it works and how to query documents, let's install it and try some queries locally.
Go to the MongoDB site, https://www.mongodb.com/, and download the MongoDB installer.
Choose the Community edition for Windows.
Double-click the installer to open the setup wizard.
Click Next and accept the end user license agreement.
The wizard then installs MongoDB.
9.2 Setup MongoDB environment
Open a command prompt as administrator and go to the location where MongoDB is installed.
Create data and log folders in the MongoDB install location. Inside the data folder, create a db folder; all the database files will be stored there.
Navigate to the bin folder in the MongoDB install location and run the following command. It specifies the path to the database and the file that stores the MongoDB logs, and it installs MongoDB as a Windows service. Note that the --rest option is only accepted by older MongoDB releases (it was removed in version 3.6), so omit it on newer versions.
mongod --dbpath C:\mongodb\data\db --logpath C:\mongodb\log\mongo.log --logappend --rest --install
Run the following command to start the MongoDB service:
net start mongodb
Run the Mongo shell from the bin directory. It shows the shell version and the local MongoDB URL, in case you want to access it over HTTP.
Switch to a new database by using the following command. MongoDB actually creates the database when you first store data in it, so it will not appear in the list of databases until then.
use movieDB
Type the db command to check which database is current:
db
You can list the available databases with this command:
show dbs
Create a user with the following command, assigning that user the readWrite and dbAdmin roles:
db.createUser(
{
user: "hansamali",
pwd: "12345",
roles: [ "readWrite", "dbAdmin" ]
}
)
9.3 Create collection and Insert/read data
Collections are similar to tables in a relational database; a collection holds documents.
Create a collection by passing the collection name, then list the collections:
db.createCollection('favouritemovies')
show collections
Insert a document into the favouritemovies collection:
db.favouritemovies.insert({Title:"Spider-Man:Homecoming", Genre:"science fiction"});
View the data in the collection with the following command. The _id field shows an ObjectId, which is a unique value for each document.
db.favouritemovies.find()
Let's insert multiple documents and add a new field along the way. In NoSQL databases we don't need to define a schema, so we can add a new field to a document without changing anything else.
db.favouritemovies.insert({Title:"Wonder Woman", Genre:"science fiction", Director:"Patty Jenkins"});
db.favouritemovies.insert([{Title:"The Mummy", Genre:"Fantasy/Thriller"},{Title:"Transformers", Genre:"science fiction"}]);
Use the pretty() helper function to format the output:
db.favouritemovies.find().pretty()
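So far find() has been called without arguments, which returns every document. The shell also accepts a query document, for example db.favouritemovies.find({Genre:"science fiction"}). The sketch below models that matching rule in plain JavaScript so it can be tried without a server. The find function here is a hypothetical stand-in that only handles equality on top-level fields, and the _id values are made up.

```javascript
// Sample documents similar to the ones inserted above (the _id values are made up).
const favouritemovies = [
  { _id: 1, Title: "Spider-Man:Homecoming", Genre: "science fiction" },
  { _id: 2, Title: "Wonder Woman", Genre: "science fiction", Director: "Patty Jenkins" },
  { _id: 3, Title: "The Mummy", Genre: "Fantasy/Thriller" },
  { _id: 4, Title: "Transformers", Genre: "science fiction" },
];

// Simplified model of find(query): a document matches when every field in
// the query equals the same field in the document. An empty query matches all.
function find(collection, query = {}) {
  return collection.filter((doc) =>
    Object.entries(query).every(([field, value]) => doc[field] === value)
  );
}

console.log(find(favouritemovies).length);
console.log(find(favouritemovies, { Genre: "science fiction" }).map((doc) => doc.Title));
console.log(find(favouritemovies, { Director: "Patty Jenkins" }).map((doc) => doc.Title));
```

Documents that lack a queried field, such as those without Director, simply do not match; no schema change is needed to query on a field that only some documents have.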
9.4 Update a field in a document
We want to add a Director field to the movie The Mummy. Let's use the update method. Its first parameter is a match for the record, which acts like a where clause; the second parameter is what to replace the record with. Used this way, update replaces the whole document, so we must pass the whole record.
db.favouritemovies.update({Title:"The Mummy"}, {Title:"The Mummy", Genre:"Fantasy/Thriller", Director:"Alex Kurtzman"});
When we use the $set operator, we only need to pass the Director field, not the whole record. It updates only the Director field without replacing the full record.
db.favouritemovies.update({Title:"Transformers"}, {$set:{Director:"Michael Bay"}});
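The difference between the two update styles is easy to get wrong, so here is a small model of it in plain JavaScript. The replaceDocument and applySet functions are hypothetical helpers that imitate the behaviour described above; they are not part of the MongoDB API.

```javascript
const original = { _id: 1, Title: "Transformers", Genre: "science fiction" };

// Replacement-style update: everything except _id is swapped for the new document.
function replaceDocument(doc, replacement) {
  return { _id: doc._id, ...replacement };
}

// $set-style update: only the named fields change; the rest are kept.
function applySet(doc, fields) {
  return { ...doc, ...fields };
}

const replaced = replaceDocument(original, { Director: "Michael Bay" });
const patched = applySet(original, { Director: "Michael Bay" });

console.log(replaced); // Title and Genre are gone
console.log(patched); // Title and Genre are preserved
```

This is why the replacement form has to repeat Title and Genre: any field left out of the second parameter is dropped from the stored document.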
9.5 Remove a field in a document
Let's remove a field from a document. Since NoSQL databases don't have a schema, we can unset a field on a specific document:
db.favouritemovies.update({Title:"Transformers"}, {$unset:{Director:""}});
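The effect of $unset can be modelled the same way in plain JavaScript. The applyUnset helper and the sample documents are hypothetical; the point is only that the field disappears from one document while other documents keep it.

```javascript
// $unset-style update: drop the named fields from one document, keep the rest.
function applyUnset(doc, fieldNames) {
  const result = { ...doc };
  for (const name of fieldNames) delete result[name];
  return result;
}

const transformers = { _id: 4, Title: "Transformers", Genre: "science fiction", Director: "Michael Bay" };
const wonderWoman = { _id: 2, Title: "Wonder Woman", Genre: "science fiction", Director: "Patty Jenkins" };

const updated = applyUnset(transformers, ["Director"]);

// Only the targeted document loses the field.
console.log("Director" in updated, "Director" in wonderWoman);
```

In a relational table, dropping a column would affect every row; here the change is local to the matched document.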
10 Azure Cosmos DB
The move from Azure DocumentDB to Azure Cosmos DB is not only a rename; it is the next generation of the database technology. Azure DocumentDB supported only the document API, whereas Azure Cosmos DB supports multiple APIs: the document API, the MongoDB API, the Graph API, and so on.
With Cosmos DB you can develop applications without borders and bring data close to your users.
The Azure portal itself uses Azure Cosmos DB. Azure Cosmos DB is an elastically scalable database service, something that is hard to achieve with traditional relational databases.
When your application data changes a lot, you can choose the document API or the key-value API. If your application requires millions of operations per second, high speed, and high availability, you can use Cosmos DB in the cloud.
As long as you don't need joins, Azure Cosmos DB works well with any kind of data: key-value pairs, documents, or graphs.
You can host part of your data in a SQL database and another part in Azure Cosmos DB.
11 Conclusion
In Azure Cosmos DB we can use many types of APIs, including the MongoDB API, the document (SQL) API, the Gremlin API, and the Table API. This article has introduced MongoDB with examples that carry over to the MongoDB API in Azure Cosmos DB. You can move a MongoDB database hosted on a server to Azure Cosmos DB without any major modifications to the application, and your clients will not notice the difference. Cosmos DB does not host relational databases; however, you can host NoSQL databases in Cosmos DB by choosing a specific data model and API.
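In practice, "without any major modifications" usually means the application keeps its MongoDB code and only the connection string changes. The sketch below shows that idea with a hypothetical buildMongoUri helper. The account name, key, host, port, and options in the second call are placeholders and assumptions made for illustration; copy the real connection string from the Azure portal rather than relying on these values.

```javascript
// Build a MongoDB connection URI. The user and password are URL-encoded
// because account keys often contain characters such as "/" and "=".
function buildMongoUri({ host, port, user, password, options = {} }) {
  const credentials = user
    ? `${encodeURIComponent(user)}:${encodeURIComponent(password)}@`
    : "";
  const query = new URLSearchParams(options).toString();
  return `mongodb://${credentials}${host}:${port}/${query ? `?${query}` : ""}`;
}

// Local MongoDB installed earlier in this article (default port 27017).
const localUri = buildMongoUri({ host: "localhost", port: 27017 });

// Hypothetical Cosmos DB account: every value below is a placeholder.
const cosmosUri = buildMongoUri({
  host: "myaccount.documents.azure.com", // assumption: take the host from the portal
  port: 10255, // assumption: take the port from the portal
  user: "myaccount",
  password: "abc/def==", // made-up key, not a real secret
  options: { ssl: "true", replicaSet: "globaldb" }, // assumption: options vary by account
});

console.log(localUri);
console.log(cosmosUri);
```

Keeping the URI in configuration means the same application code can point at the local server during development and at Cosmos DB in production.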