Introduction to Azure Cosmos DB

Azure Cosmos DB is a multi-model database service that is globally distributed by design. The database engine supports storing data as documents, key-value pairs and even graphs. Scaling Cosmos DB across any number of available regions is extremely easy: just press the appropriate button in the Azure portal. End users of modern web applications expect low latency, and with Cosmos DB you can store data closer to them. A database can be distributed and made available in 50+ regions, which creates enormous opportunities. Regions can be added or removed at any time in the application lifecycle.

Based on the above, global distribution of data with Cosmos DB provides a set of benefits such as:

  • support for a NoSQL approach to data management,
  • easy management of massive amounts of data (read and write operations close to end users),
  • simple integration with mobile, web or even IoT solutions,
  • low latency,
  • high throughput and availability.

For development purposes Microsoft provides the Azure Cosmos DB emulator. Its functionality is close to the native cloud version of Cosmos DB: a developer can create and query JSON documents, work with collections and test stored procedures or triggers at the database level. Keep in mind that some features are not fully supported locally, among them multi-region replication and scalability.
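
One way to switch between the emulator and a cloud account is to keep the endpoint in configuration. The sketch below is only an illustration: `https://localhost:8081` is the emulator's default local endpoint, while the `COSMOS_ENDPOINT` variable name and the `resolve_endpoint` helper are assumptions made for this example.

```python
import os

# Default local endpoint of the Azure Cosmos DB emulator.
EMULATOR_ENDPOINT = "https://localhost:8081"


def resolve_endpoint(env=None):
    """Return the configured endpoint, falling back to the local emulator.

    COSMOS_ENDPOINT is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    return env.get("COSMOS_ENDPOINT", EMULATOR_ENDPOINT)


# With nothing configured, the application talks to the emulator.
print(resolve_endpoint({}))
```

With this approach the same application code can run against the emulator during development and against a real account after deployment.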

Later in this post I will explain the supported data models in more detail. All of them use the main, cool features provided by Azure Cosmos DB.

Supported data models

1. SQL API

This Cosmos DB API is aimed at users who are familiar with standard SQL queries. Data is stored as JSON, but it can be easily queried with SQL-like queries. Communication is handled over HTTP/HTTPS endpoints that process the requests. Microsoft provides dedicated SDKs for this API for the most popular programming languages, such as .NET, Java, Python and JavaScript. Developers can load the library into their application and quickly start read/write operations directly against Cosmos DB. A sample flow is shown below.

[Figure: sample flow for the SQL API]
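
As a rough illustration of the SQL-like query style, the sketch below builds a parameterized query body (query text plus a list of named parameters) of the kind the SQL API accepts. The document shape (`name`, `address.city`) and the `build_query` helper are assumptions made for this example; no request is actually sent.

```python
import json


def build_query(city):
    """Build a parameterized query body for the SQL API.

    The fields c.name and c.address.city are a hypothetical document shape.
    """
    return {
        "query": "SELECT c.id, c.name FROM c WHERE c.address.city = @city",
        "parameters": [{"name": "@city", "value": city}],
    }


# Print the JSON body that would accompany a query request.
print(json.dumps(build_query("Warsaw"), indent=2))
```

Passing values as parameters instead of concatenating them into the query text keeps the query readable and avoids injection problems.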

 

2. MongoDB API

Existing MongoDB instances can be migrated to Azure Cosmos DB without much effort, because Cosmos DB implements the MongoDB wire protocol. For a new environment, switching from a native MongoDB instance to a Cosmos DB instance (through the MongoDB API) comes down to changing the connection string in the application. Existing MongoDB drivers used by the application are supported. By design, all properties within documents are automatically indexed.

Let’s check what simple queries against the same document collection as in the previous point look like:

 

[Figure: sample MongoDB API queries]

 

As a result, the sub-JSON containing the requested data is returned. If the query doesn’t return any results, an empty object is sent as the response.
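
To make the query style concrete, here is a small in-memory stand-in for a MongoDB-style equality filter with dot notation. It is not the MongoDB driver and covers only the simplest case (exact matches on nested objects); the `people` collection and the helper names are assumptions made for this example.

```python
def get_path(doc, dotted):
    """Follow a dotted path such as 'address.city' through nested dicts."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def find(collection, query):
    """Return the documents whose fields equal every value in the filter."""
    return [
        doc for doc in collection
        if all(get_path(doc, path) == value for path, value in query.items())
    ]


# Hypothetical sample documents.
people = [
    {"id": "1", "name": "Anna", "address": {"city": "Warsaw"}},
    {"id": "2", "name": "Jan", "address": {"city": "Krakow"}},
]

print(find(people, {"address.city": "Warsaw"}))  # one matching document
print(find(people, {"address.city": "Gdansk"}))  # no match: empty result
```

The filter document `{"address.city": "Warsaw"}` has the same shape that a MongoDB driver would send in a `find` call.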

3. Table API

This API can be used by applications written natively for Azure Storage tables. Of course, Cosmos DB provides some premium capabilities compared to Storage tables, e.g. high availability and global distribution. Migrating an application to the new data source doesn’t require changes in code. Data can be queried in several ways, and many SDKs are provided. The sample below shows how to query data with LINQ in the .NET SDK. During execution, the LINQ query is translated into an OData query expression.

[Figure: querying Table API data with LINQ in the .NET SDK]
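
To give a feel for what such an OData expression looks like, the sketch below composes a `$filter` string by hand. It is a simplified illustration, not the SDK's translator: it handles only strings, booleans and plain numbers, and the `Age` property and helper names are assumptions made for this example.

```python
def odata_literal(value):
    """Render a Python value as an OData literal."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Strings are single-quoted; embedded quotes are doubled.
    return "'" + str(value).replace("'", "''") + "'"


def odata_filter(*conditions):
    """Join (property, operator, value) triples with 'and'."""
    return " and ".join(
        f"{name} {op} {odata_literal(value)}" for name, op, value in conditions
    )


print(odata_filter(("PartitionKey", "eq", "Smith"), ("Age", "ge", 30)))
# PartitionKey eq 'Smith' and Age ge 30
```

A LINQ `Where` clause that compares `PartitionKey` and a numeric property would end up as a filter of this general form.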

 

4. Cassandra API

The Azure Cosmos DB Cassandra API is a dedicated data store for applications created for Apache Cassandra. Users can interact with data via CQL (Cassandra Query Language). In many cases, switching the data source from Apache Cassandra to the Azure Cosmos DB Cassandra API just means changing the connection string in the application. From the code perspective, integration with Cassandra is done through a dedicated driver (NuGet -> Install-Package CassandraCSharpDriver). Sample code for connecting to a Cassandra cluster from a .NET application is presented below.

[Figure: connecting to a Cassandra cluster from a .NET application]
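
For readers who have not seen CQL, the sketch below composes a simple CQL `SELECT` with `?` bind markers and the values to bind. It only builds the statement text and does not connect to any cluster; the `uprofile.user` keyspace and table, the column names and the `cql_select` helper are assumptions made for this example, and identifiers are not validated.

```python
def cql_select(keyspace, table, columns, where):
    """Return a CQL SELECT with '?' bind markers plus the values to bind."""
    predicate = " AND ".join(f"{column} = ?" for column in where)
    statement = (
        f"SELECT {', '.join(columns)} FROM {keyspace}.{table} WHERE {predicate}"
    )
    return statement, tuple(where.values())


# Hypothetical keyspace, table and columns.
statement, values = cql_select(
    "uprofile", "user", ["user_id", "user_name"], {"user_id": 1}
)
print(statement)  # SELECT user_id, user_name FROM uprofile.user WHERE user_id = ?
print(values)     # (1,)
```

A Cassandra driver would take such a statement and the bound values and execute them against the cluster named in the connection settings.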

 

5. Gremlin API

The last API provided by Azure Cosmos DB (as of the day of writing this article 😉) is the Gremlin API. This interface is used for storing and operating on graph data. The API natively supports graph modeling and traversal. We can query graphs with millisecond latency and easily evolve the graph structure and schema. For querying we can use Gremlin, the graph traversal language of Apache TinkerPop. The step-by-step process from NuGet package installation to running the first query is shown below.

[Figure: Gremlin API steps from NuGet package installation to the first query]
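
Gremlin queries are chains of traversal steps, and they can be sent as plain text. The sketch below assembles such a query string; it only builds the text and does not talk to a graph. The `person` label, the `knows` edge, the `name` property and the helper functions are assumptions made for this example, and only string and numeric arguments are handled.

```python
def quote(value):
    """Render a Python value as a Gremlin literal (strings get single quotes)."""
    if isinstance(value, str):
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"
    return str(value)


def step(name, *args):
    """Render one traversal step, e.g. has('name', 'Anna')."""
    return f"{name}({', '.join(quote(a) for a in args)})"


def traversal(*steps):
    """Chain steps onto the graph traversal source 'g'."""
    return "g." + ".".join(steps)


# Names of the people that Anna knows (hypothetical graph).
query = traversal(
    step("V"),
    step("hasLabel", "person"),
    step("has", "name", "Anna"),
    step("out", "knows"),
    step("values", "name"),
)
print(query)
# g.V().hasLabel('person').has('name', 'Anna').out('knows').values('name')
```

Each step narrows or moves the traversal: start at all vertices, keep the people named Anna, follow outgoing `knows` edges and read the `name` property of the vertices reached.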

Summary

From the developer's perspective, Azure Cosmos DB is a very interesting service. The wide range of available APIs allows the database to be used in various scenarios. Below you can find information from the official Azure Cosmos DB site about the availability of APIs per programming language.

[Figure: availability of Azure Cosmos DB APIs per programming language]

Source: Azure Cosmos DB Documentation

***

This post is the last part in our Azure DevOps series. Check out the previous posts:

#1: Azure DevOps Services – cloud based platform for collaborating on code development from Microsoft

#2: Web application development with .NET Core and Azure DevOps