New Azure CosmosDB change feed processor released!

Introduction

Microsoft Azure CosmosDB is an umbrella over different database services. One of them is the schema-free database service called DocumentDB. You can read about it here; it has many great features and components. One of them is the "change feed", which represents the feed of changed documents (upserts, replaces). The client component exposing this feed is called the change feed processor.

 

The change feed processor works by querying a DocumentDB collection for changes. It then hands the list of changed documents to the registered observer in the order in which they were modified, and the observer processes them. After the observer has processed the last document in the batch, the change feed processor can optionally checkpoint, that is, persist a pointer to the last document of the batch in the lease document. The whole process then repeats. That's all, folks!
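To make that loop concrete, here is a minimal in-memory sketch in Python. It is only a toy model of the idea and not the real .NET library: `Lease`, `ToyProcessor` and `poll_once` are hypothetical names, the "feed" is just an ordered list, and the "lease document" is a single integer position.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Lease:
    """Toy stand-in for the lease document: stores the checkpointed position."""
    continuation: int = 0  # index of the first change not yet checkpointed


class ToyProcessor:
    """Hypothetical in-memory model of the polling loop (not the real library)."""

    def __init__(self, feed: List[str], lease: Lease,
                 observer: Callable[[List[str]], None], batch_size: int = 2):
        self.feed = feed
        self.lease = lease
        self.observer = observer
        self.batch_size = batch_size
        self.position = lease.continuation  # resume from the last checkpoint

    def poll_once(self, checkpoint: bool = True) -> int:
        """Read one batch, hand it to the observer, optionally checkpoint."""
        batch = self.feed[self.position:self.position + self.batch_size]
        if not batch:
            return 0
        self.observer(batch)            # documents arrive in modification order
        self.position += len(batch)
        if checkpoint:                  # checkpointing is optional
            self.lease.continuation = self.position
        return len(batch)


feed = ["doc-a", "doc-b", "doc-c"]
lease = Lease()
seen: List[List[str]] = []

processor = ToyProcessor(feed, lease, seen.append)
processor.poll_once()                   # delivers doc-a, doc-b and checkpoints
processor.poll_once(checkpoint=False)   # delivers doc-c, lease is not updated

# A restarted processor resumes from the lease, so doc-c is delivered again.
restarted = ToyProcessor(feed, lease, seen.append)
restarted.poll_once()
print(seen, lease.continuation)
```

The point of the last three lines is the trade-off behind optional checkpointing: anything processed after the last checkpoint is delivered again after a restart, so an observer in such a setup should tolerate repeated documents.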

 

We use the change feed quite heavily; it is our core component. We built an event sourcing system with the capture state pattern, but I'll write about that in future posts.

Now, let's focus on the change feed processor itself and what's possible with this component.

Package

Microsoft recently released a new version (2.0.4) of the change feed processor package. It's a client-side library built on top of the DocumentDB SDK. The source code is located here. I'm one of the contributors to this project.

The newly released package bumps the major version to 2.x, which means there were major changes, refactoring, etc. Upgrading the package from 1.x is seamless, but you will get several warnings about the use of obsolete interfaces and classes.

The new package deprecates the old construction logic and introduces a new one. That's the main focus of this blog post.

 

High level

The following picture shows the idea behind the scenes.

As you can see, there are three parts that need to be provided from the developer's perspective:

  • Data collection - also known as the feed collection. It's the collection of data over which the change feed is processed. It can be created as a partitioned collection so that the data is split across separate physical locations.
  • Lease collection - the collection where the change feed processor keeps a pointer to the last processed document per partition.
  • Observer - the business logic responsible for processing the documents delivered over the change feed.
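As a rough illustration of how these three parts relate, the following Python sketch models them with plain data structures. Everything here is an assumption made for illustration (`LeaseCollection`, `drain` and the dictionary-based feed are hypothetical and unrelated to the actual .NET types): the feed collection is a mapping from partition ID to an ordered list of changes, the lease collection keeps one position per partition, and the observer is a plain callable.

```python
from typing import Callable, Dict, List

Document = dict
FeedCollection = Dict[str, List[Document]]   # partition id -> ordered changes
Observer = Callable[[str, List[Document]], None]


class LeaseCollection:
    """Toy lease store: one pointer (list index) per partition."""

    def __init__(self) -> None:
        self._positions: Dict[str, int] = {}

    def last_position(self, partition_id: str) -> int:
        # A partition that was never checkpointed starts from the beginning.
        return self._positions.get(partition_id, 0)

    def checkpoint(self, partition_id: str, position: int) -> None:
        self._positions[partition_id] = position


def drain(feed: FeedCollection, leases: LeaseCollection, observer: Observer) -> None:
    """Deliver each partition's unprocessed changes, then move its pointer."""
    for partition_id, changes in feed.items():
        start = leases.last_position(partition_id)
        pending = changes[start:]
        if pending:
            observer(partition_id, pending)
            leases.checkpoint(partition_id, start + len(pending))


def log(partition_id: str, docs: List[Document]) -> None:
    print(partition_id, [d["id"] for d in docs])


feed = {"0": [{"id": "a"}], "1": [{"id": "b"}, {"id": "c"}]}
leases = LeaseCollection()
drain(feed, leases, log)        # both partitions deliver their documents

feed["0"].append({"id": "d"})
drain(feed, leases, log)        # only partition "0" has something new
```

Because the pointer is tracked per partition, the second `drain` call delivers only the new document in partition "0" and leaves partition "1" alone.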

 

The way this whole machinery is created has changed and now offers several extensibility options. That is the focus of this series of blog posts about the change feed.

 

Construction

It's possible to create the change feed processor manually, but the preferred way is to use the builder Microsoft.Azure.Documents.ChangeFeedProcessor.ChangeFeedProcessorBuilder. The simplest version is the following:

 

 

Sample

I created a whole sample showing this in action. The source code is at https://github.com/kadukf/blog.msdn/tree/master/CosmosDB/ChangeFeed/Sample1/DocumentDB.ChangeFeedProcessor.ConsoleApp and the sample requires a local Azure CosmosDB emulator to be running. The sample console application ensures that the database with the feed collection (Input) as well as the lease collection (Input.Lease.ConsoleApp) is created.

It then starts a thread that inserts documents, and after that it starts the change feed processing machinery with the console logging observer.

 

The console observer logs when partition processing is "opened" and "closed", as well as when documents are dispatched for processing.
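The shape of such an observer can be sketched in a few lines of Python. This is a hypothetical model and not the code from the sample: the method names `open`, `process_changes` and `close`, the `run_partition` driver and the "shutdown" close reason are all assumptions chosen to mirror the three events described above.

```python
from typing import Callable, Iterable, List


class ConsoleObserver:
    """Toy observer that logs the three lifecycle events of a partition."""

    def __init__(self, log: Callable[[str], None] = print) -> None:
        self._log = log

    def open(self, partition_id: str) -> None:
        self._log(f"opened partition {partition_id}")

    def process_changes(self, partition_id: str, docs: List[dict]) -> None:
        ids = ", ".join(doc["id"] for doc in docs)
        self._log(f"partition {partition_id}: processing {len(docs)} document(s): {ids}")

    def close(self, partition_id: str, reason: str) -> None:
        self._log(f"closed partition {partition_id} ({reason})")


def run_partition(observer: ConsoleObserver, partition_id: str,
                  batches: Iterable[List[dict]]) -> None:
    """Hypothetical driver: open, dispatch every batch, always close."""
    observer.open(partition_id)
    try:
        for batch in batches:
            observer.process_changes(partition_id, batch)
    finally:
        observer.close(partition_id, "shutdown")


run_partition(ConsoleObserver(), "0", [[{"id": "a"}], [{"id": "b"}, {"id": "c"}]])
```

Passing the logging function in through the constructor keeps the observer trivial to test: hand it `some_list.append` instead of `print` and inspect the collected lines.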

Because the sample requires at least two collections and the Azure CosmosDB emulator supports either several single-partition collections or one multi-partition collection, a single-partition collection is used in this case, so the processing partition ID is 0.

Comments

  • Anonymous
    August 06, 2018
    Great article Kaduk, thanks!