OData In The Cloud – One Of The Most Flexible And Powerful Ways To Provide Scalable Data Services To Virtually Any Client
Introduction
This post is dedicated to illustrating how you can create your own OData provider and host it in the cloud, specifically Windows Azure.
Open Data Protocol (a.k.a OData) is a data access protocol designed to provide standard CRUD access to a data source via a website. It is similar to JDBC and ODBC although OData is not limited to SQL databases.
OData can be thought of as an extension to REST and provides efficient and flexible ways for sharing data in a standardized format that is easily consumed by other systems. It uses well-known web technologies like HTTP, AtomPub and JSON. OData is a resource-based Web protocol for querying and updating data.
OData performs operations on resources using HTTP verbs (GET, POST, PUT and DELETE). It identifies those resources using a standard URI syntax. Data travels across the wire over HTTP using the AtomPub or JSON standards.
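To make the verb-and-URI idea concrete, here is a minimal sketch of how the four CRUD operations line up with HTTP verbs and resource URIs. The class and method names (ODataVerbSketch, VerbFor, BuildRequest) are invented for this illustration and are not part of any OData library. The sketch only builds request objects and never sends them.

```csharp
using System;
using System.Net.Http;

public static class ODataVerbSketch
{
    // Maps a CRUD operation name to the HTTP verb conventionally used for it.
    // PUT replaces a whole entity; partial updates use other verbs not shown here.
    public static HttpMethod VerbFor(string operation)
    {
        switch (operation)
        {
            case "Create": return HttpMethod.Post;
            case "Read":   return HttpMethod.Get;
            case "Update": return HttpMethod.Put;
            case "Delete": return HttpMethod.Delete;
            default: throw new ArgumentException("Unknown operation: " + operation);
        }
    }

    // Builds (but does not send) a request. Create targets the entity set itself;
    // the other operations target one entity, addressed as EntitySet('key').
    public static HttpRequestMessage BuildRequest(
        string operation, string serviceRoot, string entitySet, string key)
    {
        string uri = serviceRoot + "/" + entitySet;
        if (operation != "Create")
        {
            uri += "('" + Uri.EscapeDataString(key) + "')";
        }
        return new HttpRequestMessage(VerbFor(operation), uri);
    }
}
```

Keep in mind that the service built later in this post only grants read access, so only the GET requests would succeed against it.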
Generally speaking, OData leverages relational databases as the data store. But what I would like to illustrate is how to leverage a simple text file as the data store. I believe this will give you a quick and easy introduction to the way everything works.
Internally at Microsoft there are many products that leverage OData:
Windows Azure Data Market
Azure Table Storage, which uses OData
SharePoint 2010, which allows OData queries
Excel PowerPivot
There are many advantages to OData:
OData gives you an entire query language directly in the URL.
The client only gets the data that it requests, no more and no less.
The client is very flexible, because it controls queries, not the server, which frees you from having to anticipate all the types of queries you need to support on the backend.
The client can request the data in various formats, such as XML, JSON, or AtomPub.
Any client can consume the OData protocol.
You don't need to learn the programming model of a service to program against the service.
There are a lot of client libraries available, such as the Microsoft .NET Framework client, AJAX, Java, PHP, Objective-C, and more.
OData supports server paging limits, HTTP caching support, stateless services, streaming support and a pluggable provider model.
You can leverage LINQ as a query language.
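To connect the "query language in the URL" point with the LINQ point, the sketch below shows the LINQ query that roughly corresponds to a URL ending in ?$filter=CrimeType eq 'ROBBERY'&$orderby=CrimeDate desc&$top=2. The CrimeRecord class is a cut-down stand-in for the entity class used later in this post, and the sample rows are invented purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down stand-in for the entity class used later in the post.
public class CrimeRecord
{
    public string Incident { get; set; }
    public string CrimeType { get; set; }
    public DateTime CrimeDate { get; set; }
}

public static class QuerySketch
{
    // Hypothetical sample rows, invented for illustration only.
    public static IQueryable<CrimeRecord> SampleCrimes()
    {
        return new List<CrimeRecord>
        {
            new CrimeRecord { Incident = "A1", CrimeType = "ROBBERY", CrimeDate = new DateTime(2013, 1, 5) },
            new CrimeRecord { Incident = "A2", CrimeType = "ASSAULT", CrimeDate = new DateTime(2013, 1, 6) },
            new CrimeRecord { Incident = "A3", CrimeType = "ROBBERY", CrimeDate = new DateTime(2013, 1, 9) },
            new CrimeRecord { Incident = "A4", CrimeType = "ROBBERY", CrimeDate = new DateTime(2013, 1, 2) }
        }.AsQueryable();
    }

    // Filter, sort newest first, keep two rows, and return their incident ids.
    public static List<string> LatestRobberies(IQueryable<CrimeRecord> crimes)
    {
        return crimes
            .Where(c => c.CrimeType == "ROBBERY")   // $filter=CrimeType eq 'ROBBERY'
            .OrderByDescending(c => c.CrimeDate)    // $orderby=CrimeDate desc
            .Take(2)                                // $top=2
            .Select(c => c.Incident)
            .ToList();
    }
}
```

With the sample rows above, LatestRobberies returns the incidents A3 and A1, in that order. The useful part is that the client expresses this in the URL, so nobody has to write that method on the server in advance.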
Starting with government data
The city of San Francisco provides data available for download. So what I did is download crime statistics for the trailing three months. I reduced the 30,000 records to just a few hundred to make development a little bit easier.
One thing the example does not illustrate is how to make this extremely efficient by leveraging caching. This can be easily added to the project, but was avoided for the sake of simplicity.
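For the curious, here is one possible shape such caching could take. It is a sketch, not what the project in this post does. It assumes the expensive step is reading and parsing the file, and that doing it once per process is acceptable. The names CrimeCacheSketch, LoadRows and LoadCount are hypothetical, and the loader returns placeholder strings instead of real parsed rows.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class CrimeCacheSketch
{
    // Counts how many times the loader actually ran (for demonstration only).
    public static int LoadCount;

    // Lazy<T> runs the loader on first access and hands back the same list afterwards.
    private static readonly Lazy<List<string>> cachedRows =
        new Lazy<List<string>>(LoadRows, LazyThreadSafetyMode.ExecutionAndPublication);

    // Stand-in for the expensive work of reading and parsing the data file.
    private static List<string> LoadRows()
    {
        Interlocked.Increment(ref LoadCount);
        return new List<string> { "row-1", "row-2" };
    }

    // Every caller shares the single cached list.
    public static IReadOnlyList<string> Rows
    {
        get { return cachedRows.Value; }
    }
}
```

The trade-off is that the cached rows never refresh while the process is running, which is fine for a static sample file but would need an expiry policy for data that changes.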
We will use Visual Studio 2012 and will update some assemblies by using NuGet. The NuGet update is essential; the project will not work without it.
Starting Visual Studio
Once you have Visual Studio up and running, choose File/New from the menu and select Cloud Project as seen below.
Add an ASP.NET Web Role to your solution, as seen below. There are other options, but this one is probably the most familiar to developers today. Click OK when finished.
Solution Explorer should look like this:
As you can see from the figure above, there are two projects in the solution. The top one is for deployment purposes, while the bottom one is where we will add our OData code to get the job done.
Downloading data
You can navigate to the following URL to download some sample crime data: https://data.sfgov.org/
I downloaded this data, removed some rows, and added it to the App_Data folder.
Note the file called PoliceData.txt in the figure above.
Adding code
Now we are ready to start adding some code to process this data. We will begin by adding a couple of classes.
In Visual Studio, right mouse click on the web role and add a class as seen below. Name this class CrimeProvider.
There are some important points to notice about the code below.
There are two classes to note - CrimeData and CrimeProvider.
CrimeProvider.svc.cs

```csharp
using System;
using System.Collections.Generic;
using System.Data.Services.Common;
using System.IO;
using System.Linq;
using System.Net;
using System.Web;
using Microsoft.Data.OData;

namespace WebRole1
{
    // One row of the police data. Incident is the entity key.
    [DataServiceKey("Incident")]
    public class CrimeData
    {
        public string Incident { get; set; }    // col 0
        public string CrimeType { get; set; }   // col 2
        public DateTime CrimeDate { get; set; } // col 4
        public string Address { get; set; }     // col 8
    }

    public class CrimeProvider
    {
        private List<CrimeData> crimes = new List<CrimeData>();

        // Default constructor: reads the tab-delimited file from App_Data.
        public CrimeProvider()
        {
            WebRequest request = WebRequest.CreateDefault(new Uri(HttpContext.Current.Server.MapPath("~/App_Data/PoliceData.txt")));
            WebResponse response = request.GetResponse();
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                string data = reader.ReadToEnd();
                LoadData(data);
            }
        }

        // Alternate constructor: takes the file contents directly.
        public CrimeProvider(string data)
        {
            LoadData(data);
        }

        private void LoadData(string data)
        {
            string[] rows = data.Split('\n');
            // Start at 1 to skip the header row; stop before the last element,
            // which is empty when the file ends with a newline.
            for (int i = 1; i < rows.Length - 1; i++)
            {
                rows[i] = rows[i].Trim();
                string[] cols = rows[i].Split('\t');
                crimes.Add(new CrimeData
                {
                    Incident = cols[0],
                    CrimeType = cols[2],
                    CrimeDate = Convert.ToDateTime(cols[4]),
                    Address = cols[8]
                });
            }
        }

        // The reflection provider exposes this IQueryable property as the "Crimes" entity set.
        public IQueryable<CrimeData> Crimes
        {
            get { return crimes.AsQueryable(); }
        }
    }
}
```
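The LoadData method above trusts every row to have at least nine tab-separated columns and a parseable date. If your trimmed-down file has a ragged row, it will throw. Below is a sketch of a more forgiving row parser. The helper name TryParseRow is hypothetical, and the date format is an assumption: the sketch parses with the invariant culture, which may or may not match your copy of the data.

```csharp
using System;
using System.Globalization;

public static class RowParserSketch
{
    // Reads columns 0, 2, 4 and 8 from one tab-delimited row.
    // Returns false for short rows or unparseable dates instead of throwing.
    public static bool TryParseRow(
        string row,
        out string incident,
        out string crimeType,
        out DateTime crimeDate,
        out string address)
    {
        incident = null;
        crimeType = null;
        address = null;
        crimeDate = default(DateTime);

        // Strip only line-ending characters so empty trailing columns survive.
        string[] cols = row.TrimEnd('\r', '\n').Split('\t');
        if (cols.Length < 9)
        {
            return false;
        }

        // Assumption: dates parse under the invariant culture (for example 01/15/2013).
        if (!DateTime.TryParse(cols[4], CultureInfo.InvariantCulture,
                               DateTimeStyles.None, out crimeDate))
        {
            return false;
        }

        incident = cols[0];
        crimeType = cols[2];
        address = cols[8];
        return true;
    }
}
```

A loop using this helper could simply skip any row for which it returns false, at the cost of silently dropping bad data, so you may prefer to log those rows.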
3 ways to write a provider
There are three methods that can be used to create an OData back end.
(1) EF Provider - easy to use
(2) Reflection Provider - what I used
(3) Custom Provider
The technique used today will be a reflection provider. The EF provider is another popular way that makes it easy to leverage a relational database using the Entity Framework. The custom provider is more technically challenging, but offers the greatest flexibility.
Programming models in Visual Studio
There are two approaches within Visual Studio’s programming model that can be taken:
(1) Web API
(2) WCF Data Services
We will use the more traditional approach in this post, called WCF Data Services, which starts at the data model. WCF Data Services starts with a client-issued URI to query the data, then binds the query to your data model and returns data to the client.
The Web API approach starts with things that you want to do, such as starting with customers and then looking up orders. We will explain the distinction more carefully in a future post. But in general, the trend is that most folks are moving to the Web API. In short, the advantage of Web API is that it gives you greater control over the HTTP request/response and more closely models the MVC architecture. At the end of the day, the Web API gives you more control over how the client gets JSON data.
But to keep this post moving along, let’s first show you how to implement OData using WCF Data Services.
In Visual Studio, right mouse click and add a new item, as seen below:
In the list box, select WCF Data Service and provide the name. We are calling our data service CrimeService.svc.
By adding a WCF Data Service, a number of references are automatically added to your project.
Notice that there are four assemblies that start with “Microsoft.Data.” Also take note of the last one, “System.Data.Services.Client,” which will need to be removed. Discovering this was the biggest challenge in getting the project to work, and it is not very intuitive. The assemblies starting with “Microsoft.Data” will need to be upgraded to the latest version. We can do this by using a NuGet package.
Before doing the upgrade, let’s right mouse click on “System.Data.Services.Client,” and click “Remove.”
NuGet
NuGet is an awesome technology that lets you upgrade the assemblies in your project with just a couple of clicks. It can also add code to your project, but we are more interested in upgrading the OData assemblies at this time.
From Visual Studio’s Tools menu, select “Manage NuGet packages for solution.”
Notice that in the upper right text box, we typed in “Microsoft data” to search for the right NuGet package.
We will install the NuGet package called ODataLib. This will indirectly upgrade a number of other packages as well.
We are not finished yet. Go to the Updates portion of NuGet. We will need to update the WCF Data Services server package, as seen below:
After doing the upgrade, if “System.Data.Services.Client” is still listed among the references, right mouse click on it again and click “Remove.”
Now that we have updated the assembly versions we need to go back to the CrimeService.svc and adjust for the new versions of the assemblies.
In this file we will eliminate the Version section, as seen below. Simply highlight it in our CrimeService.svc file and hit the Delete key. The service will then just use the version of the assembly that is available.
The next step is to go modify some code, as seen in the figure below. Notice the TODO comment next to the DataService class. We will type in CrimeProvider here.
When finished your code should look like this:
CrimeService.svc.cs

```csharp
//------------------------------------------------------------------------------
// Copyright (c) Microsoft Corporation. All rights reserved.
//------------------------------------------------------------------------------
using System;
using System.Collections.Generic;
using System.Data.Services;
using System.Data.Services.Common;
using System.Linq;
using System.ServiceModel.Web;
using System.Web;

namespace WebRole1
{
    public class CrimeService : DataService<CrimeProvider>
    {
        // This method is called only once to initialize service-wide policies.
        public static void InitializeService(DataServiceConfiguration config)
        {
            // TODO: set rules to indicate which entity sets and service operations are visible, updatable, etc.
            // Examples:
            config.SetEntitySetAccessRule("*", EntitySetRights.AllRead);
            // config.SetServiceOperationAccessRule("MyServiceOperation", ServiceOperationRights.All);
            config.DataServiceBehavior.MaxProtocolVersion = DataServiceProtocolVersion.V3;
        }
    }
}
```
The purpose of the modifications was to expose our CrimeProvider class to OData clients and to set the appropriate permission levels for the contained entities. Here every entity set is readable, and nothing is writable.
Ready to begin testing
There are only a couple of things left to do before we can see our data exposed to a browser. The first thing is to make CrimeService.svc the startup file. Right mouse click on CrimeService.svc and choose “Set as start page” from the menu.
Before deploying this to a Microsoft data center, we will test it in the local Azure emulator, which allows us to run the service locally before going to the trouble of deploying it.
In Visual Studio, go to the debug menu and choose start debugging. The browser should pop up and look like this:
Deploying to Windows Azure
Our project is now ready for deployment to the cloud. We will perform more queries and tests with the underlying data after deployment.
The assumption is that you have a Windows Azure account to test with.
There are a variety of ways to deploy your project. Some of them can be very fast and efficient and can be automated. We will take a more manual approach in the spirit of clarity and ease.
How to Create and Deploy a Cloud Service: https://www.windowsazure.com/en-us/manage/services/cloud-services/how-to-create-and-deploy-a-cloud-service/
Log into the Windows Azure portal and select cloud services, then click new.
Select Quick Create, specify a URL, and choose a Region. Finally, click the Create Cloud Service button in the lower right.
You will notice the cloud service now available for us to deploy our project to.
Return to Visual Studio, where we will create a deployment package.
Accept the default and click Package.
After the packages have been built, an Explorer window will pop up and show you the files that comprise the package that will need to be uploaded to the portal. Make sure to note the path where your package is located. In my case, the path is here:
C:\MSVirtualAcademy\ODataInAzure\ODataInAzure\bin\Release\app.publish
Return to the portal and make the following selections, as seen by the red boxes below. Once you have done so, you will be able to upload the package that we previously created.
Notice that we needed to provide 4 things:
a deployment name
the package file
the configuration file
the option to deploy even if one or more roles contain a single instance
We chose a single instance deployment. This means that there will be only one instance of the cloud service running at a time. However, Visual Studio does provide a way for you to change the instance count, and even the size of the virtual machine. If you want detailed instructions on how to do this, you should download the Windows Azure training kit, easily found by using Bing.
Instance count is the way that you can scale your service, depending on how much power you need. There are many ways you can scale the instance count up or down. It can be done programmatically and it can even be done based on performance counters, making the process automatic, capable of going up or down based on demand. This is one of the core value propositions of cloud computing.
Here is the dialog box within Visual Studio that lets you change the instance count in a manual approach.
You will see the following message at the portal:
You will see a check mark icon in the lower right corner. Once you click on it the deployment process will begin. The deployment will take about 5 to 10 minutes to complete. The dashboard can tell you the progress you are making during the deployment.
Once the deployment has completed, you can start to query your data:
Navigating to the deployment is as simple as the figure below:
This is where OData gets really interesting. The URL is how the client controls the data that comes back from the server across the wire. The beauty of this approach is that it is very flexible, in that you do not need to hardcode predefined queries on the server. Instead, you can let the client define them through a URL. This also means the client does not need to filter out extra data, because the URL defines exactly what data is expected.
The figure below shows the collection that the service makes available. We can drill in further to get a list of all crimes by appending "Crimes" to the URL, as seen in the next diagram.
Getting a listing of all crimes:
Here is the client requesting a specific crime, by incident number:
Here are some other OData queries that you can play around with:
Get a specific crime by incident:
https://myodata.cloudapp.net/CrimeService.svc/Crimes('130632797')
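If you want to experiment with more queries, the standard OData system query options $filter, $orderby and $top can be appended to the Crimes URL. The helper below is a small sketch for composing such URLs with proper escaping. ODataUrlSketch and Build are names made up for this post, and the 'ROBBERY' value in the usage note is a hypothetical crime type that may not appear in your copy of the data.

```csharp
using System;
using System.Collections.Generic;

public static class ODataUrlSketch
{
    // Appends OData system query options to an entity set URL.
    // Option values are escaped so spaces and other reserved characters survive the trip.
    public static string Build(
        string serviceRoot,
        string entitySet,
        string filter = null,
        string orderBy = null,
        int? top = null)
    {
        var options = new List<string>();
        if (filter != null) options.Add("$filter=" + Uri.EscapeDataString(filter));
        if (orderBy != null) options.Add("$orderby=" + Uri.EscapeDataString(orderBy));
        if (top.HasValue) options.Add("$top=" + top.Value);

        string url = serviceRoot.TrimEnd('/') + "/" + entitySet;
        return options.Count == 0 ? url : url + "?" + string.Join("&", options);
    }
}
```

For example, Build("https://myodata.cloudapp.net/CrimeService.svc", "Crimes", top: 5) produces https://myodata.cloudapp.net/CrimeService.svc/Crimes?$top=5. Passing filter: "CrimeType eq 'ROBBERY'" and orderBy: "CrimeDate desc" would ask the service for matching crimes, newest first.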
Conclusion
The purpose of this post was to demonstrate how you can host OData-based data services in the cloud. We happened to leverage text files as the data source, but using some of the techniques demonstrated, you could leverage almost any data source.
Regarding deployment to the cloud, you can automate this significantly.
As you gain more experience with the tooling, you can do a deployment with a simple right mouse click and choosing Publish.