HealthVault: Batching up queries
When I first started using the HealthVault SDK, I wrote some code like this, based on what I had seen before:
HealthRecordSearcher searcher = PersonInfo.SelectedRecord.CreateSearcher();
HealthRecordFilter filter = new HealthRecordFilter(Height.TypeId);
searcher.Filters.Add(filter);
HealthRecordItemCollection items = searcher.GetMatchingItems()[0];
So, what's up with indexing into the result from GetMatchingItems()? Why isn't it simpler?
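The answer is that a searcher can hold several filters, so queries can be batched up and executed all at once in a single request. GetMatchingItems() returns one result collection per filter, in the order the filters were added. So, if we want to, we can write the following: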
HealthRecordSearcher searcher = PersonInfo.SelectedRecord.CreateSearcher();
HealthRecordFilter filterHeight = new HealthRecordFilter(Height.TypeId);
searcher.Filters.Add(filterHeight);
HealthRecordFilter filterWeight = new HealthRecordFilter(Weight.TypeId);
searcher.Filters.Add(filterWeight);
ReadOnlyCollection<HealthRecordItemCollection> results = searcher.GetMatchingItems();
HealthRecordItemCollection heightItems = results[0];
HealthRecordItemCollection weightItems = results[1];
Based on a partner question today, I got a bit interested in what the performance advantages of batching queries up were. So, I wrote a short test application that fetched 32 single Height values, either one query per request or batched together in larger groups, and timed each variation.
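A test like this can be sketched roughly as follows. This is not the original test application, just a minimal timing harness under some assumptions: `BatchTiming`, `TimeBatched`, and the `runBatch` callback are hypothetical names, and `runBatch` is assumed to perform exactly one round trip containing the given number of queries.

```csharp
using System;
using System.Diagnostics;

public static class BatchTiming
{
    // Runs totalQueries queries in groups of batchSize and returns the
    // elapsed wall-clock time in seconds.
    // runBatch is a hypothetical callback: it receives the number of queries
    // to put into one request and is assumed to make exactly one round trip.
    public static double TimeBatched(int totalQueries, int batchSize, Action<int> runBatch)
    {
        if (batchSize < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(batchSize));
        }

        Stopwatch watch = Stopwatch.StartNew();
        for (int done = 0; done < totalQueries; done += batchSize)
        {
            // The last group is smaller when batchSize does not divide totalQueries.
            int count = Math.Min(batchSize, totalQueries - done);
            runBatch(count);
        }
        watch.Stop();
        return watch.Elapsed.TotalSeconds;
    }
}
```

For HealthVault, the callback would create a searcher, add `count` Height filters to `searcher.Filters`, and call `GetMatchingItems()` once. Calling `TimeBatched(32, batchSize, ...)` for batch sizes 1, 2, 4, 8, 16, and 32 then gives one timing per batch size.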
Here's what I saw:
Batch Size | Time in seconds |
---|---|
1 | 0.98 |
2 | 0.51 |
4 | 0.28 |
8 | 0.16 |
16 | 0.10 |
32 | 0.08 |
This is a pretty impressive result - fetching the items in batches of 4 is about 3.5 times faster than fetching them one at a time (0.28 seconds versus 0.98 seconds), and a single batch of 32 is about 12 times faster. Why is this so big?
Well, to do a fetch, the following things have to happen:
1. The request is created on the web server
2. It is transmitted across the net to HealthVault servers
3. The request is decoded, executed, and a response is created
4. It is transmitted back to the web server
5. The web server unpackages it
When a filter returns small amounts of data, steps 1, 3, and 5 are pretty fast, but steps 2 and 4 involve network latency, which dominates the elapsed time. So, the batching eliminates those chunks of time, and we get a nice speedup.
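One way to check this explanation against the numbers is a simple two-parameter model: total time = (number of round trips) x (fixed cost per request) + (work that does not depend on batching). The sketch below fits that model to two measurements and predicts the rest. The model itself and the names `BatchModel`, `Fit`, and `Predict` are assumptions made for illustration, not part of the HealthVault SDK.

```csharp
using System;

public static class BatchModel
{
    // Fits time = roundTrips * perRequest + fixedWork from the two extreme
    // measurements: one query per request (totalQueries round trips) and
    // everything in one request (a single round trip).
    public static void Fit(int totalQueries, double timeUnbatched, double timeFullyBatched,
                           out double perRequest, out double fixedWork)
    {
        if (totalQueries < 2)
        {
            throw new ArgumentOutOfRangeException(nameof(totalQueries));
        }

        perRequest = (timeUnbatched - timeFullyBatched) / (totalQueries - 1);
        fixedWork = timeFullyBatched - perRequest;
    }

    // Predicts the elapsed time for a given batch size under the same model.
    public static double Predict(int totalQueries, int batchSize,
                                 double perRequest, double fixedWork)
    {
        int roundTrips = (totalQueries + batchSize - 1) / batchSize; // ceiling division
        return roundTrips * perRequest + fixedWork;
    }
}
```

Fitting the table above with `Fit(32, 0.98, 0.08, ...)` gives roughly 0.029 seconds per round trip and 0.051 seconds of other work. The predictions for batch sizes 2, 4, 8, and 16 come out at about 0.52, 0.28, 0.17, and 0.11 seconds, close to the measured 0.51, 0.28, 0.16, and 0.10, which is consistent with per-request latency dominating.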
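We would therefore expect that as we fetch more data in each request, batching would be less useful, because the time spent in steps 1, 3, and 5 grows while the network latency stays the same. Here is some data for the same test with each query fetching 16 items instead of one: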
Batch Size | Time in seconds |
---|---|
1 | 1.40 |
2 | 0.91 |
4 | 0.66 |
8 | 0.49 |
16 | 0.42 |
32 | 0.39 |
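Which is pretty much what you would expect: batching still helps, but going from a batch size of 1 to 32 is now only about 3.6 times faster (1.40 seconds versus 0.39 seconds), compared to about 12 times faster when each query returned a single item.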