Should a performance cache query run through your EAI hub?
When you pass a message from one system to another, you have to decide: do I want the message to pass quickly, or do I want to be certain that the message gets there? But what does that do to decoupling? Here's a specific case. Take a look at the decisions made and tell me: would you make the same choices?
We have a new system coming into existence that will create a business document using a new web interface. The legacy system that sits underneath also creates the same business document, in a less-than-appealing manner. We put in the new system so that we can move people from the old system to the new one. Once everyone moves over, we can consider decommissioning the old interface. Eventually, we will replace the back end as well.
To see an image of the roadmap in another window, click here.
Of course, the devil is in the details. We are in step 1: adding the new interface to the legacy system.
For now, the legacy datastore has the domain data that the interface needs. There are two kinds of domain data here: small domains (like the list of U.S. states or document type codes) and large domains (like lists of customers or products). In the latter category, we have one table in the legacy database that is quite large, so when the user needs a record from this table, the new system sends a query message to the old one to get search results. Note that in the future, this data will probably come from the performance cache or directly from an upstream system. It does not "belong" to the legacy app.
The user is waiting on this query to return. This is important.
The question is this: we are trying to keep the two systems as decoupled as possible. In that vein, all other transactions between the systems happen through a Biztalk interface. We can handle orchestration, indirection, mapping, and isolation using Biztalk.
However, for looking up data in the large table, we want to get the search results quickly. We don't need to translate the fields. We need message speed, but not message reliability. If the user cannot search, then the process of creating the business document is stopped anyway. (The process is similar to creating an order online... if you can't see the product catalog, it's hard to create the order). The data source is highly reliable already, so there is no need to improve on that in the messaging system.
So I have to consider: do we create a direct dependency between the new system and the old one by adding a direct call, completely circumventing Biztalk, or do we keep ALL of our connections running through Biztalk so that we can maintain all relationships in one place?
An image that illustrates this choice is here. The choice is whether or not to put in the direct dependency, illustrated as a blue arrow.
The development team chose to go ahead and add this dependency.
For reliable EAI calls, the transactions run through Biztalk. For direct queries, the developers went with a separate direct call to get the search results. An additional direct call was added to get the rest of the domain data.
So now the two systems are connected through two pathways. One dependency runs directly, to get domain data from legacy to new, while the other runs through a messaging interface.
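To make the two pathways concrete, here is a rough Python sketch of the difference in behavior. It is not the team's code, and it is not Biztalk's API. `ReliableChannel` and `DirectLookup` are names I made up for illustration: the first stands in for the hub path (retry, then park the message if delivery keeps failing), the second for the direct query (one synchronous read, no retry, no mapping).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ReliableChannel:
    """Hypothetical stand-in for the hub path: retry delivery, never lose the message."""
    deliver: Callable[[dict], None]
    max_attempts: int = 3
    dead_letter: List[dict] = field(default_factory=list)

    def send(self, message: dict) -> bool:
        for _ in range(self.max_attempts):
            try:
                self.deliver(message)
                return True
            except ConnectionError:
                continue  # transient failure: try again
        # All attempts failed: park the message so it can be handled later.
        self.dead_letter.append(message)
        return False


@dataclass
class DirectLookup:
    """Hypothetical stand-in for the direct path: one read, no retry, no translation."""
    table: Dict[str, str]

    def search(self, prefix: str) -> List[str]:
        # If this fails, the caller simply sees the error; nothing is queued.
        return sorted(name for name in self.table.values() if name.startswith(prefix))
```

The point of the contrast: the reliable path spends effort making sure a message is never lost, while the lookup path has nothing to lose. If a search fails, the user just searches again.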
The reasons for this decision are obscured from me at the moment, but I believe that, when I ask, I will hear: we wanted performance and it is slower to run through BTS (somewhat true).
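Rather than argue about "somewhat true," the claim can be measured. Below is a minimal timing harness in Python, offered as a sketch only. The `direct` and `via_hub` callables are placeholders that the team would have to wire up to the real direct query and the real hub round trip; nothing here knows about Biztalk.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_call(fn: Callable[[], object], repeats: int = 20) -> float:
    """Return the median wall-clock seconds for one call to fn."""
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare_paths(
    direct: Callable[[], object],
    via_hub: Callable[[], object],
    repeats: int = 20,
) -> Dict[str, float]:
    """Time the same search over both paths and report the difference."""
    direct_s = time_call(direct, repeats)
    hub_s = time_call(via_hub, repeats)
    return {
        "direct_s": direct_s,
        "hub_s": hub_s,
        "hub_overhead_s": hub_s - direct_s,  # extra time attributable to the hub hop
    }
```

If `hub_overhead_s` turns out to be small next to `direct_s`, then the query itself is the bottleneck and bypassing the hub buys very little.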
My choices:
1) Approve the design and don't worry about it. (always an option)
2) Ask that the performance cache be more formalized in future releases, so that I'm sure that the dependencies are centrally managed and that the cache isn't treated as a new data master. This may add complexity, and a constraint or two, but probably won't affect performance.
3) Kill the additional dependency and require that all data queries run through the Biztalk engine.
(I'm leaning towards #2.)
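For what "more formalized" in option 2 could look like, here is one possible shape, sketched in Python under my own assumptions. `ReadThroughCache`, its `load` callback, and the TTL are illustrative names and choices, not part of the actual design. The idea is that callers can only read, every value comes from the source system through a single `load` function, and entries expire, so the cache never becomes a data master.

```python
import time
from typing import Callable, Dict, Generic, Tuple, TypeVar

V = TypeVar("V")


class ReadThroughCache(Generic[V]):
    """Hypothetical read-only cache: values are loaded from the source, never written by callers."""

    def __init__(
        self,
        load: Callable[[str], V],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load          # the single, centrally managed dependency on the source
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, V]] = {}

    def get(self, key: str) -> V:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # still fresh: serve from the cache
        value = self._load(key)    # missing or stale: go back to the source
        self._entries[key] = (now, value)
        return value
```

There is deliberately no public write method. Whether `load` points at the legacy database today or at an upstream system later, the new interface does not need to know.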
What do you think?
Comments
Anonymous
January 20, 2007
Nick - I'm leaning toward #2 or #3 here because once the precedent for an exception like this is established, it will come up again and again. In time, it's no longer a precedent, but a de facto architectural standard. Has there been any strawman performance testing or simulation of this? Have the data volumetrics involved been measured or at least estimated? If speed is an issue, there are a number of ways of optimizing the framework such that direct DB calls don't have to be made. I'm thinking more along the lines of data model and/or DB server optimizations that could be applied while maintaining the integrity of the SOA structure - kind of a pseudo-cache, if you will. Option 1 is a band-aid that will eventually turn into a prosthetic... take a look at the other technical alternatives first.
Anonymous
January 21, 2007
Nick, is the performance concern related to the query time on the legacy DB, or is it the size of the payload being returned across your message bus? If it is the former, you will most likely not see a noticeable performance gain, because the bottleneck is on the source DB. If the payload size is the issue, then creating a point-to-point connection will optimize performance. As the previous reply stated, I would stick with #2 unless the expected performance is beyond the user's expectations. Brian
Anonymous
January 21, 2007
Hi Brian, The dev team is not all that familiar with SOA, and for them, anything having to do with Biztalk might just as well be a "Houdini call" (magic, even though they know better). I think the concern is that any call to an EAI bus has to be 'slow' because it's a call to an EAI bus, and not directly to the db. Aside from the fact that adding the call was a break from the design, I don't think it was wrong, per se. It means that the design needs to take the performance cache into account in Step 1, and not wait until Step 3 to formalize it. The architect is not always right. (I wish I were THAT good). I don't agree with the reasons, really, because I don't see it as a performance issue. That said, there are clear advantages to having a performance cache (which is why it was on the roadmap in the first place), and there is no particular reason why calls to the performance cache need to run through the EAI structure. All in all, I'm not able to come up with a good reason to force the team to option #3 other than purity of design, which is a pretty thin reason. Time to update the design spec. I'm going with option #2.