BizTalk: Latency Improvement Without Changing The Code
Introduction
This article describes an actual customer case with low-latency requirements in BizTalk Server. In this case, the average latency in BizTalk was reduced 22x without changing the code.
The Business Challenge
The customer was in the retail business and was working on a mobile phone app. BizTalk Server 2013 was used to handle back-end data integration. Development was almost complete, and load testing was in progress in a pre-production environment. Performance fell short of the customer's low-latency requirements. It was critical to keep latency to a minimum due to the synchronous nature of the user interaction. With a large customer base, excessive latency in the mobile phone app would be a showstopper.
Their criteria for approved latency were the following:
- Maximum response time of 1 second in 95% of HTTP calls
- No calls should exceed 3 seconds
The response time was measured from the time an HTTP call was sent until the response was received.
Their current test results showed the following: Maximum response time of 60 seconds, with an average of 12.4 seconds.
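The acceptance criteria can be checked mechanically against a set of measured response times. The following is a minimal sketch in Python, not the customer's actual test harness: the function names, the nearest-rank percentile method, and the sample values are assumptions made for illustration.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def meets_latency_criteria(response_times_s, p95_limit_s=1.0, max_limit_s=3.0):
    """Return (passed, p95, worst) for a list of response times in seconds."""
    p95 = percentile(response_times_s, 95)
    worst = max(response_times_s)
    passed = p95 <= p95_limit_s and worst <= max_limit_s
    return passed, p95, worst

# Hypothetical sample: 100 calls, 95 fast ones and 5 slower ones.
samples = [0.4] * 95 + [2.4] * 5
print(meets_latency_criteria(samples))
```

Both limits must hold at the same time: a run with a good 95th percentile still fails if a single call exceeds 3 seconds.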
Initial Analysis
The project team at this time consisted of a couple of BizTalk developers and several other IT professionals with various roles. They were stuck with the current test results and needed assistance. I was engaged as a short-term BizTalk consultant to analyze the situation and suggest improvements. There was one caveat, though. In general, a lot of performance and latency improvements lie in the solution architecture, design and code. Due to deadlines and internal policies, code changes would not be allowed at this point. In other words, all my suggestions would have to be implemented outside of Visual Studio.
Short description of the BizTalk environment:
- Two standalone BizTalk Server 2013 Enterprise servers
- Two standalone SQL Server 2008R2 servers. One instance used for the MessageBox, another for the rest
- All servers running Windows Server 2008R2 Enterprise
The analysis was conducted using the following tools:
- BizTalk Administration Console
- SQL Management Studio
- SQL Server Profiler
- Performance Monitor
- Performance Analysis of Logs (PAL)
- BizTalk Health Monitor
- BizTalk Best Practices Analyzer
It is important to discover weaknesses in both BizTalk and SQL Server, in addition to the platform itself (hardware etc.).
Analysis Results and Recommendations
This section contains findings, related recommendations, and which recommendations were actually implemented. Some items are general optimizations, and others are specific for this scenario. The list is not sorted by significance/impact.
Observation | Recommendation | Implemented? |
DTC and COM+ updates missing | Install KB2693187 and KB5775511 | x |
Latest BizTalk CU not installed | Install latest CU on both BizTalk servers | x |
No dedicated Tracking host | Create a dedicated Tracking host | x |
Lots of artifact tracking enabled | Turn off as much artifact tracking as possible, especially MsgBodyTracking | |
Large (and growing) Tracking database | Adjust the DTA Purge and Archive SQL job to keep data less than the current 30 days | x |
Large files processed by BizTalk (<30MB) | Avoid large file processing in BizTalk. Consider redesign or SSIS | |
Custom monitoring tools which require custom code | Consider using BizTalk360 or SCOM instead | |
SQL Server AlwaysOn is enabled | AlwaysOn is not supported in BizTalk 2013, and should be disabled | x |
Several user databases (non-BizTalk) on the BizTalk instances | Keep BizTalk databases on dedicated instance(s) | |
Registry optimizations for BizTalk not found | Tune BizTalk servers using these Registry settings | x |
Unnecessary Windows Services running | Stop and disable Print Spooler | x |
Real-time antivirus scanning running | Create exceptions for BizTalk, or disable altogether | x |
CPU and RAM not sufficient on BizTalk and SQL servers | Upgrade CPU and RAM on all servers | x |
BizTalk hosts not optimized for low-latency | Design and implement a host design according to best practice, and tune them to achieve low-latency | x |
XMLTransmit and XMLReceive pipelines in use | Use PassThru pipelines when possible | x |
BizTalk host instance CLR settings not optimized | Change settings according to best practice | x |
SQL send ports not optimized | Change settings for some ports | x |
Traffic from other integration applications in same environment | Consider a dedicated BizTalk environment if traffic from other applications is significant | |
Solution design and code not optimized for low-latency * | Perform code review, rewrite if needed. In general, stick to Messaging solutions when possible, avoid roundtrips to MessageBox, reduce persistence points, follow best practices. | |
*) Included for reference, as this would not be done at this stage in the project
Only one change was implemented at a time, followed by new tests.
[Figure: flowchart illustrating the change-and-test process]
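The one-change-at-a-time procedure can also be sketched as a simple loop. This is only an illustration: `run_tuning_iterations`, `fake_apply` and `fake_load_test` are hypothetical names, and the latency numbers produced by the stand-in functions are made up, not results from this project.

```python
def run_tuning_iterations(changes, apply_change, run_load_test):
    """Apply changes one at a time and record the load-test result after each one."""
    history = [("baseline", run_load_test())]
    for change in changes:
        apply_change(change)
        history.append((change, run_load_test()))
    return history

# Stand-ins so the sketch runs; the numbers are invented, not measured.
applied = []

def fake_apply(change):
    applied.append(change)

def fake_load_test():
    # Pretend the average latency (in seconds) drops as more changes are applied.
    return round(10.0 / (1 + len(applied)), 2)

history = run_tuning_iterations(
    ["dedicated tracking host", "PassThru pipelines"], fake_apply, fake_load_test
)
for step, avg_s in history:
    print(f"{step}: {avg_s} s")
```

Recording a result after every single change makes it possible to attribute an improvement or a regression to one specific change.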
Implementation details for most items in the table above are self-explanatory. Here are some details for selected items:
Recommendation | Implementation Details |
Adjust the DTA Purge and Archive SQL job to keep data less than the current 30 days | @nHardDeleteDays changed from 30 to 14 |
AlwaysOn is not supported in BizTalk 2013, and should be disabled | Changed Availability Mode from Synchronous commit to Asynchronous commit. This was chosen due to non-BizTalk databases on the instances |
Tune BizTalk servers using these Registry settings | Used all Registry settings in this file |
Design and implement a host design according to best practice, and tune them to achieve low-latency | Created dedicated low-latency hosts, and configured the integration applications to use them. Used the following settings for Processing: [Screenshot: host Processing settings] |
BizTalk host instances setting: Change settings according to best practice | Configured the low-latency host instances using the following settings: [Screenshot: host instance settings] |
SQL Send Ports: Change settings for some ports | Changed port settings (SELECT queries only): [Screenshot: SQL send port settings] |
Final Results
As mentioned previously, testing was performed after each change. Here are the results after all the changes described above. Note that the results varied between repeated test runs, and these are the best results.
Maximum response time of 1.1 seconds in 95% of HTTP calls. The overall maximum was 2.4 seconds, with an average of 564 ms. This means the average latency was improved 22x without changing the code. It should be mentioned that the CPU and RAM upgrade had a big impact on performance.
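The 22x figure follows directly from the averages before and after tuning. A short worked calculation in Python, using only the numbers reported in this article (the variable names are illustrative):

```python
baseline_avg_s = 12.4   # average response time before tuning
final_avg_s = 0.564     # average response time after tuning (564 ms)
baseline_max_s = 60.0   # maximum response time before tuning
final_max_s = 2.4       # maximum response time after tuning

avg_factor = baseline_avg_s / final_avg_s
max_factor = baseline_max_s / final_max_s

print(f"Average latency improved {avg_factor:.1f}x")
print(f"Maximum latency improved {max_factor:.1f}x")
```

The same comparison applied to the maximum response time gives a factor of about 25x.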
Conclusion
The requirements were met with regard to maximum response time, but not entirely with regard to the 95th percentile. The results had some variance when the tests were run consecutively without making changes in between, for unknown reasons. The tests were based on a very high concurrent volume, and were probably slightly unrealistic. The project was eventually launched after some delays. The mobile app has now been downloaded more than 100,000 times, with a high average user rating.
This case shows how much impact BizTalk Administration can have on critical projects. The results can be improved further by optimizing the code. It is important to keep such requirements in mind when designing and implementing low-latency projects.
See Also
- BizTalk Server 2013 Performance Optimization Guide
- Low-Latency Scenario Optimizations
- 10x latency improvement – how to squeeze performance out of your BizTalk solution
Another important place to find an extensive amount of BizTalk related articles is the TechNet Wiki itself. The best entry point is BizTalk Server Resources on the TechNet Wiki.