Understanding Database and Log Performance Factors
Applies to: Exchange Server 2010 SP3, Exchange Server 2010 SP2
This topic discusses database and log I/O performance factors in Microsoft Exchange Server 2010. An understanding of these factors is important to your Mailbox server storage design solution. For more information about other key aspects of the design process, see Mailbox Server Storage Design.
Contents
Transactional I/O
Understanding IOPS
Non-Transactional I/O
Transactional I/O
Transactional I/O is generally defined as the I/O generated by user activity. Examples of user activity include receiving, sending, and deleting items; syncing a Windows Mobile client; and logging on via Microsoft Office Outlook Web App.
Transactional I/O is a critical piece of Exchange 2010 storage design because the I/O latency (how long it takes to execute the I/O operation) can directly affect the user experience of online clients such as Microsoft Outlook Online Mode and Outlook Web App. Cached Exchange Mode in Outlook can also be affected by high I/O latency when it's being used for tasks such as delegate access and configuring rules. All clients can be affected by e-mail delivery delays caused by high latency I/O. Transactional I/O can be divided into database volume I/O and log volume I/O.
The transactional I/O requirements in Exchange 2010 have been reduced from those in Exchange Server 2007. Not all I/O that occurs against the Mailbox database and log volumes is considered transactional. For more information, see Understanding the Exchange 2010 Store.
Understanding IOPS
For all versions of Exchange, it's important to understand the amount of database I/O per second (IOPS) consumed by each user because it's one of the key transactional I/O metrics needed for adequately sizing storage. The following sections discuss factors that affect IOPS when designing your Mailbox server role storage.
Database Cache
A 64-bit edition of the Windows Server operating system running the 64-bit version of Exchange 2010 substantially increases the virtual address space and allows Exchange to increase its database cache, reduce database read I/O, and enable up to 100 databases per server.
The database read reduction depends on the amount of database cache available to the server and the user message profile. For guidance about memory and databases, see Understanding the Mailbox Database Cache. Following the guidance in that topic can result in up to a 90 percent transactional I/O reduction over Exchange Server 2003. The amount of database cache per user is a key factor in the actual I/O reduction.
The following table shows the increase in database cache per mailbox when comparing the default 900 megabytes (MB) of total database cache in Exchange 2003 with 6 MB of database cache per mailbox in Exchange 2010, for a user population that uses a 100 messages per day profile. The additional database cache in Exchange 2010 enables more read hits in cache, which reduces database reads at the disk level.
Database cache sizes based on mailbox count
Mailbox count | Exchange 2003 database cache per mailbox (MB) | Exchange 2010 database cache per mailbox (MB) | Database cache increase over Exchange 2003 |
---|---|---|---|
4000 | 0.225 | 6 | 27 times |
2000 | 0.45 | 6 | 13 times |
1000 | 0.9 | 6 | 7 times |
500 | 1.8 | 6 | 3 times |
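The arithmetic behind the table can be sketched in a few lines of Python. This is an illustration only: it assumes the 900 MB figure is the total database cache on an Exchange 2003 server (which is what the per-mailbox values imply), and the function and constant names are hypothetical.

```python
# Assumption: 900 MB is the total Exchange 2003 database cache per server.
EXCHANGE_2003_CACHE_MB = 900
# 6 MB per mailbox is the Exchange 2010 figure for a 100 messages/day profile.
EXCHANGE_2010_CACHE_PER_MAILBOX_MB = 6


def cache_comparison(mailbox_count):
    """Return (Exchange 2003 cache per mailbox in MB, increase factor)."""
    per_mailbox_2003 = EXCHANGE_2003_CACHE_MB / mailbox_count
    factor = EXCHANGE_2010_CACHE_PER_MAILBOX_MB / per_mailbox_2003
    return per_mailbox_2003, factor


for count in (4000, 2000, 1000, 500):
    per_mailbox, factor = cache_comparison(count)
    print(f"{count} mailboxes: {per_mailbox:.3f} MB each, about {factor:.1f}x")
```

The fewer mailboxes a server hosts, the smaller the relative gain, because the fixed Exchange 2003 cache was already spread across fewer users.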
Determining the Exchange 2010 Mailbox IOPS Profile
The two most significant factors that can be used to predict Exchange 2010 database IOPS are the amount of database cache per user and the number of messages each user sends and receives per day. The following table is based on a standard worker who uses Outlook 2010 in Cached Exchange Mode. The information has been tested to be accurate within plus or minus 20 percent. Other client types and usage scenarios may yield inaccurate results. The predictions are only valid for user database cache sizes between 3 MB and 30 MB. The information hasn't been validated in a scenario where users send and receive over 500 messages per day. The average message size for validation was 75 KB, but message size isn't a primary factor for IOPS.
The table provides estimated values for IOPS per user that you can use to predict your baseline Exchange 2010 IOPS requirements and includes all database I/O (database, content indexing, and NTFS metadata). It doesn't include log volume I/O.
Database cache and estimated IOPS per mailbox based on message activity
Messages sent/received per mailbox per day | Database cache per mailbox (MB) | Single database copy (stand-alone): Estimated IOPS per mailbox | Multiple database copies (mailbox resiliency): Estimated IOPS per mailbox |
---|---|---|---|
50 | 3 | 0.060 | 0.050 |
100 | 6 | 0.120 | 0.100 |
150 | 9 | 0.180 | 0.150 |
200 | 12 | 0.240 | 0.200 |
250 | 15 | 0.300 | 0.250 |
300 | 18 | 0.360 | 0.300 |
350 | 21 | 0.420 | 0.350 |
400 | 24 | 0.480 | 0.400 |
450 | 27 | 0.540 | 0.450 |
500 | 30 | 0.600 | 0.500 |
Mailbox resiliency refers to a unified high availability and site resilience solution in Exchange 2010. For more information, see Understanding High Availability and Site Resilience.
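As a rough illustration of how the table above might be used for a baseline estimate, the following Python sketch looks up a message profile and scales it by mailbox count. The function name and the example figures are hypothetical. The sketch deliberately refuses profiles that aren't in the table rather than interpolating, because the guidance is only validated for the listed values.

```python
# messages per day -> (cache MB per mailbox, stand-alone IOPS, mailbox resiliency IOPS)
IOPS_PROFILE = {
    50: (3, 0.060, 0.050),
    100: (6, 0.120, 0.100),
    150: (9, 0.180, 0.150),
    200: (12, 0.240, 0.200),
    250: (15, 0.300, 0.250),
    300: (18, 0.360, 0.300),
    350: (21, 0.420, 0.350),
    400: (24, 0.480, 0.400),
    450: (27, 0.540, 0.450),
    500: (30, 0.600, 0.500),
}


def estimate_database_iops(mailboxes, messages_per_day, resilient):
    """Return (estimated database IOPS, total database cache in MB)."""
    if messages_per_day not in IOPS_PROFILE:
        raise ValueError("profile not in the published table")
    cache_mb, standalone, mailbox_resiliency = IOPS_PROFILE[messages_per_day]
    per_mailbox = mailbox_resiliency if resilient else standalone
    return mailboxes * per_mailbox, mailboxes * cache_mb


# Hypothetical server: 4,000 mailboxes, 100 messages/day, mailbox resiliency.
iops, cache_mb = estimate_database_iops(4000, 100, resilient=True)
print(f"about {iops:.0f} database IOPS, {cache_mb} MB of database cache")
```

The result covers database volume I/O only and carries the same plus or minus 20 percent tolerance as the table itself.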
Database Volume I/O
Database volume I/O is I/O associated with database file (.edb) read/write activity, content indexing read/write activity, as well as NTFS metadata read/write activity.
In Exchange 2003, the database read/write ratio is typically 2:1 or 66 percent reads. With Exchange 2010, the larger database cache decreases the number of reads to the database on disk causing the reads to shrink as a percentage of total I/O.
If you follow the recommended memory guidelines, you can expect to see the following I/O ratios for active database copies. For more information about the memory guidelines, see Understanding Memory Configurations and Exchange Performance. This measurement includes all database volume I/O (database, content indexing and NTFS metadata); it doesn't include log volume I/O.
Mailbox database I/O read/write ratios
Messages sent/received per mailbox per day | Stand-alone databases | Databases participating in mailbox resiliency |
---|---|---|
50 | 1:1 | 3:2 |
100 | 1:1 | 3:2 |
150 | 1:1 | 3:2 |
200 | 1:1 | 3:2 |
250 | 1:1 | 3:2 |
300 | 2:3 | 1:1 |
350 | 2:3 | 1:1 |
400 | 2:3 | 1:1 |
450 | 2:3 | 1:1 |
500 | 2:3 | 1:1 |
For example, if you deploy 24,000 mailboxes with a profile of 250 or fewer messages per day across Mailbox servers within a database availability group (DAG) that maintains three database copies, each database has a read-to-write ratio of 3:2. In other words, 60 percent of all I/Os to the logical unit number (LUN) hosting the database are read I/Os.
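The conversion from a read:write ratio to a read percentage is simple enough to sketch in Python. The helper name is hypothetical and is shown only to make the arithmetic explicit.

```python
def read_percentage(ratio):
    """Convert a 'reads:writes' ratio string into the percentage of reads."""
    reads, writes = (int(part) for part in ratio.split(":"))
    return 100 * reads / (reads + writes)


for ratio in ("3:2", "1:1", "2:3"):
    print(f"{ratio} -> {read_percentage(ratio):.0f} percent reads")
```

A 2:3 ratio means writes make up the majority of database I/O, which is the case that matters most when weighing RAID types with a high write cost.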
Having more writes as a percentage of total I/O has specific implications when choosing a redundant array of independent disks (RAID) type that has significant costs associated with writes, such as RAID5 or RAID6. For more information about selecting the appropriate RAID solution for your servers, see Understanding Storage Configuration.
Calculating IOPS per Mailbox Server
Calculating IOPS per Mailbox server in Exchange 2010 requires more steps than in previous versions of Exchange because of the following:
You can now combine databases and logs on the same volume.
You can host both active and passive database copies on the same server.
Sequential I/O background tasks have been added (for example, background database maintenance).
Pure sequential I/O operations aren't factored in the IOPS per Mailbox server calculation because storage subsystems can handle sequential I/O much more efficiently than random I/O. These operations include background database maintenance, log transactional I/O, and log replication I/O.
IOPS per Mailbox server is calculated slightly differently depending on how your storage is designed:
Database files and log files share a single volume.
Database files are stored on different disk volumes than the transaction log files.
For both storage designs, use Performance Monitor (perfmon.exe) to measure the peak two-hour period (at a 5-second sampling interval). This is the time of day when the system is under the most load generated by client activity (for example, 10 A.M. to 12 P.M.). The load during this period is often twice the 10-hour daily average (Peak:Average ratio = 2:1).
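If the counter samples have been exported from the performance log, the peak period can be located with a sliding window. The following Python sketch is one possible approach, not a Performance Monitor feature. It assumes the samples are a plain list of evenly spaced values, and the function name is hypothetical.

```python
def peak_window_average(samples, interval_seconds=5, window_hours=2):
    """Return (start index, average) of the busiest window in the samples."""
    window = int(window_hours * 3600 / interval_seconds)
    if window <= 0 or len(samples) < window:
        raise ValueError("not enough samples for one full window")

    # Slide the window one sample at a time, updating a running sum.
    running = sum(samples[:window])
    best_sum, best_start = running, 0
    for start in range(1, len(samples) - window + 1):
        running += samples[start + window - 1] - samples[start - 1]
        if running > best_sum:
            best_sum, best_start = running, start
    return best_start, best_sum / window
```

At a 5-second interval, a two-hour window spans 1,440 samples, so a full working day of data is needed before the result is meaningful.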
IOPS per Mailbox Server: Database Files and Log Files Share a Single Volume
In this configuration, the database files and log files are stored on the same disk volume. This example assumes each database is on a different volume backed by a dedicated disk. Fill in the following table for all databases from the collected performance monitor log (described in the previous section).
Database Name | Logical Disk -> Disk Reads/sec | Logical Disk -> Disk Writes/sec | MSExchange Database ==> Instances -> Database Maintenance IO Reads/sec | MSExchange Database ==> Instances -> I/O Database Reads (Recovery)/sec | MSExchange Database ==> Instances -> I/O Database Writes (Recovery)/sec | MSExchange Database ==> Instances -> IO Log Writes/sec |
---|---|---|---|---|---|---|
Database 1 | | | | | | |
Database 2 | | | | | | |
Database 3 | | | | | | |
Database 4 | | | | | | |
Any additional databases | | | | | | |
Total | | | | | | |
Add the totals from each column, and then perform the following calculation to determine IOPS per Mailbox server.
Calculation summary: (Sum of logical disk I/O - (sum of database maintenance I/O + recovery (log replay) I/O + log I/O)), divided by the number of mailboxes hosted per server during the performance monitor log measurement.
Calculation Detail: ((Logical Disk -> Disk Reads/sec + Logical Disk -> Disk Writes/sec) - (MSExchange Database ==> Instances -> Database Maintenance IO Reads/sec + MSExchange Database ==> Instances -> I/O Database Reads (Recovery)/sec + MSExchange Database ==> Instances -> I/O Database writes (Recovery)/sec + MSExchange Database ==> Instances -> IO Log Writes/sec))/ Number of mailboxes hosted per server during the performance monitor log measurement = IOPS per Mailbox server.
IOPS per Mailbox Server: Dedicated Database File Volume
In this configuration, the database files are stored on different disk volumes than the transaction log files. This example assumes each database is on a different volume backed by a dedicated disk. Fill in the following table for all databases from the collected perfmon log (described in the previous section).
Database Name | Logical Disk -> Disk Reads/sec | Logical Disk -> Disk Writes/sec | MSExchange Database ==> Instances -> Database Maintenance IO Reads/sec | MSExchange Database ==> Instances -> I/O Database Reads (Recovery)/sec | MSExchange Database ==> Instances -> I/O Database Writes (Recovery)/sec |
---|---|---|---|---|---|
Database 1 | | | | | |
Database 2 | | | | | |
Database 3 | | | | | |
Database 4 | | | | | |
Any additional databases | | | | | |
Total | | | | | |
Note
By default, the MSExchange Database ==> Instances -> Database Maintenance IO Reads/sec performance counter isn't visible in Exchange 2010. You must enable this counter to view it. For more information about how to enable this performance counter, see How to Enable Extended ESE Performance Counters.
To determine IOPS per Mailbox server, add the totals from each column and perform the following calculation.
Calculation summary: (Sum of logical disk I/O - (sum of database maintenance I/O + recovery (log replay) I/O)), divided by the number of mailboxes hosted per server during the performance monitor log measurement.
Calculation Detail: ((Logical Disk -> Disk Reads/sec + Logical Disk ->Disk Writes/sec) - (MSExchange Database ==> Instances -> Database Maintenance IO Reads/sec + MSExchange Database ==> Instances -> I/O Database Reads (Recovery)/sec + MSExchange Database ==> Instances -> I/O Database writes (Recovery)/sec))/ Number of mailboxes hosted per server during the performance monitor log measurement = IOPS per Mailbox server.
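Both calculations can be expressed with one small Python sketch. The class, field, and function names are hypothetical, and the counter values in the example are invented purely for illustration. The only difference between the two storage designs is whether log writes are subtracted, so the sketch defaults that field to zero for the dedicated database volume case. Because the total is divided by the mailbox count, the result is a per-mailbox figure.

```python
from dataclasses import dataclass


@dataclass
class DatabaseCounters:
    disk_reads: float          # Logical Disk -> Disk Reads/sec
    disk_writes: float         # Logical Disk -> Disk Writes/sec
    maintenance_reads: float   # Database Maintenance IO Reads/sec
    recovery_reads: float      # I/O Database Reads (Recovery)/sec
    recovery_writes: float     # I/O Database Writes (Recovery)/sec
    log_writes: float = 0.0    # IO Log Writes/sec; leave 0 when logs have their own volume


def iops_per_mailbox(databases, mailbox_count):
    """Subtract sequential I/O from logical disk I/O, then divide by mailboxes."""
    total = sum(d.disk_reads + d.disk_writes for d in databases)
    sequential = sum(
        d.maintenance_reads + d.recovery_reads + d.recovery_writes + d.log_writes
        for d in databases
    )
    return (total - sequential) / mailbox_count


# Invented sample values for two databases that share a volume with their logs.
measured = [
    DatabaseCounters(200, 150, 20, 10, 10, log_writes=30),
    DatabaseCounters(180, 120, 20, 0, 0, log_writes=20),
]
print(f"{iops_per_mailbox(measured, 4500):.3f} IOPS per mailbox")
```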
Measure Baseline IOPS
If you're using a previous version of Exchange, and you have calculated your baseline IOPS, keep in mind that Exchange 2010 affects your baseline in the following ways:
The number of users on the server affects the overall database cache per user.
The amount of RAM influences how large your database cache can grow, and a larger database cache causes more cache read hits. This reduces your database read I/O.
The key to this process is that the IOPS on a specific server isn't enough information to plan an entire enterprise. This is because the amount of RAM, number of users, and number of databases will be different on each server. After you have your actual IOPS numbers, always apply a 20 percent I/O overhead factor to your calculations to add some reserve capacity. You don't want a poor user experience because activity is heavier than normal.
Desktop Search Engines and Outlook Online Mode Clients
Unlike Cached Exchange Mode clients, all Online Mode client operations occur against the database. Because of the changes in the store schema and Extensible Storage Engine (ESE), Outlook Online Mode clients now generate the same I/O profile as Outlook Cached Exchange Mode clients.
In terms of mailbox search capabilities, end users have two options:
They can use the built-in content index that's available on the Mailbox server.
They can install a desktop search engine client and have a local index generated on the client of the mailbox's data and perform local searches.
End users who use desktop search engine clients with Outlook Online Mode may incur additional read I/O operations against the database. Currently, the only known desktop search engine that doesn't incur additional read I/Os is Windows Desktop Search 4.0. Windows Desktop Search 4.0 indexes the mailbox contents by using synchronization protocols similar to those that Outlook uses in Cached Exchange Mode.
Therefore, use the following guidelines if you intend to deploy Outlook Online Mode clients with desktop search engines other than Windows Desktop Search 4.0:
Online Mode clients with 256 MB mailboxes will increase database read operations by a factor of 1.5 when compared with Cached Exchange Mode clients. Below 256 MB, the impact is negligible.
As mailbox size doubles, the database read IOPS will also double (assuming that the item distribution between key folders remains the same).
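One way to read these two guidelines together is as a simple scaling rule, sketched below in Python. This is an interpretation rather than a published formula: it assumes the 256 MB figure refers to mailbox size, that the factor stays at 1.0 below 256 MB, and that reads scale linearly with mailbox size above that point. The function name is hypothetical.

```python
def online_mode_read_factor(mailbox_size_mb):
    """Estimated database read multiplier versus Cached Exchange Mode clients."""
    if mailbox_size_mb < 256:
        return 1.0  # impact described as negligible below 256 MB
    # 1.5x at 256 MB, doubling each time the mailbox size doubles.
    return 1.5 * mailbox_size_mb / 256


for size in (128, 256, 512, 1024):
    print(f"{size} MB mailbox: about {online_mode_read_factor(size):.1f}x reads")
```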
As a result of this data, we have two recommendations:
Deploy Cached Exchange Mode clients where appropriate. For more information, see the "Item Count per Folder" section later in this topic. Otherwise, replace the desktop search engine with Windows Desktop Search 4.0.
Consider the I/O requirements when you're designing the database storage.
For additional IOPS factors, such as third-party clients, see Optimizing Storage for Exchange Server 2003.
Log Volume I/O
Log volume I/O is I/O associated with database logging read/write activity and NTFS metadata read/write activity. Log volume I/O is sequential in nature and, when using a battery-backed write caching array controller, the I/O overhead of log volume I/O is minimal and not a significant factor for Exchange storage sizing.
Because of the reduction in database reads in Exchange 2010, combined with the smaller log file size and the ability to have more databases, the log-to-database write ratio is 40 percent for stand-alone databases and 50 percent for databases that participate in mailbox resiliency. For example, if a database that's participating in mailbox resiliency consumes 12 write I/Os, the log LUN consumes approximately 6 write I/Os.
On Mailbox servers that are hosting databases that are participating in mailbox resiliency, there is overhead associated with using continuous replication. Closed transaction logs must be read and sent to the target database copies. This overhead is an additional 10 percent in log reads for each active database copy that's hosted on the Mailbox server. For example, if the Mailbox server is hosting 10 active database copies, and each transaction log stream is generating 6 write I/Os, you can expect an additional 0.6 read I/Os for each of those 10 active database copies (or a total of 6 read I/Os).
After you measure or predict the transactional log I/O, apply a 20 percent I/O overhead factor to ensure adequate room for busier-than-normal periods.
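The log volume guidance in this section can be combined into a rough per-database estimate. The following Python sketch uses only the percentages stated above. The function name and return layout are hypothetical, and the sketch assumes that stand-alone databases incur no replication reads.

```python
def log_volume_io(db_write_iops, resilient, overhead=0.20):
    """Return (log writes, log reads, total with overhead) for one active copy."""
    write_ratio = 0.50 if resilient else 0.40   # log-to-database write ratio
    log_writes = db_write_iops * write_ratio
    # Continuous replication reads closed logs: about 10 percent of log writes.
    log_reads = log_writes * 0.10 if resilient else 0.0
    total = (log_writes + log_reads) * (1 + overhead)
    return log_writes, log_reads, total


writes, reads, total = log_volume_io(12, resilient=True)
print(f"log writes {writes:.1f}, log reads {reads:.1f}, with overhead {total:.2f}")
```

For a server hosting several active copies, sum the per-database results. Ten active copies that each generate 6 log writes add about 6 log reads in total, which matches the example above.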
Item Count per Folder
One way to reduce server I/O is to use Outlook in Cached Exchange Mode. The initial mailbox synchronization is a disk intensive operation, but over time, as the mailbox size grows, the disk subsystem burden is shifted from the Exchange server to the Outlook client. With use of Cached Exchange Mode, having a large number of items in a user's Inbox or a user searching a mailbox will have little effect on the server. This approach also means that Cached Exchange Mode users with large mailboxes may need faster computers than those with small mailboxes (depending on the individual user threshold for acceptable performance).
When you deploy client computers that are running Outlook 2007 in Cached Exchange Mode, consider the following guidelines with respect to mailbox/.ost file sizes:
Up to 5 gigabytes (GB): This size should provide a good user experience on most hardware.
Between 5 GB and 10 GB: This size is typically hardware dependent. Therefore, if you have a fast hard disk and a lot of RAM, your experience will be better. However, slower hard drives, such as drives that are typically found on laptops or early generation solid-state drives (SSDs), experience some application pauses when the drives respond.
More than 10 GB: This is the size at which short pauses begin to occur on most hardware.
Very large, such as 25 GB or larger: This size increases the frequency of the short pauses, especially while you're downloading new e-mail messages. Alternatively, you can use Send/Receive groups to manually synchronize your mail.
This guidance is based on the installation of a cumulative update for Outlook 2007 Service Pack 1 or later, as described in Microsoft Knowledge Base Article 961752, Description of the Outlook 2007 hotfix package (Outlook.msp): February 24, 2009.
If you experience performance-related issues with Outlook 2007 in Cached Exchange Mode deployment, see Knowledge Base Article 940226, How to troubleshoot performance issues in Outlook 2007. For more information about the improvements that are available, see Knowledge Base article 968009, Outlook 2007 improvements in the February 2009 cumulative update.
A challenging scenario occurs when a user has exceeded the number of indexes that Exchange will store. This is 11 indexes in Exchange 2010. When the user chooses to sort a new way, and thereby creates a twelfth index, this causes additional disk I/O activity. Because the index isn't stored, this additional disk activity cost occurs every time that this sort is performed. Because of the high I/O activity that can be generated in this scenario, we strongly recommend that you store no more than 100,000 items in core folders, such as the Inbox and Sent Items folders, and no more than 10,000 items in the Calendar and Contacts folders. The creation of more top-level folders, or of subfolders beneath the Inbox and Sent Items folders, greatly reduces the costs that are associated with this index creation. This is true as long as the number of items in any folder doesn't exceed 100,000.
Content Index I/O
In Exchange 2010, messages are indexed as they're received, causing little database disk I/O overhead (because the message is still in the database cache when it's retrieved for indexing). However, write I/O is associated with updating the search catalog store. Because of the overall database I/O reductions in Exchange 2010, the percentage of search catalog I/O is now 10 percent to 15 percent of the database files I/O (depending upon profile). Search catalog read I/O occurs when clients issue search queries, and it's a rare enough occurrence not to be relevant to Exchange 2010 storage design.
Non-Transactional I/O
Transactional I/O occurs in response to direct user action and usually has the highest priority, and therefore, it's the focus for storage design. Non-transactional I/O either occurs in the background and is tuned to have a minimal performance impact, or it occurs during a defined maintenance window.
The following sections discuss some of the non-transactional I/O that occurs in the background. Although non-transactional I/O isn't the focus of storage design, it can impact your storage design. For more information, see New Exchange Core Store Functionality.
Background Database Maintenance (Checksumming)
Background database maintenance I/O is sequential database file I/O associated with checksumming both active and passive database copies. Background database maintenance has the following characteristics:
On active databases, it can be configured to run either 24 × 7 or during the online maintenance window. Background database maintenance (Checksum) runs against passive database copies 24 × 7. For more information, see "Online Database Scanning" in the New Exchange Core Store Functionality topic.
Reads approximately 5 MB per second for each actively scanning database (both active and passive copies). The I/O is 100 percent sequential, so the storage subsystem can process the I/Os efficiently.
Stops scanning the database if the checksum pass completes in less than 24 hours.
Issues a warning event if the scan doesn't complete within three days (not configurable).
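The scan rate gives a quick way to estimate how long one checksum pass takes for a database of a given size. The Python sketch below is a back-of-the-envelope illustration. It assumes a steady 5 MB per second, treats 1 GB as 1,024 MB, and uses hypothetical names.

```python
SCAN_RATE_MB_PER_SEC = 5       # approximate rate per actively scanning database
WARNING_THRESHOLD_HOURS = 72   # warning event if a pass takes longer than three days


def checksum_pass_hours(database_size_gb):
    """Estimated hours for one full background maintenance pass."""
    return database_size_gb * 1024 / SCAN_RATE_MB_PER_SEC / 3600


for size_gb in (500, 1024, 2048):
    hours = checksum_pass_hours(size_gb)
    status = "exceeds" if hours > WARNING_THRESHOLD_HOURS else "within"
    print(f"{size_gb} GB: about {hours:.1f} hours ({status} the three-day threshold)")
```

Under these assumptions, a pass over a database of roughly 1.2 terabytes takes close to three days, so larger databases would be expected to trigger the warning event.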
Messaging Records Management
Messaging records management (MRM) is the records management technology in Exchange 2010 that helps organizations reduce the legal risks associated with e-mail. MRM makes it easier to retain the messages that are needed to comply with company policy, government regulations, or legal needs, and to remove content that has no legal or business value.
These actions are accomplished through the use of retention policies or managed folders. The Managed Folder Assistant is a Microsoft Exchange Mailbox Assistant that applies message retention settings configured in retention policies or managed folder mailbox policies. The disk I/O required by the assistant depends on the number of mailbox items processed. We recommend that the assistant not run at the same time as either backup or online maintenance. For more information, see Configure the Managed Folder Assistant.
Online Maintenance
You can use the Exchange Management Tools to set the maintenance schedule for a database or allow 24 × 7 database maintenance. Online defragmentation no longer works in Exchange 2010 as it did in previous versions of Exchange. Instead, online defragmentation is performed continuously while the database is being read from and written to. For more information, see "Online Database Scanning" in the New Exchange Core Store Functionality topic.
© 2010 Microsoft Corporation. All rights reserved.