Monitoring Performance
To maintain system performance, you must monitor your server to identify potential bottlenecks. Windows Server 2003 provides the Performance tool. This allows you to monitor generic and application-specific components.
Generic Performance Objects
There are a number of generic performance counters that you should monitor with Performance for any server system. The following table outlines these generic Performance Objects.
Performance Object |
Description |
Processor |
You should monitor processor performance to ensure that overall processor usage does not remain consistently high (over 80 percent). |
Network Interface |
Monitor the rate at which data is sent and received via the network interface card. This should remain below 50 percent of network capacity. |
Disks and Cache |
There are a number of logical disk counters that you should monitor regularly. The available disk space is essential in any capacity study, but you should also review the time that the disk is idle; a consistently low idle time may indicate that disks are overloaded. Depending on the types of applications or services you are running on your servers, you may also review disk read and write times. Extended queuing for read or write operations will affect performance. The cache has a major effect on read and write operations. You must monitor for increased cache failures. |
Memory and Paging File |
Monitor the amount of physical memory available for allocation. Insufficient memory will lead to excessive use of the page file and an increase in the number of page faults per second. |
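The processor guidance above hinges on usage that stays high rather than on isolated spikes. The following is a minimal Python sketch of one way to make that distinction, assuming you have already collected a series of % Processor Time samples. The function name, the 90 percent `min_fraction` tolerance, and the sample values are illustrative assumptions, not part of the Performance tool.

```python
from typing import Sequence


def is_consistently_high(samples: Sequence[float],
                         threshold: float = 80.0,
                         min_fraction: float = 0.9) -> bool:
    """Return True when at least min_fraction of the samples exceed threshold.

    threshold defaults to the 80 percent figure from the table above;
    min_fraction is an assumed tolerance for the occasional low reading.
    """
    if not samples:
        return False
    over = sum(1 for s in samples if s > threshold)
    return over / len(samples) >= min_fraction


# Hypothetical % Processor Time readings.
spiky = [35.0, 95.0, 40.0, 38.0, 92.0, 41.0]
sustained = [88.0, 91.0, 86.0, 93.0, 79.0, 90.0, 89.0, 94.0, 87.0, 92.0, 96.0]

print(is_consistently_high(spiky))      # False: only two of six samples are over 80
print(is_consistently_high(sustained))  # True: ten of eleven samples are over 80
```

Short bursts of activity are ignored, so only a sustained load is flagged for follow-up.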
SharePoint Performance Objects
Windows SharePoint Services 3.0 and Office SharePoint Server 2007 provide additional performance objects that you may use with the Performance tool. These focus on search and indexing functions. The following table outlines these SharePoint performance objects.
Performance Object |
Description |
SharePoint Search Archival Plug-in |
This performance object reports on the number of queued documents. As the number of queued documents increases, user request time will also increase. |
SharePoint Search Gatherer and SharePoint Search Gatherer Process |
The Search Gatherer object reports the number of threads that are in a wait state while a document access request is serviced. As this number increases, user request response time will also increase. The Search Gatherer Process indicates the number of documents waiting to be processed. An increase in this number may indicate a processor bottleneck. |
SharePoint Search Indexer Catalog |
This object shows the size and the status of indexing processes, such as merges. |
The Performance tool enables you to create traces that provide continuous records of performance object activity. You can do this using the Windows interface or, if you want to include Event Trace Session logs or Performance logs in a script, from the command line. The command-line tool is logman.exe. You can use its syntax to create, start, and stop monitoring. The example below creates a counter log that records processor time as a percentage, together with available memory, over two hours on a given date, and then saves the data to a log file.
logman create counter Processor_log -b 6/6/2007 13:00:00 -e 6/6/2007 15:00:00
-r -v mmddhhmm -c "\Processor(_Total)\% Processor Time" "\Memory\Available bytes"
-si 00:15 -o "c:\perflogs\daily_log"
Full syntax is available by typing the following command at a command prompt.
logman -?
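If you script log creation for several servers or dates, it can help to assemble the logman arguments programmatically. The sketch below is one possible approach in Python. It only builds the argument list, using the same options as the example above (-b, -e, -r, -v, -c, -si, -o). The helper name `build_logman_create_counter` and its parameters are hypothetical.

```python
from typing import List, Sequence


def build_logman_create_counter(name: str,
                                begin: str,
                                end: str,
                                counters: Sequence[str],
                                sample_interval: str,
                                output_path: str) -> List[str]:
    """Build the argument list for a 'logman create counter' command.

    begin and end are 'date time' strings; they are split into separate
    tokens to mirror how the command is typed at a prompt.
    """
    if not counters:
        raise ValueError("at least one counter path is required")
    args = ["logman", "create", "counter", name]
    args += ["-b", *begin.split()]
    args += ["-e", *end.split()]
    args += ["-r", "-v", "mmddhhmm"]   # same repeat and version options as the example
    args += ["-c", *counters]
    args += ["-si", sample_interval]
    args += ["-o", output_path]
    return args


args = build_logman_create_counter(
    name="Processor_log",
    begin="6/6/2007 13:00:00",
    end="6/6/2007 15:00:00",
    counters=[r"\Processor(_Total)\% Processor Time",
              r"\Memory\Available bytes"],
    sample_interval="00:15",
    output_path=r"c:\perflogs\daily_log",
)
print(args)
```

On a Windows server, you could pass the resulting list to `subprocess.run(args, check=True)`. That step is left out here so that the sketch has no side effects.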
System Counters
The following table provides information on system objects and counters.
Objects and Counters |
Description |
Processor - % Processor Time |
This shows processor usage over a period of time. If this is consistently too high, you may find performance is adversely affected. Remember to monitor the _Total instance on multiprocessor systems. You can also measure the utilization of each processor to ensure balanced performance between cores. |
Disk |
|
- Avg. Disk Queue Length |
|
- % Idle Time |
There are a number of logical disk counters that you should review. You need to monitor peak disk activity. A consistently low percentage of idle time shows that disks are overloaded. |
- % Free Space |
You may want to check each logical disk. You should also always be aware of the available disk space. To further assess your logical disks, investigate Disk Writes/sec and Disk Reads/sec. |
Memory |
|
- Available Mbytes |
This shows the amount of physical memory available for allocation. Insufficient memory will lead to excessive use of the page file and an increase in the number of page faults per second. |
- Cache Faults/sec |
This counter shows the rate at which faults occur when a page is sought in the file system cache and is not found. This may be a soft fault, when the page is found in memory, or a hard fault, when the page is on disk. The effective use of the cache for read and write operations can have a significant effect on server performance. You must monitor for increased cache failures, indicated by a reduction in the Async Fast Reads/sec or Read Aheads/sec. |
- Pages/sec |
This counter shows the rate at which pages are read from or written to disk to resolve hard page faults. If this rises, it indicates system-wide performance problems. |
Paging File |
|
- % Used and % Used Peak |
The server paging file, sometimes called the swapfile, holds “virtual” memory addresses on disk. Page faults occur when a process has to stop and wait while required “virtual” resources are retrieved from disk into memory. These will be more frequent if the physical memory is inadequate. |
NIC |
|
- Total Bytes/sec |
This is the rate at which data is sent and received via the network interface card. You may need to investigate further if this rate is over 40-50 percent of network capacity. To fine-tune your investigation, monitor Bytes Received/sec and Bytes Sent/sec. |
Process |
|
- Working Set |
This counter indicates the current size (in bytes) of the working set for a given process. This memory is reserved for the process, even if it is not in use. |
- % Processor Time |
This counter indicates the percentage of processor time that is used by a given process. |
ASP.NET |
|
- Requests Queued |
Windows SharePoint Services 3.0 provides the building blocks for HTML pages that are rendered in the user browser over HTTP. This counter shows the number of requests waiting to be processed. |
- Request Wait Time |
As the number of wait events increases, users will experience degraded page-rendering performance. |
- Requests Rejected |
This counter indicates the number of requests that were rejected because the queue was full. |
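The Total Bytes/sec guidance in the table above is expressed as a percentage of network capacity, while the counter itself reports bytes per second. A minimal Python sketch of the conversion follows, assuming you know the link speed in bits per second. The function names and the sample figures are illustrative, and the 40 percent default is the lower bound of the 40-50 percent range.

```python
def network_utilization_percent(total_bytes_per_sec: float,
                                link_speed_bits_per_sec: float) -> float:
    """Convert a Total Bytes/sec reading into a percentage of link capacity."""
    if link_speed_bits_per_sec <= 0:
        raise ValueError("link speed must be positive")
    # 8 bits per byte; multiply before dividing to keep the arithmetic simple.
    return total_bytes_per_sec * 8 * 100 / link_speed_bits_per_sec


def needs_investigation(utilization_percent: float,
                        threshold_percent: float = 40.0) -> bool:
    """Flag readings above the assumed investigation threshold."""
    return utilization_percent > threshold_percent


# Hypothetical reading: 60,000,000 bytes/sec on a 1 Gbps interface.
utilization = network_utilization_percent(60_000_000, 1_000_000_000)
print(f"{utilization:.1f}% of capacity")
print(needs_investigation(utilization))
```

With these sample numbers the link is at 48 percent of capacity, which falls inside the range where further investigation is suggested.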
SharePoint Products and Technologies Counters
The following table provides information on SharePoint Products and Technologies objects and counters.
Objects and Counters |
Description |
Search Archival Plug-in |
|
Blocked documents |
This indicates the number of documents waiting in a queue. When this number grows, users will suffer degraded performance. |
Search Gatherer |
|
Idle Threads |
This indicates the number of threads, or processes, waiting for documents. An increase in this number may lead to reduced performance for users. |
Search Gatherer Process |
|
Waiting Documents |
This is the number of documents waiting to be processed. When this number goes to zero, the catalog is idle. This number indicates the total queue size of unprocessed documents in the gatherer. If this number increases, it may indicate a processing bottleneck. |
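The counters in the table above matter mainly when they keep growing. One simple way to detect that is to compare the average of recent samples with the average of earlier ones. The Python sketch below does this for a series of Waiting Documents readings. The function name, the 1.5 growth ratio, and the sample data are assumptions chosen for illustration.

```python
from statistics import mean
from typing import Sequence


def queue_is_growing(samples: Sequence[float],
                     min_growth_ratio: float = 1.5) -> bool:
    """Return True when the later half of the samples averages at least
    min_growth_ratio times the earlier half."""
    if len(samples) < 4:
        return False  # too few readings to judge a trend
    half = len(samples) // 2
    earlier = mean(samples[:half])
    later = mean(samples[half:])
    if earlier == 0:
        return later > 0
    return later / earlier >= min_growth_ratio


# Hypothetical Waiting Documents readings taken at regular intervals.
growing = [120, 135, 150, 210, 260, 340]
steady = [200, 180, 210, 190, 205, 195]

print(queue_is_growing(growing))  # True: the average doubles from 135 to 270
print(queue_is_growing(steady))   # False: both halves average about 197
```

A check like this is deliberately coarse. It is meant to prompt a closer look at the gatherer, not to diagnose the cause.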
SQL Server Counters
The following table provides information on SQL Server objects and counters.
Objects and Counters |
Description |
General Statistics |
|
User Connections |
This counter shows the number of user connections on your SQL Server. If you see this number rise by 500 percent from your baseline, you may see a performance reduction. |
Databases |
|
Transactions/sec |
This counter shows the number of transactions per second on a given database or on the entire SQL Server. This number is mainly useful for establishing your baseline and for troubleshooting issues. |
Locks(_Total) |
|
Number of Deadlocks/sec |
This counter shows the number of deadlocks on the SQL Server per second. This should not rise above 0. |
Lock Waits/sec |
This counter shows the number of lock requests per second that could not be satisfied immediately and had to wait for resources. |
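The SQL Server guidance above is relative to a baseline, so a check needs both the baseline and the current reading. The following Python sketch shows one way to express the two rules from the table. It assumes that "rise by 500 percent" means an increase of 500 percent over the baseline value; the function names and sample figures are hypothetical.

```python
from typing import List


def percent_rise(baseline: float, current: float) -> float:
    """Return the increase over baseline, as a percentage of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current - baseline) / baseline * 100


def sql_warnings(baseline_connections: float,
                 user_connections: float,
                 deadlocks_per_sec: float) -> List[str]:
    """Apply the User Connections and deadlock rules from the table above."""
    warnings = []
    if percent_rise(baseline_connections, user_connections) >= 500:
        warnings.append("User Connections rose 500 percent or more over baseline")
    if deadlocks_per_sec > 0:
        warnings.append("Number of Deadlocks/sec is above 0")
    return warnings


# Hypothetical readings: baseline of 40 connections, 260 now, no deadlocks.
print(sql_warnings(baseline_connections=40, user_connections=260,
                   deadlocks_per_sec=0))
```

Here the connection count has risen by 550 percent, so only the first warning is returned.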
Removing Bottlenecks
System bottlenecks represent points of contention where there are insufficient resources to service user transaction requests. They may be related to physical hardware, the operating environment, or applications. For a system administrator, it is essential to manage bottlenecks by constantly monitoring performance. When you identify a performance issue, you must assess the best resolution for removing the bottleneck. The Performance counters and other performance monitoring applications, such as System Center Operations Manager (SCOM) 2007, are the key tools in tracking and analyzing problems, so that you can develop a solution.
Physical Bottleneck Resolution
Physical bottlenecks are based on processor, disk, memory, and network contention: too many requests are contending for too few physical resources. The objects and counters described in the Monitoring Performance topic indicate where the performance problem is located, for example, hardware processor or ASP.NET. Bottleneck resolution requires that you identify the issue and then make a change or changes that mitigate the performance problem.
Problems seldom happen instantaneously; there is usually a gradual performance degradation that you can track if you monitor regularly, using your Performance tool or a more sophisticated system, such as SCOM. For both of these options, to varying degrees, you can embed solutions within an alert, in the form of advisory text or scripted commands.
You often have to resolve bottleneck issues by making changes to hardware or system configurations. The following table identifies problem thresholds and possible resolution options. Some of the options suggest hardware upgrades or modifications. Clearly, you must implement these within the bounds of a well-structured maintenance schedule.
Objects and Counters |
Problem |
Resolution Options |
Processor - % Processor Time |
Over 75-85% |
· Upgrade processor · Increase number of processors · Add additional server(s) |
Disk |
||
- Avg. Disk Queue Length |
Greater than 2 |
· Increase number or speed of disks · Change array configuration to stripe · Move some data to an alternative server |
- % Idle Time |
Greater than 90% |
· Increase number of disks · Move data to an alternative disk or server |
- % Free Space |
Greater than 70% |
· Increase number of disks · Move data to an alternative disk or server |
Memory |
||
- Available Mbytes |
Less than 4 MB |
· Add memory |
- Cache Faults/sec |
Greater than 1 |
· Add memory · Increase cache speed or size if possible · Move data to an alternative disk or server |
- Pages/sec |
Greater than 10 |
· Add memory |
Paging File |
||
- % Used and % Used Peak |
The server paging file, sometimes called the swapfile, holds “virtual” memory addresses on disk. Page faults occur when a process has to stop and wait while required “virtual” resources are retrieved from disk into memory. These will be more frequent if the physical memory is inadequate. |
· Add memory |
NIC |
||
- Total Bytes/sec |
This is the rate at which data is sent and received via the network interface card. You may need to investigate further if this rate is over 40-50 percent of network capacity. To fine-tune your investigation, monitor Bytes Received/sec and Bytes Sent/sec. |
· Reassess network interface card speed · Check number, size, and usage of memory buffers |
Process |
||
- Working Set |
This counter indicates the current size (in bytes) of the working set for a given process. This memory is reserved for the process, even if it is not in use. |
· Add memory |
- % Processor Time |
This counter indicates the percentage of processor time that is used by a given process. |
· Increase number of processors · Redistribute workload to additional servers |
ASP.NET |
||
- Requests Queued |
Windows SharePoint Services 3.0 provides the building blocks for HTML pages that are rendered in the user browser over HTTP. This counter shows the number of requests waiting to be processed. |
· Implement additional Web servers · The default maximum for this counter is 5,000, and you can change this setting in the Machine.config file |
- Request Wait Time |
As the number of wait events increases, users will experience degraded page rendering performance. |
· Implement additional Web servers |
- Requests Rejected |
This counter indicates the number of requests that were rejected because the queue was full. |
· Implement additional Web servers |
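The numeric thresholds in the table above lend themselves to a small lookup that maps a reading to its suggested resolution options, for example as advisory text in an alert. The Python sketch below covers only some of the rows that have an explicit numeric threshold. The rule labels, the `advise` helper, and the choice of 85 percent as the processor limit (the upper end of the 75-85 percent range) are assumptions made for illustration.

```python
from typing import Callable, Dict, List, NamedTuple


class Rule(NamedTuple):
    is_problem: Callable[[float], bool]  # True when the reading breaches the threshold
    resolutions: List[str]               # options copied from the table above


# Labels follow the table rows; only rows with numeric thresholds are included.
RULES: Dict[str, Rule] = {
    "Processor - % Processor Time": Rule(
        lambda v: v > 85,
        ["Upgrade processor", "Increase number of processors",
         "Add additional server(s)"]),
    "Disk - Avg. Disk Queue Length": Rule(
        lambda v: v > 2,
        ["Increase number or speed of disks",
         "Change array configuration to stripe",
         "Move some data to an alternative server"]),
    "Memory - Available Mbytes": Rule(lambda v: v < 4, ["Add memory"]),
    "Memory - Cache Faults/sec": Rule(
        lambda v: v > 1,
        ["Add memory", "Increase cache speed or size if possible",
         "Move data to an alternative disk or server"]),
    "Memory - Pages/sec": Rule(lambda v: v > 10, ["Add memory"]),
}


def advise(readings: Dict[str, float]) -> Dict[str, List[str]]:
    """Return resolution options for every reading that breaches its rule."""
    advice = {}
    for counter, value in readings.items():
        rule = RULES.get(counter)
        if rule is not None and rule.is_problem(value):
            advice[counter] = rule.resolutions
    return advice


# Hypothetical readings: busy processor, healthy disk queue, heavy paging.
print(advise({
    "Processor - % Processor Time": 91.0,
    "Disk - Avg. Disk Queue Length": 1.2,
    "Memory - Pages/sec": 25.0,
}))
```

Keeping the thresholds in data rather than in code makes it easier to adjust them once you have established a baseline for your own servers.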
Comments
Anonymous
January 08, 2010
Yes, perhaps this should be part of the OS but it isn't. We are using Advanced Host Monitor software to monitor our server performance. It can check Performance Counter objects and much more (ping, mail, wmi, traffic, snmp, and so on).
Anonymous
April 06, 2010
This content was originally provided in the following Microsoft published whitepaper: http://www.sharepointjoel.com/Presentations/Whitepapers/Determine%20Capacity%20Planning%20SharePoint_AM102572461033.doc