Why does your code take so long? Profile it!
Why does your code take so long to run? How do you improve its performance? When examining code, it’s very difficult to know which piece takes the longest, and thus should be the target of your optimization efforts.
For example, you won’t see much benefit from optimizing a part of your code if that code actually takes only 5% of the time.
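To put a rough number on that: by Amdahl's law, if a piece of code accounts for a fraction p of the run time and you make that piece s times faster, the overall speedup is limited as sketched below. The symbols p and s are introduced here for illustration and are not from the original post.

```latex
\[
  \text{speedup} = \frac{1}{(1 - p) + p/s}
  \quad\Longrightarrow\quad
  \lim_{s \to \infty} \text{speedup} = \frac{1}{1 - p}
\]
\[
  p = 0.05:\ \frac{1}{0.95} \approx 1.05
  \qquad
  p = 0.50:\ \frac{1}{0.50} = 2
\]
```

So even eliminating a 5% piece entirely buys only about a 5% speedup, while eliminating a piece that takes half the time doubles the speed.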
Take a look at the code below and see if you can figure out where the most time is spent. Clue: you can cut the run time roughly in half with a very minor change, yet do all the same work (don’t just decrease the # of iterations :) ).
Visual Studio 2010 comes with an easy-to-use profiling tool to figure out the slow parts of your application.
Start VS, File->New->Project->VB WPF Application. Call it PerfTest.
Paste in the code below to replace MainWindow.Xaml.VB.
The code is chopped-up sample code from this post: What is your computer doing with all that memory? Write your own memory browser. Here it serves to demonstrate code profiling.
Notice that it shows a MessageBox before running some code in a loop. When the code continues, the sample code that we want to profile gets a map of all the virtual memory in the current process.
If you use C#, the code requires adding a reference to System.Windows.Forms.dll for the MessageBox.
Profile the builds that you distribute. If you distribute DEBUG builds, profile those. However, don’t profile while debugging. Hit Ctrl-F5 to launch the code without the debugger.
When you see the MessageBox, choose main menu->Analyze->Profile->Attach/Detach, then point at the PerfTest.Exe process to which to attach. Once the attach is complete (there’s a spinning circle thingy), dismiss the MessageBox and the targeted code runs. In a few seconds, you’ll see the output in the TextBlock of the main window. Close the main window of the application to end the process and the profiling.
The VS profiler is a sampling profiler. That means, periodically, a snapshot of the process is taken, which includes the call stack for each thread. The data is accumulated until the profiling ends. The stacks are then turned into meaningful data by the profiler.
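As a rough illustration of how sampled call stacks could be turned into per-function counts, here is a minimal sketch. It is not how the VS profiler is actually implemented; the module, the Tally and Bump helpers, and the four sample stacks are all hypothetical. It assumes the usual meaning of the two columns in the profiler's reports: a function gets an inclusive sample whenever it is anywhere on a sampled stack, and an exclusive sample only when it is the innermost frame.

```vb
Imports System
Imports System.Collections.Generic

Module SamplingSketch
    ' Tally one sample per stack. Inclusive counts every function found
    ' anywhere on the stack; exclusive counts only the innermost frame.
    Sub Tally(ByVal stacks As List(Of String()), _
              ByVal inclusive As Dictionary(Of String, Integer), _
              ByVal exclusive As Dictionary(Of String, Integer))
        For Each stack As String() In stacks
            Dim seen As New List(Of String)
            For Each fn As String In stack
                If Not seen.Contains(fn) Then   ' count recursion once per sample
                    seen.Add(fn)
                    Bump(inclusive, fn)
                End If
            Next
            Bump(exclusive, stack(stack.Length - 1))
        Next
    End Sub

    Private Sub Bump(ByVal counts As Dictionary(Of String, Integer), ByVal key As String)
        Dim n As Integer = 0
        counts.TryGetValue(key, n)   ' n stays 0 when the key is missing
        counts(key) = n + 1
    End Sub

    Sub Main()
        ' Hypothetical snapshots, outermost caller first.
        Dim stacks As New List(Of String()) From { _
            New String() {"Main", "GetVirtualAllocs", "get_Handle"}, _
            New String() {"Main", "GetVirtualAllocs", "get_Handle"}, _
            New String() {"Main", "GetVirtualAllocs", "SizeOf"}, _
            New String() {"Main", "GetVirtualAllocs"} _
        }
        Dim inclusive As New Dictionary(Of String, Integer)
        Dim exclusive As New Dictionary(Of String, Integer)
        Tally(stacks, inclusive, exclusive)

        For Each fn As String In inclusive.Keys
            Dim ex As Integer = 0
            exclusive.TryGetValue(fn, ex)
            Console.WriteLine("{0,-18} inclusive={1} exclusive={2}", fn, inclusive(fn), ex)
        Next
    End Sub
End Module
```

With these four made-up samples, Main and GetVirtualAllocs each get 4 inclusive samples, but Main gets 0 exclusive and GetVirtualAllocs gets 1, while get_Handle gets 2 of each. A function with a high exclusive count is doing the work itself; one with a high inclusive count and a low exclusive count is mostly waiting on its callees.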
The VS profiler automatically starts creating the report. In a few seconds, a summary page with a graph of CPU usage vs time shows up. You can select a region and zoom in on it.
The Hot Path (the most expensive call path) is shown, along with the functions doing the most work.
There are several views from which to choose: Call Tree, Modules, Caller/Callee, Functions, etc.
Immediately from the summary page you can see that 50% of the time is in System.Diagnostics.Process.get_Handle()!
Just from those simple steps of profiling the code, you can identify a big improvement in performance.
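One way to act on that finding is to fetch the process handle once and reuse it, since it does not change while the process runs. The sketch below assumes the hot loop obtains the handle through Process.GetCurrentProcess().Handle on every pass, as the profile suggests. The module and both function names are hypothetical, and the timings will vary by machine.

```vb
Imports System
Imports System.Diagnostics

Module HandleCachingSketch
    ' Asks for the Process object and its handle on every pass.
    Function TimeLookupEachPass(ByVal nIters As Integer) As Long
        Dim sum As Long = 0
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To nIters
            sum += Process.GetCurrentProcess().Handle.ToInt64()
        Next
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function

    ' Fetches the handle once, before the loop, and reuses it.
    Function TimeCachedHandle(ByVal nIters As Integer) As Long
        Dim sum As Long = 0
        Dim sw As Stopwatch = Stopwatch.StartNew()
        Dim hProc As IntPtr = Process.GetCurrentProcess().Handle
        For i As Integer = 1 To nIters
            sum += hProc.ToInt64()
        Next
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function

    Sub Main()
        Const nIters As Integer = 10000
        Console.WriteLine("lookup each pass: {0} msecs", TimeLookupEachPass(nIters))
        Console.WriteLine("cached handle:    {0} msecs", TimeCachedHandle(nIters))
    End Sub
End Module
```

In the sample itself, the equivalent change would be to read the handle into a local or a field before the Do While loop and pass that value to VirtualQueryEx.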
See also: Beginners Guide to Performance Profiling
This is a sample of the Call Tree output:
| Function Name | Inclusive Samples | Exclusive Samples |
| --- | --- | --- |
| System.Windows.Application.Run() | 193 | 4 |
| WpfApplication1.Application.Main() | 193 | 0 |
| WpfApplication1.MainWindow.Window1_Loaded(object,class System.Windows.RoutedEventArgs) | 189 | 3 |
| WpfApplication1.MainWindow.GetVirtualAllocs() | 184 | 19 |
| System.Diagnostics.Process.get_Handle() | 95 | 95 |
| System.Diagnostics.Process.GetCurrentProcess() | 33 | 33 |
| System.Runtime.InteropServices.Marshal.SizeOf(object) | 22 | 22 |
| _VirtualQueryEx@16 | 3 | 3 |
| System.UIntPtr..ctor(uint64) | 3 | 3 |
| @JIT_NewCrossContext@4 | 2 | 2 |
| JIT_NewFast(struct CORINFO_CLASS_STRUCT_ *) | 2 | 2 |
| StubHelpers::DemandPermission(class NDirectMethodDesc *) | 2 | 2 |
| MarshalNative::SizeOfClass(class ReflectClassBaseObject *,bool) | 1 | 1 |
| Microsoft.VisualBasic.Interaction.MsgBox(object,valuetype Microsoft.VisualBasic.MsgBoxStyle,object) | 1 | 1 |
| System.Collections.Generic.List`1.Add(!0) | 1 | 1 |
| System.Collections.Generic.List`1.Clear() | 1 | 1 |
| System.UIntPtr.ToUInt64() | 1 | 1 |
<Code Sample>
Imports System.Runtime.InteropServices
Imports System.Diagnostics

Class MainWindow
    Private WithEvents _btnRefresh As Button
    Private _proc As Process
    Private _virtAllocs As New List(Of MEMORY_BASIC_INFORMATION)

    Private Sub Window1_Loaded(ByVal sender As System.Object, ByVal e As System.Windows.RoutedEventArgs) Handles MyBase.Loaded
        Me.Width = 800
        Me.Height = 800
        Dim sbResult = New Text.StringBuilder
        ' Pause here so the profiler can be attached before the timed code runs.
        MsgBox("start profile")
        _proc = Process.GetCurrentProcess
        For i = 1 To 5
            Dim nIters = 100
            Dim stpwatch = Stopwatch.StartNew
            ' Start at 1 so the loop runs exactly nIters times, matching the reported count.
            For j = 1 To nIters
                _virtAllocs.Clear()
                GetVirtualAllocs()
            Next
            stpwatch.Stop()
            Dim res = String.Format("# iters = {0} # msecs = {1}", nIters, stpwatch.ElapsedMilliseconds)
            Debug.WriteLine(res)
            sbResult.AppendLine(res)
        Next
        Me.Content = New TextBlock With {.Text = sbResult.ToString}
    End Sub

    ' Walks the virtual address space of the current process, one region per iteration.
    Private Sub GetVirtualAllocs()
        Dim mbi As New MEMORY_BASIC_INFORMATION
        Dim lpMem As UInt64 = 0
        Do While VirtualQueryEx(GetCurrentProcess, New UIntPtr(lpMem), mbi, Marshal.SizeOf(mbi)) = Marshal.SizeOf(mbi)
            _virtAllocs.Add(mbi)
            lpMem = mbi.BaseAddress.ToUInt64 + mbi.RegionSize.ToUInt64
            ' Test the flag with And; the parentheses are needed because > binds tighter than Or/And.
            If (mbi.lType And AllocationType.MEM_IMAGE) <> 0 Then
                'Dim filename = GetFileNameFromMBI(mbi)
                'Trace.WriteLine(String.Format("{0:x8} {1}", mbi.BaseAddress.ToUInt32, filename))
            End If
        Loop
    End Sub

    <DllImport("kernel32.dll", SetLastError:=True)> _
    Public Shared Function GetCurrentProcess() As IntPtr
    End Function

    Const BlockSize As Integer = 1024

    <StructLayout(LayoutKind.Sequential)> _
    Structure ProcMemBlock
        <MarshalAs(UnmanagedType.ByValArray, sizeconst:=BlockSize)> _
        Dim data() As Byte
    End Structure

    <DllImport("kernel32.dll", SetLastError:=True)> _
    Public Shared Function ReadProcessMemory( _
            ByVal hProcess As IntPtr, _
            ByVal lpBaseAddress As UIntPtr, _
            ByRef lpBuffer As ProcMemBlock, _
            ByVal dwSize As Integer, _
            ByRef lpNumberOfBytesRead As Integer _
            ) As Integer
    End Function

    <DllImport("psapi")> _
    Shared Function GetModuleFileNameEx(ByVal hProcess As IntPtr, ByVal hModule As UIntPtr, ByVal lpFileName As Text.StringBuilder, ByVal nSize As Integer) As Integer
    End Function

    <DllImport("kernel32")> _
    Shared Function VirtualQueryEx( _
            ByVal hProcess As IntPtr, _
            ByVal lpAddress As UIntPtr, _
            ByRef mbi As MEMORY_BASIC_INFORMATION, _
            ByVal dwLength As UInteger) As UInteger
    End Function

    <StructLayout(LayoutKind.Sequential)> _
    Structure MEMORY_BASIC_INFORMATION
        Dim BaseAddress As UIntPtr
        Dim AllocationBase As UIntPtr
        Dim AllocationProtect As AllocationProtect
        Dim RegionSize As UIntPtr
        Dim State As AllocationState
        Dim Protect As AllocationProtect
        Dim lType As AllocationType
    End Structure

    <Flags()> _
    Enum AllocationProtect
        PAGE_EXECUTE = &H10
        PAGE_EXECUTE_READ = &H20
        PAGE_EXECUTE_READWRITE = &H40
        PAGE_EXECUTE_WRITECOPY = &H80
        PAGE_NOACCESS = &H1
        PAGE_READONLY = &H2
        PAGE_READWRITE = &H4
        PAGE_WRITECOPY = &H8
        PAGE_GUARD = &H100
        PAGE_NOCACHE = &H200
        PAGE_WRITECOMBINE = &H400
    End Enum

    <Flags()> _
    Enum AllocationType
        MEM_IMAGE = &H1000000
        MEM_MAPPED = &H40000
        MEM_PRIVATE = &H20000
    End Enum

    <Flags()> _
    Enum AllocationState
        MEM_COMMIT = &H1000
        MEM_FREE = &H10000
        MEM_RESERVE = &H2000
    End Enum

    ' Returns the file name of the module mapped at the region's allocation base, or "" if none.
    Function GetFileNameFromMBI(ByVal mbi As MEMORY_BASIC_INFORMATION) As String
        Dim retval = ""
        If CType(mbi.lType, AllocationType) = AllocationType.MEM_IMAGE Or True Then
            If mbi.AllocationBase.ToUInt64 > 0 Then
                Dim sbFilename As New Text.StringBuilder(300)
                If GetModuleFileNameEx(Process.GetCurrentProcess.Handle, New UIntPtr(mbi.AllocationBase.ToUInt64), sbFilename, sbFilename.Capacity) > 0 Then
                    retval = sbFilename.ToString
                End If
            End If
        End If
        Return retval
    End Function
End Class
</Code Sample>