PowerShell, performance and regular expressions
Yesterday evening, I started reading Windows PowerShell in Action, and it confirmed what I had already read here and there: it’s a good book!
So this morning, I was playing around and trying out some of the things I had read about:
> Get-Process | Get-Member
TypeName: System.Diagnostics.Process
Name                   MemberType    Definition
----                   ----------    ----------
Handles                AliasProperty Handles = Handlecount
Name                   AliasProperty Name = ProcessName
NPM                    AliasProperty NPM = NonpagedSystemMemorySize
PM                     AliasProperty PM = PagedMemorySize
VM                     AliasProperty VM = VirtualMemorySize
WS                     AliasProperty WS = WorkingSet
add_Disposed           Method        System.Void add_Disposed(EventHandler value)
add_ErrorDataReceived  Method        System.Void add_ErrorDataReceived(DataReceivedEventHandler value)
add_Exited             Method        System.Void add_Exited(EventHandler value)
add_OutputDataReceived Method        System.Void add_OutputDataReceived(DataReceivedEventHandler value)
BeginErrorReadLine     Method        System.Void BeginErrorReadLine()
BeginOutputReadLine    Method        System.Void BeginOutputReadLine()
CancelErrorRead        Method        System.Void CancelErrorRead()
CancelOutputRead       Method        System.Void CancelOutputRead()
Close                  Method        System.Void Close()
CloseMainWindow        Method        System.Boolean CloseMainWindow()
CreateObjRef           Method        System.Runtime.Remoting.ObjRef CreateObjRef(Type requestedType)
Dispose                Method        System.Void Dispose()
Equals                 Method        System.Boolean Equals(Object obj)
GetHashCode            Method        System.Int32 GetHashCode()
GetLifetimeService     Method        System.Object GetLifetimeService()
GetType                Method        System.Type GetType()
get_BasePriority       Method        System.Int32 get_BasePriority()
…
I did not want to see those [get|set|add|remove]_… methods, so I tried something quick:
Get-Process | Get-Member | ? {$_.Name.IndexOf('_') -eq -1}
But this also leaves out other members whose names contain an underscore (e.g. __NounName), so I changed it to:
Get-Process | Get-Member | ? {$_.Name.IndexOf('_') -lt 1}
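To see what the two IndexOf tests actually keep, here is a minimal sketch run against a handful of names instead of live Get-Member output. The list is only illustrative, and My_Custom is a made-up name, not a real member of System.Diagnostics.Process.

```powershell
# Sample member names. 'My_Custom' is hypothetical and exists only for this illustration.
$names = 'Handles', '__NounName', 'get_BasePriority', 'add_Exited', 'My_Custom'

# IndexOf returns -1 when the name has no underscore,
# otherwise the zero-based position of the first one.
$noUnderscore  = $names | Where-Object { $_.IndexOf('_') -eq -1 }
$leadingOrNone = $names | Where-Object { $_.IndexOf('_') -lt 1 }

$noUnderscore    # Handles
$leadingOrNone   # Handles, __NounName
```

The second filter keeps names with no underscore or with one in the first position, which brings __NounName back, but it still drops anything with an underscore further in.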
But this still drops any member with an underscore elsewhere in its name, so I went for something more precise:
Get-Process | Get-Member | ? {$n = $_.Name; -not $n.StartsWith('get_') -and -not $n.StartsWith('set_') -and -not $n.StartsWith('add_') -and -not $n.StartsWith('remove_')}
I then sent it to an internal distribution list dedicated to PowerShell, asking for help making it shorter and/or faster. Alex answered with the following:
… | gm | ?{ !($_.Name -match "^(get|set|add|remove)_") }
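The same filter can be tried on a fixed list of names to check what the anchored pattern removes. This is a small sketch, not Alex's exact command: it uses -notmatch, which is equivalent to negating -match, and Target_add_x is a made-up name added to show that the ^ anchor only matches at the start.

```powershell
# Same pattern as in the one-liner: one of the four prefixes at the start of the name.
$pattern = '^(get|set|add|remove)_'

# Sample names. 'Target_add_x' is hypothetical and exists only for this illustration.
$names = 'Handles', '__NounName', 'get_BasePriority', 'set_MaxWorkingSet',
         'add_Exited', 'remove_Exited', 'Target_add_x'

# -notmatch is the negated form of -match, so no !( ... ) wrapper is needed.
$kept = $names | Where-Object { $_ -notmatch $pattern }

$kept   # Handles, __NounName, Target_add_x
```

One difference from the StartsWith version: -match and -notmatch ignore case by default, while StartsWith with a single string argument is case-sensitive.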
I replied that it was indeed shorter but slower. As I pressed the Send button, I thought “Never say such things without measuring”. But I’m lazy…
17 minutes later, he answered again:
1..100 | measure-command { ls | … } gives 5.34 sec for your version and 4.72 for mine.
If you think that file system caching might have something to do with those results, try it with processes, as in the two runs below: his solution is still faster (about 13.4 seconds versus 15.5 seconds for 100 iterations).
Thanks Alex!
> 1..100 | measure-command { get-process | gm | ?{ !($_.Name -match "^(get|set|add|remove)_") } }
Days : 0
Hours : 0
Minutes : 0
Seconds : 13
Milliseconds : 353
Ticks : 133534572
TotalDays : 0.000154553902777778
TotalHours : 0.00370929366666667
TotalMinutes : 0.22255762
TotalSeconds : 13.3534572
TotalMilliseconds : 13353.4572
> 1..100 | measure-command { get-process | gm | ? {$n = $_.Name; -not $n.StartsWith('get_') -and -not $n.StartsWith('set_') -and -not $n.StartsWith('add_') -and -not $n.StartsWith('remove_')} }
Days : 0
Hours : 0
Minutes : 0
Seconds : 15
Milliseconds : 536
Ticks : 155367665
TotalDays : 0.000179823686342593
TotalHours : 0.00431576847222222
TotalMinutes : 0.258946108333333
TotalSeconds : 15.5367665
TotalMilliseconds : 15536.7665
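Both runs above also pay for get-process and gm on every iteration. As a rough sketch of how the two filters could be compared on their own, the members can be collected once and only the Where-Object step timed. Collecting first, the variable names and the output formatting are choices made for this sketch, not part of the original exchange, and the timings will vary from machine to machine.

```powershell
# Collect the members once so that only the filter is timed
# (an assumption of this sketch; the runs above call Get-Process and Get-Member each time).
$members = Get-Process | Get-Member

$regexTime = Measure-Command {
    foreach ($i in 1..100) {
        $members | Where-Object { $_.Name -notmatch '^(get|set|add|remove)_' }
    }
}

$startsWithTime = Measure-Command {
    foreach ($i in 1..100) {
        $members | Where-Object {
            $n = $_.Name
            -not $n.StartsWith('get_') -and -not $n.StartsWith('set_') -and
            -not $n.StartsWith('add_') -and -not $n.StartsWith('remove_')
        }
    }
}

# Measure-Command returns a TimeSpan; print the totals side by side.
'Regex:      {0:N0} ms' -f $regexTime.TotalMilliseconds
'StartsWith: {0:N0} ms' -f $startsWithTime.TotalMilliseconds
```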