How to Convert Text Output of a Legacy Console Application to PowerShell Objects
Introduction
[[Windows PowerShell]] is a great tool when you work with cmdlets and objects, but sometimes you still need to use legacy console applications. Most of these applications may eventually get a native PowerShell equivalent, but for now we have to use what is available. We can, however, improve on their output by converting it to objects, which makes it much more comfortable to use in PowerShell. Here are some examples of such conversions.
One line for each object
Let's take one of the netsh.exe subcommands as an example:
    PS C:\> netsh interface ipv6 show route

    Publish  Type      Met  Prefix                    Idx  Gateway/Interface Name
    -------  --------  ---  ------------------------  ---  ------------------------
    No       Manual    8    ::/0                       13  Local Area Connection* 18
    No       Manual    256  ::1/128                     1  Loopback Pseudo-Interface 1
    No       Manual    8    2001::/32                  13  Local Area Connection* 18
    No       Manual    256  fe80::/64                  14  Main connection
    No       Manual    256  fe80::/64                  18  Bluetooth Network Connection
    No       Manual    256  fe80::/64                  16  Wireless Network Connection
    No       Manual    256  fe80::/64                  19  IPHTTPSInterface
    No       Manual    256  fe80::/64                  13  Local Area Connection* 18
    No       Manual    256  fe80::ffff:ffff:fffe/128   13  Local Area Connection* 18
    No       Manual    256  fe80::1ef:dff1:20fb:abb2/128  16  Wireless Network Connection
    No       Manual    256  fe80::40f7:a4f6:1190:e6/128  14  Main connection
    No       Manual    256  fe80::559c:9377:5fc6:2256/128  18  Bluetooth Network Connection
    No       Manual    256  fe80::b511:73bb:2598:dff8/128  19  IPHTTPSInterface
It looks like a table of objects, but it is not. It is just lines of raw text, so you can't sort it by metric or filter it by prefix or interface name. We need to convert this text to objects manually.
First we should get rid of the column headers and the separator ("----") line. They occupy a fixed number of lines at the start of the output, so this is easy to do. We also need to remove one empty line at the end. The code below skips the first three lines of the output and the last one.
    $output = netsh interface ipv6 show route
    $output = $output[3..($output.length-2)]
Now the variable holds only the lines that represent routes. We need to split each of these lines into 6 separate substrings (one per column) on runs of one or more spaces. The limit of 6 keeps interface names that contain spaces in one piece.
    $output | foreach {$parts = $_ -split "\s+", 6}
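To see what the limit of 6 changes, the following sketch splits one sample line (copied from the output above) with and without it. The variable names are chosen for illustration only.

```powershell
# Sample route line taken from the netsh output shown earlier
$line = 'No       Manual    256  fe80::/64                  18  Bluetooth Network Connection'

# Without a limit, the interface name is broken into separate words
$unlimited = $line -split '\s+'

# With a limit of 6, everything after the fifth column stays in one piece
$parts = $line -split '\s+', 6

$unlimited.Count   # 8
$parts.Count       # 6
$parts[5]          # Bluetooth Network Connection
```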
The final step is to combine these two procedures and create PowerShell objects with the New-Object cmdlet, populating the properties from the elements of the $parts array. Finally, we put all of this into a PowerShell function:
    Function Get-IPv6Route
    {
        $output = netsh interface ipv6 show route
        # Drop the header lines at the start and the empty line at the end
        $output = $output[3..($output.length-2)]
        $output | foreach {
            # Split into 6 columns; the last one keeps its inner spaces
            $parts = $_ -split "\s+", 6
            New-Object -Type PSObject -Property @{
                Publish = ($parts[0] -eq "Yes")
                Type = $parts[1]
                Metric = [int]($parts[2])
                Prefix = $parts[3]
                Idx = $parts[4]
                InterfaceName = $parts[5]
            }
        }
    }
Now these objects can be used in the usual PowerShell ways:
    Get-IPv6Route | Sort-Object metric | Format-Table -AutoSize
    Get-IPv6Route | Group-Object -Property idx | sort -Property count | Format-Table Name, Count -AutoSize
    Get-IPv6Route | Where-Object {$_.InterfaceName -like "Bluetooth*"}
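Because netsh output differs from machine to machine, it can be handy to try the parsing logic on captured text. The sketch below assumes a hypothetical helper, ConvertFrom-RouteText, that takes the data lines as a parameter instead of calling netsh. It also casts Idx to [int], which is a small deviation from the approach described above, made here so that the column sorts numerically.

```powershell
# Hypothetical helper: parses route lines passed in as text,
# so the logic can be tried without running netsh.
function ConvertFrom-RouteText ([string[]]$Lines) {
    foreach ($line in $Lines) {
        $parts = $line -split '\s+', 6
        New-Object -TypeName PSObject -Property @{
            Publish       = ($parts[0] -eq 'Yes')
            Type          = $parts[1]
            Metric        = [int]$parts[2]
            Prefix        = $parts[3]
            Idx           = [int]$parts[4]   # assumption: numeric index is preferred
            InterfaceName = $parts[5]
        }
    }
}

# Three data lines copied from the sample output (headers already removed)
$sample = @(
    'No       Manual    256  ::1/128                     1  Loopback Pseudo-Interface 1'
    'No       Manual    8    ::/0                       13  Local Area Connection* 18'
    'No       Manual    256  fe80::/64                  16  Wireless Network Connection'
)

$routes   = @(ConvertFrom-RouteText -Lines $sample | Sort-Object Metric)
$wireless = @($routes | Where-Object { $_.InterfaceName -like 'Wireless*' })

$routes[0].Prefix   # ::/0 (the route with the lowest metric comes first)
$wireless.Count     # 1
```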
Each line as a property
For the next example, let's take another netsh.exe subcommand:
    PS C:\> netsh interface ipv4 show tcpstats

    MIB-II TCP Statistics
    ------------------------------------------------------
    Timeout Algorithm:                  Van Jacobson's Algorithm
    Minimum Timeout:                    10
    Maximum Timeout:                    4294967295
    Maximum Connections:                Dynamic
    Active Opens:                       22166
    Passive Opens:                      141
    Attempts Failed:                    8531
    Established Resets:                 4756
    Currently Established:              62
    In Segments:                        2630476
    Out Segments:                       3746364
    Retransmitted Segments:             18219
    In Errors:                          0
    Out Resets:                         19249
This time we need to create a single object whose properties correspond to the lines of output. There are a lot of properties, and adding each of them to the object by hand would be tedious. But every property line follows the same pattern: the property name, then a colon, some spaces, and the value. So we can use a regular expression to parse these lines:
    ^([^:]+):\s*(\S.*)$
It captures everything up to the first colon in the first capture group, then skips the colon and any number of spaces, and captures the text from the first non-space character to the end of the line in the second group. Now we can apply this regular expression to the output and add a property to the object for each match using the Add-Member cmdlet:
    $Output = netsh interface ipv4 show tcpstats
    $Object = New-Object -Type PSObject
    $Output | Where {$_ -match '^([^:]+):\s*(\S.*)$'} |
        Foreach {$Object | Add-Member -Type noteproperty -Name $Matches[1] -Value $Matches[2]}
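As a quick check of the pattern, this sketch applies it to three lines copied from the sample output. The variable names are illustrative; the point is that the title and separator lines contain no colon and are therefore filtered out.

```powershell
$pattern = '^([^:]+):\s*(\S.*)$'

# A property line: name before the first colon, value after the padding
$isProperty = 'Minimum Timeout:                    10' -match $pattern
$name  = $Matches[1]    # Minimum Timeout
$value = $Matches[2]    # 10

# Title and separator lines do not match, so Where-Object drops them
$isTitle     = 'MIB-II TCP Statistics' -match $pattern
$isSeparator = '------------------------------------------------------' -match $pattern

$isProperty    # True
$isTitle       # False
$isSeparator   # False
```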
Almost done, but two problems remain. First, the property names contain spaces. PowerShell allows this, but to use such properties you need to enclose their names in quotes every time. In this case we can easily remove the spaces from property names using the -replace operator. Second, most of the values should be integers, but they are just strings containing digits. We should convert them to integers whenever possible. The best way to do this is the static TryParse method of the integer type: it reports whether the conversion succeeded, so values that cannot be converted are kept as strings. The final function looks like this:
    Function Get-IPv4TcpStats
    {
        $Output = netsh interface ipv4 show tcpstats
        $Object = New-Object -Type PSObject
        $Output | Where {$_ -match '^([^:]+):\s*(\S.*)$' } | Foreach {
            [int]$ParseResult = 0
            if ([int]::TryParse($Matches[2], [ref]$ParseResult))
            {
                $Value = $ParseResult
            }
            else
            {
                $Value = $Matches[2]
            }
            $Name = $Matches[1] -replace ' '
            $Object | Add-Member -Type NoteProperty -Name $Name -Value $Value
        }
        Write-Output $Object
    }
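The conversion step can be looked at in isolation. The sketch below wraps it in a hypothetical helper, ConvertTo-IntOrString, and feeds it three values from the sample output. Note that 4294967295 does not fit into a 32-bit integer, so with [int]::TryParse it stays a string.

```powershell
# Hypothetical helper: returns an [int] when the text parses as one,
# otherwise returns the original text unchanged.
function ConvertTo-IntOrString ([string]$Text) {
    [int]$parsed = 0
    if ([int]::TryParse($Text, [ref]$parsed)) { $parsed } else { $Text }
}

$a = ConvertTo-IntOrString '22166'        # becomes an [int]
$b = ConvertTo-IntOrString 'Dynamic'      # not a number, stays a string
$c = ConvertTo-IntOrString '4294967295'   # too large for Int32, stays a string

$a -is [int]      # True
$b -is [string]   # True
$c -is [string]   # True
```

If such large counters should also become numbers, one option would be to try [long]::TryParse in the same way; whether that is worth it depends on how the values are used afterwards.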
Objects and properties spanning multiple lines
In the last example, let's see how to deal with text output that contains many objects, where the objects span several lines and even some properties span more than one line. One of the utilities that produce output this way is ipconfig. Let's look at some of its output (I skipped some lines to save space):
    Windows IP Configuration

       Host Name . . . . . . . . . . . . : Alpha
       Primary Dns Suffix  . . . . . . . : lab.local
       DNS Suffix Search List. . . . . . : lab.local
                                           lab

    Ethernet adapter Local Area Connection:

       Connection-specific DNS Suffix  . : lab.local
       IPv4 Address. . . . . . . . . . . : 192.168.1.50(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       Default Gateway . . . . . . . . . : 192.168.1.1
       DNS Servers . . . . . . . . . . . : 192.168.1.102
                                           192.168.1.103
       NetBIOS over Tcpip. . . . . . . . : Enabled

    Tunnel adapter isatap.{946ECB18-56DC-4789-AA67-CEFE4F5EB648}:

       Media State . . . . . . . . . . . : Media disconnected
       Description . . . . . . . . . . . : Microsoft ISATAP Adapter
       Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes
This looks like a pretty hard case. The adapter names are on separate lines, and some properties (like DNS servers and the suffix search list) continue on the following lines without repeating the property name. But it is still quite parseable.
First we need to determine the types of lines, how each type differs from the others, and what each type means.
The first type is the adapter name. It is easy to detect because only adapter names start at the beginning of the line; all other types are prefixed with spaces. "Windows IP Configuration" is not an adapter, but for our approach this does not matter, so let's ignore that for now.
The second type is a property name and value. It consists of the property name followed by dots and spaces and then a colon. The value follows the colon and runs to the end of the line.
The third type is the second (third, and so on) value of a property. It does not contain the property name and is hard to detect directly, but we can treat anything that is not of the first or second type as this type.
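The three line types can be told apart with a couple of regular expressions. The sketch below assumes a hypothetical helper, Get-LineType, that only labels a line; the patterns are one possible way to express the rules described above, and the sample lines are copied from the ipconfig output.

```powershell
# Hypothetical helper that labels a line with one of the three types
function Get-LineType ([string]$Line) {
    switch -regex ($Line) {
        # No leading space: adapter name (or the "Windows IP Configuration" title)
        '^\S' { 'AdapterName'; break }
        # Leading spaces, property name, dots and spaces, colon, optional value
        '^\s+(\S[^.]+?)[.\s]+:(?: (.+\S))?\s*$' { 'Property'; break }
        # Anything else is treated as a continuation value
        default { 'Continuation' }
    }
}

$t1 = Get-LineType 'Ethernet adapter Local Area Connection:'
$t2 = Get-LineType '   DNS Servers . . . . . . . . . . . : 192.168.1.102'
$t3 = Get-LineType '                                       192.168.1.103'

$t1   # AdapterName
$t2   # Property
$t3   # Continuation
```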
Having decided which types of lines we need to parse, we need an overall parsing strategy:
The script removes all empty lines from the output.
The script then goes through the output line by line, comparing each line to regular expression patterns to determine its type.
If the line is an adapter name, the script creates a new object (New-Object) and places it in the $CurrentObject variable (if there is already an object in this variable, the script first writes it to the output). If the line also contains the word "adapter", the script puts the text before this word into the "Type" property of the object and the text after it into the "Name" property.
If the line is a property and value, the script adds this property with its value to $CurrentObject using Add-Member. The property name is also stored in the $CurrentProperty variable.
If the line contains just a property value, the script takes the contents of the property named in $CurrentProperty from the current object, appends the new value to it, and puts the result back on the object as an array.
Finally, we add a switch parameter to the script to specify whether it should output information about the adapters (everything except the first result) or only the global information (the first result only).
    function Get-IPConfig ([switch]$Global)
    {
        #Receiving output
        $Output = ipconfig /all
        #Removing empty lines
        $Output = $Output | Where {$_}
        #Start clean so a variable from a parent scope is never emitted
        $CurrentObject = $null
        #Parsing each line using regular expressions
        $Result = @(
            switch -regex ($Output)
            {
                #First character in line is not a space - adapter name
                '^\S' {
                    #If $CurrentObject is not empty, output it to the pipeline
                    if ($CurrentObject) {Write-Output $CurrentObject}
                    #Create new object and put it into $CurrentObject
                    $CurrentObject = New-Object -Type PSObject
                    #If line contains "adapter"...
                    if ($_ -match '^(.+) adapter (.+):')
                    {
                        #Add adapter name and type properties
                        $CurrentObject | Add-Member -type noteproperty -Name "Name" -Value $Matches[2]
                        $CurrentObject | Add-Member -type noteproperty -Name "Type" -Value $Matches[1]
                    }
                }
                #Property name, dots and spaces, colon, value
                '^\s+(\S[^.]+?)[.\s]+:(?: (.+\S))?\s*$' {
                    #Remove spaces from property name
                    $CurrentProperty = $Matches[1] -replace ' '
                    $CurrentObject | Add-Member -type noteproperty -name $CurrentProperty -Value $matches[2]
                }
                #Remaining values: start with spaces, don't contain a colon
                '^\s+(\S[^:]*?)\s*$' {
                    #Append value to the property, keeping the array flat
                    $CurrentObject.$CurrentProperty = @($CurrentObject.$CurrentProperty) + $Matches[1]
                }
            }
            #Output the last object; no later adapter line will flush it
            if ($CurrentObject) {Write-Output $CurrentObject}
        )
        #Check -Global switch
        if ($Global)
        {
            #Output only global information (first object)
            Write-Output $Result[0]
        }
        else
        {
            #Output adapter information (everything except first object)
            $Result | Select-Object -Skip 1 | Write-Output
        }
    }
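When a property collects continuation values, the way the new value is appended decides whether the result stays a flat array. The following sketch is independent of ipconfig: it builds a small object by hand and appends two extra values. The address 192.168.1.104 is made up for the illustration, and the variable names are assumptions.

```powershell
# An object with one property that already holds its first value
$adapter = New-Object -TypeName PSObject
$adapter | Add-Member -MemberType NoteProperty -Name DNSServers -Value '192.168.1.102'
$CurrentProperty = 'DNSServers'

# Each continuation value is appended; @() turns a single string into an array first
foreach ($extra in '192.168.1.103', '192.168.1.104') {
    $adapter.$CurrentProperty = @($adapter.$CurrentProperty) + $extra
}

# For contrast: wrapping an existing array together with a new value nests it
$nested = @(@('a', 'b'), 'c')

$adapter.DNSServers.Count   # 3
$adapter.DNSServers[2]      # 192.168.1.104
$nested.Count               # 2 (an array plus a string, not three strings)
```

With only one continuation line both forms give a two-element array, so the difference shows up only from the third value onwards.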