Multithreaded public folder report

 

Have you ever run 7 TB of data in public folders, spread across more than 400K folders?

What if you want to generate a report that shows some statistics about these folders and the permissions assigned to each one? Can you imagine how much time that report needs? It could take weeks to finish.

What could cause such a delay?

1- The commands take time to process each folder

2- Loop techniques handle the folders serially, one at a time

But PowerShell runs on .NET, so why not run it multithreaded?

Of course, we need to make sure that we don't slow down Exchange and that the system doesn't throttle us. But is it possible?

The answer is YES, if you can split the code so that it runs in parallel. To see how, let's look at the example below and walk through the stages.
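Before the full script, here is a minimal sketch of the splitting idea on its own, with no Exchange cmdlets involved. The function name Get-PagePlan and the sample numbers (10 folders, 4 threads) are assumptions made for illustration; they are not part of the original script. Each thread gets an offset and a page size, and the last thread also takes the remainder.

```powershell
# Hypothetical helper: plan how $Total items are split across $Threads workers.
function Get-PagePlan {
    param([int]$Total, [int]$Threads)

    # Floor explicitly: a plain [int] cast rounds to the nearest integer.
    $pageSize  = [int][math]::Floor($Total / $Threads)
    $remainder = $Total % $Threads
    $offset    = 0

    for ($i = 0; $i -lt $Threads; $i++) {
        $size = $pageSize
        # The last worker also takes the leftover items.
        if ($i -eq ($Threads - 1)) { $size += $remainder }

        New-Object PSObject -Property @{ RunNum = $i; Offset = $offset; PageSize = $size }
        $offset += $size
    }
}

# Sample numbers only: 10 folders across 4 threads.
$plan = @(Get-PagePlan -Total 10 -Threads 4)
$plan | ForEach-Object { "Thread $($_.RunNum): skip $($_.Offset), take $($_.PageSize)" }
```

Each entry maps directly to a Select-Object -Skip / -First pair, which is how the script block below picks its own slice of the folder list.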

#Stage 0: Some global variables. You can easily change the server name, or even loop through all servers, from a single location
$Svr = "mydeve14"
$ResultSize = "unlimited"
$Date = Get-Date -DisplayHint Date
$SDate = ($Date).ToShortDateString() -replace "/", "-"

#Stage 1: Get all folder statistics from the server
$Folders = Get-PublicFolderStatistics -Server $Svr -ResultSize $ResultSize -EA 0 | select Name, FolderPath, DatabaseName, CreationTime, Identity, ItemCount, LastUserAccessTime, LastUserModificationTime, OwnerCount, TotalItemSize, AssociatedItemCount, TotalAssociatedItemSize

#Stage 3: Define, inside the script block, what information we want to extract from each folder. Stage 2 comes at the end
$ScriptBlock = {
Param (
$Folders,
[int]$PageSize,
[int]$Offset,
[int]$RunNum,
[string]$conn
)

$Svr = "mydeve14"

#The runspace does not see the variables from stage 0, so the dates are set again here
$Date = Get-Date -DisplayHint Date
$SDate = ($Date).ToShortDateString() -replace "/", "-"

. 'C:\Program Files\Microsoft\Exchange Server\V14\bin\RemoteExchange.ps1'

connect-exchangeserver "$conn.mylab.dev" -allowClobber

#3.1 Define the log name

$Log = "c:\codes\pfstate\PFStats$SDate-$Svr-$RunNum.txt"

 

#3.2 Select folders based on page size and offset
$Folders = @($Folders | select-object -skip $Offset -First $PageSize) #@() keeps .Count valid even for a single folder
$PFCnt = $Folders.count #Number of folders handled by this thread
"Number of Folders: $PFCnt"

#3.3 Write the log header
"$Svr Public Folders as of $Date Total number of Folders: $PFCnt" > $Log
$now = [DateTime]::Now
"FolderName`tPath (PFID)`tDatabaseName`tItem Count`tTotal Item Size (KB)`tAssociated Item Count`tAssociated Item Size (KB)`tOldest Content`tNewest Content`tCreation Date`tLast User Access Date`tLast User Modification Date`tClient Permissions" >> $Log #File Header

 

#3.4 Let's do a foreach loop to get the item statistics for each folder
foreach($folder in $Folders){
"processing $($folder.name)"
$Db = $folder.DatabaseName
$PFID = "\" + $folder.FolderPath
$Count = $folder.ItemCount
$Size = $folder.TotalItemSize.value.toKB() #Total item size in KB
$CTime = $folder.CreationTime.ToShortDateString() #Creation date
$ATime = If ($folder.LastUserAccessTime) {$folder.LastUserAccessTime.ToShortDateString()}
$MTime = If ($folder.LastUserModificationTime) {$folder.LastUserModificationTime.ToShortDateString()}
$AssocItemCt = $folder.AssociatedItemCount #Hidden folder-associated items (such as views and rules), not attachments
$AssocItemSize = $folder.TotalAssociatedItemSize.value.toKB() #Size of the associated items in KB
$Contents = $null #Creation times of the folder content
$Oldest = $null #Date of oldest content
$Newest = $null #Date of newest content

 

#3.4.a Check the oldest and newest items
$Contents = @(Get-PublicFolderItemStatistics -Identity $PFID -Server $Svr | Select CreationTime | Sort CreationTime)
$Items = $Contents.Count #Number of items returned; @() above makes this work for a single item too
If ($Items -gt 0) {
    $Oldest = $Contents[0].CreationTime | Get-Date -Format dd/MM/yyyy
    $Newest = $Contents[$Items - 1].CreationTime | Get-Date -Format dd/MM/yyyy
    }

#3.4.b Let's check the client permissions for each folder
$Perm = $null
Get-PublicFolderClientPermission -Identity $PFID -Server $Svr | Select User,@{Name='AccessRights';Expression={[string]::join(', ', $_.AccessRights)}} | %{$Perm+="$($_.User)#$($_.AccessRights);"}
If ($Perm) {$Perm = $Perm.TrimEnd(";")} #Drop the trailing separator

#3.4.c Output to log file
"$($folder.Name)`t$PFID`t$Db`t$Count`t$Size`t$AssocItemCt`t$AssocItemSize`t$Oldest`t$Newest`t$CTime`t$ATime`t$MTime`t$Perm" >> $Log

} #Ending the loop
Get-Date -Format g >> $Log
} #Ending the script block

#Stage 2: Now the fun part: spawning multiple threads. We can create as many threads as we want, but to handle, say, 1000 threads you may need to check the PowerShell throttling policy so that the admin account running this script has no limits
#2.1: Set the page size and the number of threads we need. We limit the threads to 4 to avoid impacting the CPU

$Offset = 0

$numThreads = 3

#Check whether the division leaves a remainder. +1 because we start with thread 0, which results in 4 threads: 0,1,2,3

[int]$remainder = $Folders.count % ($numThreads+1)

#Round down to avoid decimals (a plain [int] cast would round to the nearest integer). The page size is dynamic, based on the number of folders and the threads we intend to run

[int]$PageSize = [math]::Floor($Folders.count / ($numThreads+1))

 

#2.2: create the powershell runspace
$myString = "this is session state!"
$sessionState = [System.Management.Automation.Runspaces.InitialSessionState]::CreateDefault()
$sessionstate.Variables.Add((New-Object -TypeName System.Management.Automation.Runspaces.SessionStateVariableEntry -ArgumentList "myString" ,$myString, "session state"))
$RunspacePool = [RunspaceFactory]::CreateRunspacePool(1, 4, $sessionState, $host)
$RunspacePool.Open()

#2.3 Start the threads (0 to $numThreads), each connecting to a dedicated server, and increase the offset after each one

$servers = @("server01","server02","server03","server04")
0..$numThreads | % {

    #Give the remainder to the last thread
    if($_ -eq $numThreads)
    {
        $PageSize = $PageSize + $remainder
    }

    "Start at offset $($Offset) till $($Offset+$PageSize)"

    #Due to processing issues, we are running each thread on a dedicated server
    $Conn = $servers[$_]
    "Will connect to $($conn)"

    $Job = [powershell]::Create().AddScript($ScriptBlock).AddParameter("Folders",$Folders).AddParameter("Offset",$Offset).AddParameter("PageSize", $PageSize).AddParameter("RunNum",$_).AddParameter("conn",$conn)
    $Job.RunspacePool = $RunspacePool
    $Job.BeginInvoke()
    $Offset += $PageSize
}

In testing, the multithreaded run retrieved the results for 26K folders from a single server in less than 1 hour, while the normal single-threaded foreach loop ran for more than 7 days without finishing the job.
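The script starts its threads with BeginInvoke() and does not keep the returned handles, so it cannot tell when the threads have finished. One possible way to wait for them is sketched below. It assumes a trivial stand-in work item instead of the Exchange script block, and the variable names $work, $pool, $jobs and $results are hypothetical. The idea is to store each PowerShell instance together with its handle, then call EndInvoke() on each one.

```powershell
# Stand-in work item; the real script would pass $ScriptBlock and its parameters.
$work = {
    param([int]$RunNum)
    Start-Sleep -Milliseconds 100
    "thread $RunNum done"
}

$pool = [RunspaceFactory]::CreateRunspacePool(1, 4)
$pool.Open()

# Keep every PowerShell instance together with its async handle.
$jobs = @()
0..3 | ForEach-Object {
    $ps = [powershell]::Create().AddScript($work).AddParameter("RunNum", $_)
    $ps.RunspacePool = $pool
    $jobs += New-Object PSObject -Property @{ Pipe = $ps; Handle = $ps.BeginInvoke() }
}

# EndInvoke blocks until that thread has finished and returns its output.
$results = @()
foreach ($job in $jobs) {
    $results += @($job.Pipe.EndInvoke($job.Handle))
    $job.Pipe.Dispose()
}

$pool.Close()
$pool.Dispose()

$results
```

Because EndInvoke() is called in the order the threads were started, the collected output stays in thread order even if a later thread finishes first.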

 

 

Enjoy :)

 

Ahmed Ashour

Support Escalation Engineer

 

 

# THIS CODE IS SAMPLE CODE. THESE SAMPLES ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND.
# MICROSOFT FURTHER DISCLAIMS ALL IMPLIED WARRANTIES INCLUDING WITHOUT LIMITATION ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR OF FITNESS FOR
# A PARTICULAR PURPOSE. THE ENTIRE RISK ARISING OUT OF THE USE OR PERFORMANCE OF THE SAMPLES REMAINS WITH YOU. IN NO EVENT SHALL
# MICROSOFT OR ITS SUPPLIERS BE LIABLE FOR ANY DAMAGES WHATSOEVER (INCLUDING, WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS PROFITS,
# BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR OTHER PECUNIARY LOSS) ARISING OUT OF THE USE OF OR INABILITY TO USE THE
# SAMPLES, EVEN IF MICROSOFT HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. BECAUSE SOME STATES DO NOT ALLOW THE EXCLUSION OR LIMITATION
# OF LIABILITY FOR CONSEQUENTIAL OR INCIDENTAL DAMAGES, THE ABOVE LIMITATION MAY NOT APPLY TO YOU.