Deployment–Troubleshooting PDT

The PowerShell Deployment Toolkit (PDT) performs distributed installations of System Center 2012 SP1, including SQL and all prerequisites. If you are doing a full production, highly available, scale-out deployment, that can span a significant number of servers. Keeping track of the status of all of that is one of the interesting challenges PDT addresses, but what do you do when something goes wrong? The good news is that PDT gracefully handles failures mid-flow across that distributed installation, and also lets you restart a partially failed deployment.

Here’s what happens in the inner workings of PDT. For each server in a deployment, PDT dynamically determines the set of items that need to be installed and configured, and the order in which that needs to happen. Then, for each item, it uses one of a number of validation types to determine whether that item has already been done. If it has, PDT just skips that step. If it has not, PDT performs the necessary actions for that item, then re-runs the validation to make sure it worked. If the validation fails, PDT does not continue for that server, so any items after the failed item are not completed. If a server depends on an item on another server in the deployment (for example, a management server needs SQL to be installed on another server before it can be installed), it waits for the other server to complete that item before it continues. If the other server has failed any item prior to the dependency, the dependent server also fails at that point.

The result of all this is that, if something fails, you can wait for everything else in the deployment to complete, then fix the condition that caused the failure, then just run Installer.ps1 again.  Everything that worked the first time through will not be done again because PDT will validate that those items are already in place – so it effectively picks up where the failures in the previous run occurred.
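The validate, act, re-validate loop and the re-run behaviour described above can be sketched in a few lines. This is only an illustration of the logic, not PDT's actual code: PDT is a set of PowerShell scripts and runs servers in parallel, whereas this Python sketch simulates the servers in rounds. The names `Item`, `run_deployment`, `make_item`, and the servers `DB01` and `MS01` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Item:
    name: str
    validate: Callable[[], bool]  # True when the item is already in place
    install: Callable[[], None]   # performs the work for the item
    depends_on: Optional[Tuple[str, str]] = None  # (server, item) elsewhere


def run_deployment(plan: Dict[str, List[Item]]):
    """Simulate one run. Returns (status per server, items installed this run)."""
    position = {server: 0 for server in plan}
    completed = set()  # (server, item) pairs validated during this run
    failed = {}        # server -> name of the item it stopped at
    performed = []
    progress = True
    while progress:
        progress = False
        for server, items in plan.items():
            if server in failed or position[server] >= len(items):
                continue
            item = items[position[server]]
            if item.depends_on is not None and item.depends_on not in completed:
                # Fail if the other server failed before the dependency,
                # otherwise keep waiting for it.
                if item.depends_on[0] in failed:
                    failed[server] = item.name
                    progress = True
                continue
            if not item.validate():
                item.install()
                if not item.validate():  # re-run validation after the action
                    failed[server] = item.name
                    progress = True
                    continue
                performed.append((server, item.name))
            completed.add((server, item.name))
            position[server] += 1
            progress = True
    status = {}
    for server, items in plan.items():
        if server in failed:
            status[server] = "failed: " + failed[server]
        elif position[server] >= len(items):
            status[server] = "completed"
        else:
            status[server] = "waiting"
    return status, performed


# Simulated environment: what is installed, and which installs are broken.
installed = set()
broken = {"SQL"}


def make_item(server, name, depends_on=None):
    key = (server, name)

    def install():
        if name not in broken:  # a broken install silently does nothing
            installed.add(key)

    return Item(name, lambda: key in installed, install, depends_on)


plan = {
    "DB01": [make_item("DB01", ".NET"), make_item("DB01", "SQL")],
    "MS01": [
        make_item("MS01", ".NET"),
        make_item("MS01", "ManagementServer", depends_on=("DB01", "SQL")),
    ],
}

first_status, first_performed = run_deployment(plan)
broken.clear()  # fix the condition that caused the failure
second_status, second_performed = run_deployment(plan)

print(first_status, first_performed)
print(second_status, second_performed)
```

In the first run the SQL install fails on the hypothetical `DB01`, so the management server on `MS01` fails at its dependency. After the fault is cleared, the second run skips both .NET items because they validate, and installs only what was left.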

So, how can you tell what went wrong? There are two sets of log files to help you diagnose a failure: the PDT log files themselves, and the log files for the items being installed. The PDT log files are in the folder C:\Users\<username>\AppData\Local\Installer on the system running Installer.ps1. There is one log file per server being deployed to, as well as a consolidated log file, Installer.log. All of the files are in a format that can easily be read by the CMTrace.exe utility from Configuration Manager (remember, I am an old ConfigMgr guy at heart). The PDT logs list everything PDT is doing: getting information, setting variables, checking whether something needs to be installed, creating a task, waiting for that task to finish, waiting for dependencies, and so on. If something fails, the log will tell you what failed. The log files for items being installed are collected by PDT and copied to the installer machine at the end of the deployment. They can be found in C:\Temp\<guid>, where the guid is a unique identifier assigned to each run of the deployment. These logs are generated by each individual setup, so each has its own format. PDT collects them so that they are easy for you to find, but also because the way PDT runs tasks against remote machines means that some of the log files get deleted as part of the process, and we need to make sure you have access to them.
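If you would rather filter the PDT logs with a script than open them in CMTrace.exe, something like the following may help. It is a rough sketch that assumes the logs use the standard CMTrace line layout (`<![LOG[...]LOG]!>` followed by `time`, `date`, `component`, `context` and `type` attributes, where type 3 marks an error); the post only says the files are readable by CMTrace, so check a real log first. The sample lines are made up for illustration and are not real PDT output.

```python
import re
from typing import Iterator, NamedTuple


class LogEntry(NamedTuple):
    message: str
    date: str
    time: str
    component: str
    severity: str


# Standard CMTrace entry layout (assumed here, not confirmed for PDT).
ENTRY = re.compile(
    r'<!\[LOG\[(?P<message>.*?)\]LOG\]!>'
    r'<time="(?P<time>[^"]*)" date="(?P<date>[^"]*)" '
    r'component="(?P<component>[^"]*)" context="[^"]*" type="(?P<type>\d)"',
    re.DOTALL,  # messages can span several lines
)
SEVERITY = {"1": "info", "2": "warning", "3": "error"}


def parse_cmtrace(text: str) -> Iterator[LogEntry]:
    """Yield one LogEntry per CMTrace-style record found in text."""
    for m in ENTRY.finditer(text):
        yield LogEntry(
            m["message"], m["date"], m["time"], m["component"],
            SEVERITY.get(m["type"], "unknown"),
        )


# Invented sample records, for illustration only.
SAMPLE = (
    '<![LOG[Checking whether item is installed]LOG]!>'
    '<time="10:15:02.120+000" date="03-15-2013" component="Installer" '
    'context="" type="1" thread="4" file="Installer.ps1">\n'
    '<![LOG[Validation failed]LOG]!>'
    '<time="10:20:45.300+000" date="03-15-2013" component="Installer" '
    'context="" type="3" thread="4" file="Installer.ps1">\n'
)

errors = [e for e in parse_cmtrace(SAMPLE) if e.severity == "error"]
for e in errors:
    print(e.date, e.time, e.message)
```

To use it against a real file, read the contents of Installer.log (or one of the per-server logs) into a string and pass that to `parse_cmtrace` in place of `SAMPLE`.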

So, that’s how to start troubleshooting failures, plus a little insight into how PDT actually works.  More on that in future posts!

Comments

  • Anonymous
    January 01, 2003
    @Davey- validation was successful, thanks for pointing out the issue with $ in the password. @Rob- Installation fails on installing any/all SQL instances. I manually went into each server and installed .NET Framework 3.5 and attempted to install once again, same results. I have confirmed and saved off logs from each installation attempt. Thinking about wiping it all and starting from scratch.

  • Anonymous
    January 01, 2003
    Thanks Nathan!  We'll look at that and incorporate as appropriate in next release.

  • Anonymous
    January 01, 2003
    Rob, the machine I used does have both .NET 3.5 and .NET 4.5 installed. Neal

  • Anonymous
    January 01, 2003
    @Rob- In reviewing the Installer.log file, I did see where the SQL source files were trying to copy from C:\Temp. As I had my installation and downloaded files under D:\Temp, there was the disconnect. I have moved my files from C to D, and I am seeing more progress now that I have restarted the installer.ps1 again. (Have not started from scratch yet...) Neal

  • Anonymous
    January 01, 2003
    @Davey. Doh, yes, my passwords have the $ in them, completely forgot about PS barking about that. I will be changing the passwords and retrying the validation again. Thanks!! Neal

  • Anonymous
    January 01, 2003
    wnbowman - can you confirm that a version of the .NET Framework is installed on the machine you are running Installer.ps1 from?  Either 3.5 or 4.5.  The script definitely needs to be run from a domain joined machine.
