Working with Deleted Build Data in Team Foundation Server 2010
Introduction
In Team Foundation Server 2008 you could delete builds both from the Build Explorer, using the context menu on one or more builds, and from the command line, using TfsBuild.exe. Retention policy would also delete builds automatically as specified. Once you deleted a build, you could no longer see it in the Build Explorer, since it no longer "existed", and TfsBuild.exe's delete command would complain that the build didn't exist. In Team Foundation Server 2010, you may be surprised to find that, although you don't see the build in Build Explorer, your data still exists in the database. Well, some of it.
There is more to the story. Another new feature in TFS 2010 is that you can specify what gets deleted when you delete a build, including the drop location (for binaries and other outputs), the version control label, test results, symbols (on a symbol server), and the build details (the actual database records related to the build). Until you delete the build details, your build will remain visible in Build Explorer. This article addresses actions taken after the build detail is deleted.
There are several reasons for TFS 2010 to keep build data around after the user deletes it, which I'll address in the following sections.
Build Information that Doesn't Get Deleted
So, when you delete the build details of a build, what doesn't get deleted from the database?
- The build's detail record. This gets a flag set on it that denotes it as a "deleted" build, but is otherwise intact.
- The build's top level information nodes and some specific node types. There are several pieces of information that are stored in information nodes that are critical for post-deletion activities. A few are:
  - Associated changesets, used for warehouse rebuilding.
  - Configuration summaries and build projects, used for warehouse rebuilding.
  - Symbol store transaction information, used to delete a build’s symbols.
- The build's outputs. These are used to query build information by binary signatures of an assembly or native image.
- The build's workspace. It can be queried to reproduce a build or for informational purposes.
You may wonder how much space this uses per build. Well, it will vary greatly depending on the size and complexity of your project, so I won't try to answer that here. I'll try to do some analysis and give some rough estimates in a later post.
Rebuilding the Warehouse
The warehouse is where reporting data is archived in order to keep historical data as well as to keep the production databases smaller/cleaner for performance and efficiency reasons. I won't go into any details about how or why you may rebuild the warehouse, since I don't have the expertise in that area.
Deleting Remaining Build Artifacts
Let's say you've been deleting builds via retention policy, but you haven't been deleting the build symbols. This can be useful if builds stay installed or in use for longer than you want to keep the builds themselves. Eventually, though, your symbol store will become bloated and you'll want to clean it up. One way is to view the logs in the store and decide which symbols to delete. Another is to let TFS do it for you. You can specify on the command line:
tfsbuild.exe delete /server:https://tfsat:8080/collection0 /builddefinition:"Adventure Works\Nightly Build" /daterange:~2008-12-31 /deleteoptions:Symbols
This particular command will connect to TFS at https://tfsat:8080/collection0 (/server:https://tfsat:8080/collection0) and delete symbols (/deleteoptions:Symbols) for all builds older than December 31, 2008 (/daterange:~2008-12-31) in the build definition called "Nightly Build" on the team project "Adventure Works" (/builddefinition:"Adventure Works\Nightly Build"). There are several variations of the delete command, which you can explore in the documentation or via command line help. The /deleteoptions parameter takes one or more of:
- Details
- Label
- DropLocation
- TestResults
- Symbols
- All
To combine several options, separate them with commas, like:
/deleteoptions:Details,DropLocation
Remember, you can delete any artifacts (that haven't been previously deleted) even after you delete the build details. That is because a limited amount of data is still stored in the database, including everything needed to delete all the artifacts.
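The comma-separated form maps naturally onto a bit-flag enumeration, which is also how the client libraries take the options (DeleteOptions.All appears later in this article). Here is a minimal sketch of that mapping. DeleteOptionsSketch and DeleteOptionsParser are hypothetical stand-ins written for illustration; the option names come from the list above, but the numeric values are assumptions and may not match the real DeleteOptions enumeration.

```csharp
using System;

// Hypothetical local stand-in so the sketch compiles without the TFS assemblies.
// Names mirror the /deleteoptions values; the numeric values are assumptions.
[Flags]
enum DeleteOptionsSketch
{
    None = 0,
    Details = 1,
    DropLocation = 2,
    Label = 4,
    Symbols = 8,
    TestResults = 16,
    All = Details | DropLocation | Label | Symbols | TestResults
}

static class DeleteOptionsParser
{
    // Turns text such as "Details,DropLocation" into Details | DropLocation.
    // Enum.TryParse understands comma-separated names for enum types.
    public static bool TryParse(string text, out DeleteOptionsSketch options)
    {
        return Enum.TryParse(text, true, out options);
    }
}

static class Program
{
    static void Main()
    {
        DeleteOptionsSketch options;
        if (DeleteOptionsParser.TryParse("Details,DropLocation", out options))
        {
            Console.WriteLine(options); // prints "Details, DropLocation"
            Console.WriteLine(options.HasFlag(DeleteOptionsSketch.Symbols)); // prints "False"
        }
    }
}
```

With a flags enumeration, a later delete of Symbols for the same build is simply a different combination of bits, which fits the point above that artifacts can be removed in separate passes.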
Matching Binaries to a Build
The concept here is that you can query TFS to ask it which build created a particular binary. This feature is poorly documented, but it can be pretty useful. During a build, an activity called AssociateBuildOutputs associates build outputs with the build. It reads the PE (Portable Executable) image headers of a managed assembly or native binary and extracts a couple of identifying pieces of information, which it stores in an indexed table of the database. Then, when you have a binary but for whatever reason don't know its version or build, you can read the same information from the PE headers and query the database. This can be done for any build that exists or that has already been deleted (even if the build details were deleted).
Unfortunately, there is no tool that supports this from the command line or Visual Studio. You'll have to write some code, but it isn't very hard. The TFS Build client libraries will do the header reading for you, so don't worry. Here is an example:
using System;
using Microsoft.TeamFoundation.Build.Client;
static IBuildDetail GetBuildForBuildOutput(IBuildServer buildServer, string outputPath)
{
    // Out values filled in by the GetSignature function
    Guid guid;
    int age;
    // GetSignature returns false if it failed
    // to read the executable file
    if (!ModuleSignatureReader.GetSignature(outputPath, out guid, out age))
    {
        return null;
    }
    // GetBuildForBuildOutput returns null if
    // an appropriate build isn't found
    return buildServer.GetBuildForBuildOutput(guid, age);
}
All we are doing here is wrapping a couple of functions so we can query by the assembly's path. ModuleSignatureReader.GetSignature opens the file, so it needs IO read access to it. It returns true for success and false for failure. IBuildServer.GetBuildForBuildOutput then calls out to TFS to do the query and return the result. It returns null if a build can't be found for the output.
Getting a Build's Workspace Mappings
TFS Build needs a particular workspace setup to build, just like any dev working in the source code. Often this will be the same as any dev's workspace, but sometimes it will include more or less in order to satisfy specific build requirements (such as building a complete set of products which depend on each other on a nightly basis, or building just a piece of the source tree for every checkin). You can query for the particular workspace mappings used from any build.
You can do this for any build normally, but since this article focuses on deleted builds, we'll be looking for a deleted build and then getting the workspace mappings. There are only a couple methods that take a QueryDeletedOption. The IBuildServer.QueryBuildsByUri function takes it, and any functions that take IBuildDetailSpec do, by way of the IBuildDetailSpec.QueryDeletedOption property. Here is what that QueryDeletedOption enum looks like:
public enum QueryDeletedOption
{
ExcludeDeleted = 0,
IncludeDeleted = 1,
OnlyDeleted = 2,
}
Keep in mind that when we talk about "deleted" builds, we really mean the build details (database records). That still applies with this enumeration. ExcludeDeleted means don't include any deleted builds in the results. IncludeDeleted means go ahead and query deleted builds as well as not-deleted builds. OnlyDeleted means just what it says, only query for deleted builds.
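To make the three values concrete, here is a small self-contained sketch that applies the same semantics to an in-memory list. It is an illustration only: DeletedFilter and the sample build numbers are hypothetical, the enumeration is copied from the listing above so the code compiles without the TFS assemblies, and the real filtering happens on the server.

```csharp
using System;
using System.Linq;

// Copied from the listing above so the sketch needs no TFS references.
enum QueryDeletedOption
{
    ExcludeDeleted = 0,
    IncludeDeleted = 1,
    OnlyDeleted = 2,
}

static class DeletedFilter
{
    // Returns true if a build with the given deleted state
    // should appear in the results for the given option.
    public static bool Matches(bool isDeleted, QueryDeletedOption option)
    {
        switch (option)
        {
            case QueryDeletedOption.ExcludeDeleted: return !isDeleted;
            case QueryDeletedOption.IncludeDeleted: return true;
            case QueryDeletedOption.OnlyDeleted: return isDeleted;
            default: throw new ArgumentOutOfRangeException("option");
        }
    }
}

static class Program
{
    static void Main()
    {
        // Hypothetical builds: (build number, were the build details deleted?)
        var builds = new[]
        {
            Tuple.Create("Nightly_1", true),
            Tuple.Create("Nightly_2", false),
        };

        foreach (QueryDeletedOption option in Enum.GetValues(typeof(QueryDeletedOption)))
        {
            var visible = builds
                .Where(b => DeletedFilter.Matches(b.Item2, option))
                .Select(b => b.Item1);
            Console.WriteLine("{0}: {1}", option, string.Join(", ", visible));
        }
        // Prints:
        // ExcludeDeleted: Nightly_2
        // IncludeDeleted: Nightly_1, Nightly_2
        // OnlyDeleted: Nightly_1
    }
}
```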
Again, the command line and Visual Studio client don't have access to this data. You'll have to write code:
// Query the desired build, asking only for deleted builds
IBuildDetail[] builds = buildServer.QueryBuildsByUri(new Uri[] { buildUri }, null, QueryOptions.None, QueryDeletedOption.OnlyDeleted);
// Take the single result and guard against the build not being found
IBuildDetail build = builds.Length > 0 ? builds[0] : null;
if (build != null)
{
    // Get the workspaces used by the build
    IWorkspaceInstance[] workspaces = build.GetWorkspaceInstances();
    // Iterate through the workspaces
    foreach (IWorkspaceInstance workspace in workspaces)
    {
        Console.WriteLine("Workspace:");
        foreach (IWorkspaceInstanceMapping mapping in workspace.Mappings)
        {
            Console.WriteLine("[{0}] {1}", mapping.MappingType, mapping.ServerItem);
        }
    }
}
Note that there are circumstances where multiple workspaces may have been used by a build (likely a custom build) so you may have multiple workspace instances.
Destroying Builds
Note: Destroying builds is not for the faint of heart. Once a build is destroyed it cannot be recovered.
Destroying builds is about cleaning up your database. I believe in 9 out of 10 cases there is no need to destroy builds … ever. However, if your projects do a lot of builds, destroying old builds can save space and even increase performance.
Destroying a build does not delete build artifacts such as version control labels and test results. It permanently deletes the build's information from the database. Typically, you will want to delete all the build artifacts before destroying the builds, because the database records that TFS relies on to delete those artifacts are gone afterwards. If you intend to keep the build artifacts, that is fine, too.
From the command line, you can destroy builds individually, by date range, or in a couple of other ways. See the documentation or command line help for more information. I envision the destroy command being used to destroy old builds. You may try something like:
tfsbuild.exe delete /server:https://tfsat:8080/collection0 /builddefinition:"Adventure Works\Nightly Build" /daterange:~2008-12-31 /deleteoptions:All
tfsbuild.exe destroy /server:https://tfsat:8080/collection0 /builddefinition:"Adventure Works\Nightly Build" /daterange:~2008-12-31
Of course, the delete command is optional. I use it to delete all the build artifacts prior to destroying the builds. These command lines are very similar to the one earlier in this article. Basically, we are deleting all build artifacts and then destroying the database records for builds older than December 31, 2008 for the "Nightly Build" build definition in the "Adventure Works" team project.
You can also destroy builds from the TFS Build client libraries. You'd do it something like this:
// Create a build detail spec to query all builds, including
// deleted ones, that finished before December 31, 2008
IBuildDetailSpec spec = buildServer.CreateBuildDetailSpec(definition);
spec.MaxFinishTime = DateTime.Parse("2008-12-31");
spec.QueryDeletedOption = QueryDeletedOption.IncludeDeleted;
// Query the builds
IBuildDetail[] builds = buildServer.QueryBuilds(spec).Builds;
// Delete any remaining artifacts for all the builds
// (Note: this can take a long time, so do it one at a time)
foreach (IBuildDetail build in builds)
{
    build.Delete(DeleteOptions.All);
}
// Now destroy the builds
buildServer.DestroyBuilds(builds);
Again, the loop that calls build.Delete is optional, and is just used to delete all the build artifacts prior to destroying the build data.
Conclusion
As you can see, there are several things you can do with deleted builds. Many of them require getting down and dirty with the command line or custom-coded tools, but nothing is terribly difficult to manage. The new /daterange option for the delete and destroy commands of TfsBuild.exe will be a nice addition for those of you who like to use the baked-in tools, but I recommend PowerShell scripts to give you better control of your scheduled maintenance tasks.