Versioning NET4 Workflow Services in Windows Server AppFabric

There is tremendous interest in versioning .NET4 Workflows. This blog post provides guidance on common scenarios that can be supported.

 

Background

Workflow versioning is required when a change in a business process causes a modification in the implementation of a .NET4 Workflow Service, and that change may need to be shielded from, or simply not be evident to, the client.  In other words, existing clients (already deployed) can continue to communicate with the Service without any changes.  It is entirely up to the Server to decide whether instances of old and new implementations should exist side by side, whether existing instances should be upgraded to the new Definition, and how to route client requests to instances of different implementations.

 

When a new Workflow Definition is deployed, there is a choice on what to do with existing Instances of the old Definition(s). The first choice is to terminate existing Workflow Instances and the second one is to phase out the Instances. 

 

Terminating existing Workflow Instances is kind of extreme, and this approach is only viable if the Workflow definition contains a catastrophic flaw or the business process is so outdated that existing Instances cannot be allowed to continue. Phasing out Instances after they run their due course is the more prevalent need, since existing Workflow Instances remain unaffected and continue with their original definitions.  Over time all of these Instances complete and the old definition can be discarded. New instances will always be started with the new definition.

 

Before we proceed further, let's recap the requirement that each Workflow Definition is hosted by a single Workflow Service Host (WSH).  In our case, since Instances of multiple Definitions must run side-by-side, a corresponding number of WSHs must be created.  However, these WSHs may share a single process or be divided between any number of processes depending on the load patterns.

 

This blog will focus on the ‘Phasing out Instances’ approach. This approach requires a logical layer between the Client and Server (with the WSH) that will interrogate the Client message and route it to the appropriate Workflow Service Host that is hosting the Workflow Definition.

 

The new WCF Routing Service is used in this scenario. The Routing Service is placed in front of all (WSH) Hosts and routes incoming messages to the corresponding Hosts.  All Clients send messages to the endpoint exposed by the Routing Service.

Requirement

The requirements for this scenario are summarized as:

· New Instances should always be started with the latest version of the Workflow Definition.

· Clients of in-flight processes should be able to reach the right version of the Workflow Definition.

Assumption

Clients are agnostic of the Workflow versioning, and once deployed they are not, or cannot be, modified when a new Workflow version is deployed to the server.

‘Layback Routing’ - Design Approach

The design approach, named ‘Layback Routing’ in this blog, is based on customized routing logic that processes correlation errors. When a correlation fails on the latest version, the logic retries down a list of older versions until the correlation succeeds or all versions are exhausted. At that point the correlation error is returned to the client.
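As a rough illustration of that retry order, here is a minimal, self-contained sketch in plain C# with no WCF involved. The names LaybackRoutingSketch, CorrelationFailedException and the send delegate are hypothetical stand-ins, not part of the solution described in this post. The sketch uses a simple loop over the versions, newest first.

```csharp
using System;
using System.Collections.Generic;

public static class LaybackRoutingSketch
{
    // Hypothetical stand-in for the correlation fault a Workflow Service Host returns.
    public class CorrelationFailedException : Exception
    {
        public CorrelationFailedException(string message) : base(message) { }
    }

    // Tries each version endpoint in order (newest first) and returns the first reply.
    // 'send' stands in for forwarding the message to one Workflow Service Host.
    public static string Route(
        IList<string> endpointsNewestFirst,
        string correlationKey,
        Func<string, string, string> send)
    {
        CorrelationFailedException lastFault = null;

        foreach (string endpoint in endpointsNewestFirst)
        {
            try
            {
                return send(endpoint, correlationKey);
            }
            catch (CorrelationFailedException fault)
            {
                // This version does not own the Instance; fall back to the next older one.
                lastFault = fault;
            }
        }

        // Every version rejected the message, so the correlation error goes back to the client.
        throw lastFault ?? new CorrelationFailedException("No endpoints configured.");
    }
}
```

A caller would pass the list of version endpoints with the latest Definition first, so that new Instances always start on the latest version and only messages for older Instances pay the cost of the extra attempts.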

Caveat

The solution and design approach is suitable for long-running Workflows and should be used in low to medium invocation scenarios, where bottom-line invocation performance is not a constraint. For very high-volume processing, other designs should be pursued.

Solution

The Routing solution elaborated here is a simple customization of the WCF Routing Service.

 

The WCF Routing Service (System.ServiceModel.Routing) is a general purpose, programmable router that is a handy intermediary in a number of different situations, especially around managing communications between Clients and Services. The Routing Service is programmable via an API or by web.config. One of the very handy features of the Routing Service that Layback Routing utilizes is the backup list. This is a list of service file endpoints that will act as a backup for when a Service’s primary endpoint cannot be found or is unavailable.

 

A message that arrives from a Client causes a Persisted Instance to resume. This Instance expects to communicate with version 1 of the Service file, but the message reaches the newer version 2 instead, which causes a correlation fault. While the Routing Service is not built to catch correlation faults, the custom endpoint behavior added to the Routing Service catches this fault and throws an EndpointNotFoundException. The WCF Routing Service catches this type of exception and attempts to match the message with one of the subsequent endpoints listed in this Service file’s backup list. All of the older versions of this Service file are contained in this list, so one of them should match the Instance, and execution will then be able to resume.
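The solution in this post configures the backup list through the Routing Service’s web.config. Purely as an illustration of the same idea through the API, the sketch below builds a RoutingConfiguration in which the first endpoint of a filter-table entry is the primary and the remaining ones act as its backup list. The class name LaybackRoutingConfig, the use of BasicHttpBinding and the addresses passed in are assumptions made for the example.

```csharp
using System.Collections.Generic;
using System.ServiceModel;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;
using System.ServiceModel.Routing;

public static class LaybackRoutingConfig
{
    // routerAddress: the address clients send to (hypothetical).
    // versionAddressesNewestFirst: latest Definition first, then older versions.
    public static RoutingConfiguration Build(
        string routerAddress,
        IList<string> versionAddressesNewestFirst)
    {
        ContractDescription contract =
            ContractDescription.GetContract(typeof(IRequestReplyRouter));

        // The first endpoint is the primary; the rest are tried in order as backups.
        var endpoints = new List<ServiceEndpoint>();
        foreach (string address in versionAddressesNewestFirst)
        {
            endpoints.Add(new ServiceEndpoint(
                contract, new BasicHttpBinding(), new EndpointAddress(address)));
        }

        var configuration = new RoutingConfiguration();
        configuration.FilterTable.Add(
            new EndpointAddressMessageFilter(new EndpointAddress(routerAddress)),
            endpoints);
        return configuration;
    }
}
```

The resulting configuration would be applied to the router’s host with a RoutingBehavior. On its own the Routing Service only moves on to a backup endpoint for communication failures, which is why the custom endpoint behavior that turns a correlation fault into an EndpointNotFoundException is still needed.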

Custom Behavior applied to Web.Config

The custom endpoint behavior mentioned here is a ClientMessageInspector. Implementing this custom behavior requires three files:

1. BehaviorExtensionElement - Sets up the custom behavior, this is the class used from the web.config of the Routing Service.

2. IEndpointBehavior - Manages the message channel binding parameters.

3. IClientMessageInspector - Acts on the message.

 

The code below shows where to add the reference to this custom behavior in the Routing Service’s web.config file:

    <system.serviceModel>
      ...
      <behaviors>
        ...
        <endpointBehaviors>
          <behavior>
            <persistenceFaultToClientException />
          </behavior>
        </endpointBehaviors>
      </behaviors>

      <extensions>
        <behaviorExtensions>
          <add name="persistenceFaultToClientException" type="ServiceExtensions.PersistenceFaultToClientExceptionBehaviorElement, ServiceExtensions, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null" />
        </behaviorExtensions>
      </extensions>
      ...

 

The class PersistenceFaultToClientExceptionInspector implements IClientMessageInspector. Its AfterReceiveReply function inspects the reply after it comes back from the destination service. If the reply is a fault and the fault contains wording indicating that it is a correlation fault, we assume that this is due to a mismatched version and throw an EndpointNotFoundException. The Routing Service will then try the message against the endpoints in the backup list and hopefully find a matching version. If in fact this correlation error is due to a more serious condition, the Routing Service will still try each endpoint in the backup list, but they will all fail and the client will see the error message.

    using System.ServiceModel;
    using System.ServiceModel.Channels;
    using System.ServiceModel.Dispatcher;

    namespace ServiceExtensions
    {
        public class PersistenceFaultToClientExceptionInspector : IClientMessageInspector
        {
            // Runs on the router after the destination service has replied.
            public void AfterReceiveReply(ref Message reply, object correlationState)
            {
                if (reply.IsFault)
                {
                    string faultText = reply.ToString();

                    // A correlation fault is assumed to mean "wrong version", so it is
                    // converted into an exception that makes the Routing Service fail over.
                    if (faultText.Contains("contained incorrect correlation data"))
                    {
                        throw new EndpointNotFoundException("Correlation failed, try another service version.");
                    }
                }
            }

            // Nothing to do on the way out; no correlation state is needed.
            public object BeforeSendRequest(ref Message request, IClientChannel channel)
            {
                return null;
            }
        }
    }
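The other two classes are not listed in this post. A minimal sketch of what they might look like follows. The element class name PersistenceFaultToClientExceptionBehaviorElement is taken from the web.config above, while the name PersistenceFaultToClientExceptionBehavior and the method bodies are assumptions. The sketch relies on the PersistenceFaultToClientExceptionInspector class shown above being in the same assembly.

```csharp
using System;
using System.ServiceModel.Channels;
using System.ServiceModel.Configuration;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;

namespace ServiceExtensions
{
    // Endpoint behavior: attaches the inspector to the router's outgoing (client) endpoints.
    public class PersistenceFaultToClientExceptionBehavior : IEndpointBehavior
    {
        public void AddBindingParameters(ServiceEndpoint endpoint,
            BindingParameterCollection bindingParameters)
        {
            // No binding parameters are needed for this behavior.
        }

        public void ApplyClientBehavior(ServiceEndpoint endpoint, ClientRuntime clientRuntime)
        {
            clientRuntime.MessageInspectors.Add(new PersistenceFaultToClientExceptionInspector());
        }

        public void ApplyDispatchBehavior(ServiceEndpoint endpoint,
            EndpointDispatcher endpointDispatcher)
        {
            // Not used: the behavior only matters on the client side of the router.
        }

        public void Validate(ServiceEndpoint endpoint)
        {
        }
    }

    // Configuration element: lets web.config refer to the behavior by name.
    public class PersistenceFaultToClientExceptionBehaviorElement : BehaviorExtensionElement
    {
        public override Type BehaviorType
        {
            get { return typeof(PersistenceFaultToClientExceptionBehavior); }
        }

        protected override object CreateBehavior()
        {
            return new PersistenceFaultToClientExceptionBehavior();
        }
    }
}
```

Because the endpoint behavior in the web.config above has no name, it applies to every client endpoint of the Routing Service, so each versioned endpoint gets the inspector.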

AppFabric Dashboard Behavior

If we were to use the behavior as is, the Dashboard in AppFabric would show an error indicating that the persistenceFaultToClientException element is not recognized. To avoid this error, we need to place a schema file in %windir%\system32\inetsrv\config\schema.

    <!-- Schema file: Service_Extensions_Schema.xml -->
    <configSchema>
      <sectionSchema name="system.serviceModel/behaviors">
        <element name="endpointBehaviors">
          <collection addElement="behavior" removeElement="remove" clearElement="clear" allowDuplicates="true">
            <element name="persistenceFaultToClientException"/>
          </collection>
        </element>
      </sectionSchema>
    </configSchema>

With the ClientMessageInspector in place, we can rely on the Routing Service to route messages to older versions of the service file when needed. Because this versioning and routing work takes place on the server after each publish, we have a custom provider on the server’s instance of Web Deploy that will accomplish this task.

So, with the project published to the Server and the enabled protocols set - the Client application calls the custom provider in order to perform the Service file versioning and routing configuration updates. The source and destination base options both point to the IIS Server. The provider only needs to know the website and application name for these updates. This is set in the destination’s DeploymentProviderOptions.Path variable. The custom provider we use is named versionedPublish.

    // destinationBaseOptions and syncOptions are assumed to be set up elsewhere (not shown).
    DeploymentProviderOptions sourceProviderOptions = new DeploymentProviderOptions("versionedPublish");

    // The source path is not used by the provider, so it is left empty.
    sourceProviderOptions.Path = "";
    DeploymentObject deploymentObject = DeploymentManager.CreateObject(
        sourceProviderOptions,
        destinationBaseOptions);

    DeploymentProviderOptions destinationOptions = new DeploymentProviderOptions("versionedPublish");

    // Website and application name, in the form "site/application".
    destinationOptions.Path = "Default Web Site/WorkflowApplication";

    deploymentObject.SyncTo(destinationOptions,
        destinationBaseOptions, syncOptions);

Creating a Web Deploy Custom Provider

This Custom Provider’s assembly (dll) resides in the Extensibility folder of Web Deploy on the IIS Server. The client must also have this Custom Provider dll in its own Web Deploy Extensibilities folder; otherwise Web Deploy won’t recognize the Provider being called and will return an error instead.

Note that you need to create the Extensibilities folder as it is not created upon installation of Web Deploy (location: %program files%\IIS\Microsoft Web Deploy\Extensibilities).

To create a Custom Provider, we use a class library project in Visual Studio 2010 and build it with a target framework of .NET 3.5. Web Deploy won’t yet recognize a provider written in .NET 4.0. Also, the project must reference Microsoft.Web.Deployment.dll and Microsoft.Web.Delegation.dll, both found in the Web Deploy folder. A Custom Provider consists of two files, a DeploymentObjectProvider and a DeploymentProviderFactory.

The DeploymentProviderFactory provides an interface for the Web Deployment Agent to interact with the Custom Provider. Note that both the Name and FriendlyName properties return the name of the Custom Provider.

    // File VersionedPublishProviderFactory.cs
    // ...
    using Microsoft.Web.Deployment;

    namespace Providers.WebDeployUtilities
    {
        [DeploymentProviderFactory]
        public class VersionedPublishProviderFactory : DeploymentProviderFactory
        {
            protected override DeploymentObjectProvider Create(DeploymentProviderContext providerContext, DeploymentBaseContext baseContext)
            {
                return new VersionedPublishProvider(providerContext, baseContext);
            }

            public override string Description
            {
                get { return @"Custom provider for versioning published files."; }
            }

            public override string ExamplePath
            {
                get { return @"Destination Web Site/ApplicationName"; }
            }

            public override string FriendlyName
            {
                get { return "versionedPublish"; }
            }

            public override string Name
            {
                get { return "versionedPublish"; }
            }
        }
    }

Next up is the Custom Provider. Just after the class declaration are two lines that define the name of the custom provider as well as the name of the provider’s key attribute. Function GetAttributes is used to determine whether the provider is looking at the source or the destination. When the provider is at the destination, a DeploymentException is thrown which ends up calling the Add function.

    // File VersionedPublishProvider.cs

    using System.Collections.Generic; // List<string>, used by the Add function
    using System.Diagnostics;
    using System.IO;
    using System.Net;
    using System.Text.RegularExpressions;
    using System.Web;
    using System.Web.Routing;
    using System.Xml.Linq;
    using Microsoft.Web.Administration;
    using Microsoft.Web.Deployment;

    namespace Providers.WebDeployUtilities
    {
        public class VersionedPublishProvider : DeploymentObjectProvider
        {
            // Name of the custom provider and of its key attribute.
            internal const string ObjectName = "versionedPublish";
            internal const string KeyAttributeName = "path";

            public VersionedPublishProvider(DeploymentProviderContext providerContext, DeploymentBaseContext baseContext)
                : base(providerContext, baseContext)
            {
                this.FilePath = providerContext.Path;
            }

            protected internal string FilePath
            { get; set; }

            #region DeploymentObjectProvider members
            public override void GetAttributes(DeploymentAddAttributeContext addContext)
            {
                if (this.BaseContext.IsDestinationObject)
                {
                    // At the destination: throwing here ends up calling the Add function.
                    throw new DeploymentException();
                }
                else
                {
                    // At the source: nothing to do.
                }

                base.GetAttributes(addContext);
            }

            public override DeploymentObjectAttributeData CreateKeyAttributeData()
            {
                DeploymentObjectAttributeData attributeData = new DeploymentObjectAttributeData(
                    VersionedPublishProvider.KeyAttributeName,
                    this.FilePath,
                    DeploymentObjectAttributeKind.CaseInsensitiveCompare);

                return attributeData;
            }


The Add function (below) is called when it is time to act on the destination. The purpose of our Custom Provider here is to create new versions of the service files and modify the Routing configuration as described above. The following code is a summary of the functions involved. A complete version of this code is available for download along with this blog posting.

 

The first step in the process is to take the data passed in by destinationProviderOptions.Path when this Custom Provider is called. From inside this Provider, this data is in variable this.FilePath. We separate this string into website and application and go to work. For your reference, the source’s DeploymentProviderOptions.Path value would be accessed with source.ProviderContext.Path.

Next, function findAppLocation uses ServerManager (Microsoft.Web.Administration) to get the physical location of the folder for this IIS Application. With this location handy, findFilesNoVersions discovers all of the service files in this application. For each service file, a versioned copy is created and ServerManager is used to update Layback Routing’s configuration file.

            public override void Add(DeploymentObject source, bool whatIf)
            {
                if (!whatIf)
                {
                    string siteNameAppName = this.FilePath;
                    string[] siteAndAppArray = siteNameAppName.Split(new string[] { "/" }, StringSplitOptions.RemoveEmptyEntries);
                    string siteName = siteAndAppArray[0];
                    string appName = siteAndAppArray[1];

                    string correctAppPath = findAppLocation(siteName, appName);
                    List<string> fileNamesNoVersions = new List<string>();
                    string versionKey = "_wfs";

                    findFilesNoVersions(correctAppPath, fileNamesNoVersions);

                    foreach (string fileNameNoVersion in fileNamesNoVersions)
                    {
                        string justName;
                        int highVer;
                        determineNextVersion(correctAppPath, fileNameNoVersion, out justName, out highVer);
                        createVersionedCopy(correctAppPath, versionKey, justName, highVer);

                        string routerName = "LaybackRouting";
                        using (ServerManager serverManager2 = new ServerManager())
                        {
                            string appNameJustName = appName + justName;
                            appNameJustName = appNameJustName.Replace(' ', '_');
                            string appNameJustNameAddress = HttpUtility.UrlPathEncode(appName) + "/" + HttpUtility.UrlPathEncode(justName);
                            routerName = HttpUtility.UrlEncode(routerName);
                            Configuration routerConfig = serverManager2.GetWebConfiguration(siteName, "/" + routerName);
                            ConfigurationElementCollection clientCollection = addClientEndpointWithoutVersion(appNameJustName, appNameJustNameAddress, routerConfig);
                            addClientEndpointWithVersion(versionKey, highVer, appNameJustName, appNameJustNameAddress, clientCollection);
                            ConfigurationSection routingSection;
                            ConfigurationElementCollection backupListsCollection;
                            ConfigurationElement listToCreate;
                            findBackupList(appNameJustName, routerConfig, out routingSection, out backupListsCollection, out listToCreate);
                            if (listToCreate != null)
                            {
                                addBackupListEndpoint(versionKey, highVer, appNameJustName, listToCreate);
                                findAndEditLaybackTable(versionKey, highVer, appNameJustName, routingSection);
                            }
                            else
                            {
                                makeBackupListAddEndpoint(versionKey, highVer, appNameJustName, backupListsCollection);
                                findAndEditLaybackTable(versionKey, highVer, appNameJustName, routingSection);
                                addEndpointAddressFilter(routerName, appNameJustName, appNameJustNameAddress, routingSection);
                            }
                            serverManager2.CommitChanges();
                        }
                    }
                }
            }
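The helper functions called from Add are not listed in this post. To give a feel for the versioning step, here is a small, self-contained sketch of how the next version number and the versioned file name could be computed. It assumes versioned copies are named by appending the version key and a number to the file name (for example OrderService_wfs2.xamlx). That naming scheme, the file names and the class VersionNaming are assumptions for illustration and may differ from the downloadable code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class VersionNaming
{
    // Returns the next version number for a service file, given the file names already
    // present in the application folder. Assumes copies are named <name><versionKey><n><ext>.
    public static int NextVersion(
        IEnumerable<string> existingFiles, string serviceFile, string versionKey)
    {
        string justName = Path.GetFileNameWithoutExtension(serviceFile);
        string extension = Path.GetExtension(serviceFile);
        Regex pattern = new Regex(
            "^" + Regex.Escape(justName + versionKey) + "([0-9]+)" + Regex.Escape(extension) + "$",
            RegexOptions.IgnoreCase);

        int highest = 0;
        foreach (string file in existingFiles)
        {
            Match match = pattern.Match(Path.GetFileName(file));
            if (match.Success)
            {
                highest = Math.Max(highest, int.Parse(match.Groups[1].Value));
            }
        }

        // No versioned copies yet means the first copy gets version 1.
        return highest + 1;
    }

    // Builds the file name of the versioned copy from the unversioned service file name.
    public static string VersionedName(string serviceFile, string versionKey, int version)
    {
        return Path.GetFileNameWithoutExtension(serviceFile) + versionKey + version
            + Path.GetExtension(serviceFile);
    }
}
```

With a scheme like this, the unversioned file always carries the latest Definition, and each publish adds one more versioned copy for the router’s backup list to fall back to.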

Limitations

There is a performance overhead incurred in the Layback Routing process of attempting to successfully connect a message with the appropriate Definition version. Each request for an older version takes Processing/CPU resources to check the Persistence Store for matching correlation keys, and to return an error to the Routing Service (it's local, so no additional network bandwidth is consumed).

 

It is possible to optimize this scenario further. The WCF Routing Service’s use of the backup list happens on every request for an old version; caching the resulting mapping between correlation keys and endpoints, possibly using AppFabric Caching, would help optimize performance and resource utilization. 
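As a sketch of that optimization, the mapping could be kept in a small cache keyed by correlation key. The example below uses an in-process ConcurrentDictionary rather than AppFabric Caching, purely to keep it self-contained. The class CorrelationEndpointCache and the resolveByLayback delegate are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;

public class CorrelationEndpointCache
{
    private readonly ConcurrentDictionary<string, string> endpointByKey =
        new ConcurrentDictionary<string, string>();

    // Returns the cached endpoint for a correlation key. On a miss, the expensive
    // Layback lookup runs and its result is remembered for later messages.
    public string GetOrResolve(string correlationKey, Func<string, string> resolveByLayback)
    {
        return endpointByKey.GetOrAdd(correlationKey, resolveByLayback);
    }

    // Removes the entry, for example when the Workflow Instance completes.
    public bool Forget(string correlationKey)
    {
        string ignored;
        return endpointByKey.TryRemove(correlationKey, out ignored);
    }
}
```

A shared cache would be needed instead if the Routing Service runs on more than one server, since an in-process dictionary is only visible to the process that filled it.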

Namaste!

Versioning_Flow.PNG

Comments

  • Anonymous
    November 30, 2010
    Hmmm.  I appreciate what you're trying to do with this post - I'm always grateful to people who take the time to write up what they've learnt - but I think you could do with applying the Single Responsibility Principle to this blog post!  You start off teaching us about Layback Routing and end up teaching us about customizing deployment and somehow the former gets lost in the latter.  This would be a lot more useful to me if you split it into two posts, one about routing (which I need to sort out today) and one about custom deployment (which I might never look at).