Parallelism in Windows Workflow Foundation (WF)

QUESTION: Does a ParallelActivity start one thread for each branch? How many threads will it use for processing?

ANSWER: The short answers are "no" and "1". If you've heard it before, then these answers make sense. If not, then read on for an explanation of parallelism in WF.

To promote programming simplicity for both the custom activity writer and the workflow writer, WF guarantees that only one .NET thread will be executing any portion of a workflow at any given time. This means that the handler for your CodeActivity, the Execute method for your custom activity, and the sequence in your EventHandlerActivity never have to worry that some other part of the workflow is executing at the same time. If you think about it a bit, you will see that this greatly simplifies programming WF applications and is one of the reasons why writing complex custom activities is a surprisingly simple task.

Isn't that bad?

The usual knee-jerk reaction is outcry about the loss of multi-threaded parallelism. Isn't that a step backward? The answer is no, this is not a step backward. First, we are only stating that there is a single threaded nature within a single instance of a workflow. This means that if you have two instances of the same workflow executing simultaneously then they will exhibit true multi-threaded parallelism. 

Second, let's remember the nature of the workflow beast. Workflows are meant to coordinate tasks, both human and computer, in an event-driven world over an unknown amount of time. The workflow itself processes in short bursts with long periods of dormancy in between. For example, a workflow might send an e-mail requesting that a task be performed and then persist to the database. Only when the task is complete will the workflow come back to life and process some more, and this processing should be limited to deciding what action should be taken next and delegating that to some external source, whether that be a human or a service added to the WorkflowRuntime.

This behavior of a single workflow instance means that true parallelism is wholly unnecessary. If the workflow assigns 10 tasks in parallel, then it is highly unlikely that there will be a processing bottleneck when collating the results, which return scattered across time. While I've got no hard evidence to back this up, it is my opinion that a single thread per instance actually improves the performance of WF rather than hindering it, when you consider the reduced complexity in activity execution code, the lack of necessity for locking constructs, and the burst processing nature of workflow.

How it works

WF is a scheduled environment. Abstractly, you can consider that every instance of a workflow has its own scheduler, which is just a queue of delegates. The scheduler simply loops through a two-step sequence: dequeue the next delegate and call it. If there are no more items on the queue, then the workflow is idle. If we consider that the scheduler has just one thread and invokes the delegates synchronously, then we see where the single-threaded guarantee comes from.
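The loop described above can be sketched in a few lines. This is a toy model in Python, not WF code: the Scheduler class, its method names, and the work item names are all made up for illustration, and a real WF scheduler does considerably more.

```python
from collections import deque


class Scheduler:
    """Toy per-instance scheduler: a FIFO queue of callables run on one thread.

    Hypothetical model for illustration only; this is not a WF type.
    """

    def __init__(self):
        self.queue = deque()
        self.log = []  # names of work items, in the order they ran

    def schedule(self, name, work):
        # A running work item may call schedule() to enqueue more work.
        self.queue.append((name, work))

    def run_until_idle(self):
        # The two-step loop: dequeue the next item, then call it synchronously.
        while self.queue:
            name, work = self.queue.popleft()
            self.log.append(name)
            work()
        return "idle"


scheduler = Scheduler()
# a.Execute enqueues a follow-up item while it runs; b.Execute does nothing.
scheduler.schedule("a.Execute", lambda: scheduler.schedule("a.OnClosed", lambda: None))
scheduler.schedule("b.Execute", lambda: None)
state = scheduler.run_until_idle()
print(scheduler.log, state)
```

Because every item runs to completion before the next one is dequeued, no two items can ever overlap, which is all the single-threaded guarantee amounts to in this model.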

During the execution of a delegate (like Activity.Execute) there are several occurrences which can cause new items to be added to the scheduler queue. ActivityExecutionContext.ExecuteActivity() can be used to schedule a child's execution, throwing an exception will cause the runtime to schedule the HandleFault method for the activity, calling Activity.Invoke<>() will cause the specified delegate to be scheduled, and returning a value of ActivityExecutionStatus.Closed will cause the runtime to schedule the OnClosed method. These are just some of the triggers which cause new items to be added to the queue.

Extended ParallelActivity walkthrough

The ParallelActivity, when executed, will schedule the Activity.Execute method for each of its direct children and subscribe to the Closed event for each child. The result is that the scheduler queue will look something like this (first item to be dequeued is on left):
{child1.Execute, child2.Execute, child3.Execute}

Calling child1.Execute might result in a DelayActivity's Execute being added to the queue:
{child2.Execute, child3.Execute, delay1.Execute}

Now consider if child2 contains a single CodeActivity and child3 is empty:
{delay1.Execute, code1.Execute, child3.OnClose}

Up to this point we have had purely interleaved execution. Draw it out and remember that in normal execution an activity will have Execute scheduled, then it will schedule any work it needs to do, then it will have OnClose scheduled, and then anyone listening to the Closed event will be scheduled. Knowing this you can walk through almost any chain of activities.

Back to our queue, we will next see the delay disappear because it has added a timer to the workflow, we will see the code activity execute, and the third child will process its close:
{code1.OnClose, parallel.OnChildClosed(child3)}

Without drawing it out, let's say that the delay was a long enough one to let child2 close as well before the timer is fired. The result will be that the ParallelActivity's Closed handler will determine that the parallel cannot yet close because it has an outstanding executing child and the scheduler will run out of items in the queue and mark the workflow as idle. The next exciting thing to happen is the timer will schedule a callback for the delay:
{delay1.OnTimer}

Again, this will cause child1.OnClose to be scheduled which will cause parallel.OnChildClosed(child1) to be scheduled which, finally, will result in parallel.OnClose being scheduled.

The important thing to notice is that with non-blocking activities we get predictable interleaved execution. But, as soon as we add a blocking activity, the delay in our case, we get execution that approaches real world workflow scenarios. Imagine that each branch has an event on which it is waiting ... the execution is no longer just interleaved, but whichever branch's event fires first gets executed first. Unless all of the branches receive their events simultaneously, we get parallel processing with a single thread of execution.
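If you would rather not draw the walkthrough out by hand, it can be replayed with a small simulation. This is a sketch in Python rather than WF code: the helper names (schedule, close_child, child_closed and so on) are hypothetical, and it skips the same intermediate steps that the walkthrough skips. The work item names match the ones used in the queues above.

```python
from collections import deque

queue = deque()   # scheduler queue of (name, callable)
trace = []        # order in which work items actually ran
timers = []       # callbacks parked outside the queue, waiting for an external event
open_children = {"child1", "child2", "child3"}


def schedule(name, work=lambda: None):
    queue.append((name, work))


def run_until_idle():
    # Single thread: dequeue one item, run it to completion, repeat.
    while queue:
        name, work = queue.popleft()
        trace.append(name)
        work()


def child_closed(child):
    # The parallel closes only after its last executing child has closed.
    open_children.discard(child)
    if not open_children:
        schedule("parallel.OnClose")


def close_child(child):
    # Closing a branch schedules its OnClose, which then notifies the parallel.
    schedule(f"{child}.OnClose",
             lambda: schedule(f"parallel.OnChildClosed({child})",
                              lambda: child_closed(child)))


def delay_execute():
    # A blocking activity: register a timer and leave nothing on the queue.
    timers.append(lambda: schedule("delay1.OnTimer", lambda: close_child("child1")))


def code_execute():
    schedule("code1.OnClose", lambda: close_child("child2"))


def parallel_execute():
    schedule("child1.Execute", lambda: schedule("delay1.Execute", delay_execute))
    schedule("child2.Execute", lambda: schedule("code1.Execute", code_execute))
    schedule("child3.Execute", lambda: close_child("child3"))  # empty branch


schedule("parallel.Execute", parallel_execute)
run_until_idle()
first_burst = list(trace)
print("idle after:", first_burst)

timers.pop()()    # the timer fires some time later
run_until_idle()
second_burst = trace[len(first_burst):]
print("second burst:", second_burst)
```

The first burst ends with the workflow idle and the parallel still open, because child1 is waiting on its timer. Only the second burst, triggered from outside the queue, lets the parallel close.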

Comments

  • Anonymous
    March 01, 2006
    Nate, that's a very good blog, mate!

    I was struggling to understand parallelism and your explanation has made this very clear. Keep up the good work!
    Raja

  • Anonymous
    March 27, 2006
    PingBack from http://www.marcmercuri.com/PermaLink.aspx?guid=8b26c59a-c39f-498b-af12-f73a16ce9b99

  • Anonymous
    April 24, 2006
    I am working on a document approval workflow. When a document is created, the owner chooses several departments that must approve it. Only after all of these approvers have approved can the document's state change to Approved. If any one of the approvers rejects it, the document's state changes to Rejected. If any one of the approvers delays the action, the document's state also changes to Rejected. How can I design this workflow using WWF?
    I have thought about this for a long time but have no answer. Can you help me?

  • Anonymous
    April 25, 2006
    Fanse, you have several options for design here, so let me enumerate a couple and discuss the relative pros and cons.

    First, let me clarify the scenario which these solutions will target.  This way the comment will still have value (hopefully) even if my details do not match completely with yours. Scenario: There is a single workflow instance which manages the approval of a document.  When the document is submitted, the workflow is started and the document is in the Submitted state.  An unknown number of people will be asked to review the document (based on roles, arbitrary rules, some database - this is not important).  A Rejected or Delayed answer from any one person will put the document in the Rejected state and end the workflow.  If all reviewers return Approved then the document will be in the Approved state and the workflow will end.  (These conditions which cause the workflow to end will be discussed later.)

    State Machine Solution: This solution models the workflow based on the states of the document.  As such, there will be four states - Submitted, Approved, Rejected, and Completed.  The fourth state is a trigger for the workflow to complete ... we will revisit this later.

    The Submitted state will have a state initialization handler which calculates the list of approvers and then notifies them that they must approve the document.  How do we notify them?  Well, maybe you want to model this as a SendEmailActivity or maybe it is a call to a local service through a CallExternalMethodActivity or possibly it is a custom activity which calls some custom runtime service to add entries to a database.  The important thing is this is an implementation detail which we don't need to delve into right now.

    Submitted: StateInitialization
     <Code Name="CreateApproversList" />
     <Replicator Name="ForEachApprover" InitialChildData=ActivityBind("createdApproversList")>
       <CreateTask Name="NotifyApprover"/>
     </Replicator>

    The Submitted state will also have an event called ApproverResponseReceived.  Once again, the nature of this event is not important - it might be a HandleExternalEventActivity or it might be a custom activity based on queues.  We'll assume that the data delivered by the event is the name of the approver and the response (Approved, Rejected, Delayed).  When this event is raised we will check the response and if it is Rejected or Delayed we will move the state machine to the Rejected state.  If it is Approved then we will remove the approver's name from our list.  If our list is now empty then we move to the Approved state; otherwise we stay in the current state.

    Submitted: ApproverResponseReceived
     <ReceiveResponse/>
     <IfElse>
       <Branch1 (Response==Rejected || Response==Delayed)>
         <SetState State="Rejected"/>
       </Branch1>
       <Branch2>
         <Code Name="RemoveApproverFromList"/>
         <IfElse>
           <Branch1 (createdApproversList.Count == 0)>
             <SetState State="Approved"/>
           </Branch1>
         </IfElse>
       </Branch2>
     </IfElse>

    The Approved and Rejected states both notify the host of the result (or set an appropriate outbound parameter) and then move to the Completed state which causes the workflow to exit.
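The transition rules for the Submitted state can be checked with a small model. This is a plain Python sketch, not WF code: the ApprovalStateMachine class, its on_response method, and the approver names are hypothetical, and the Completed state is left out because it only ends the workflow.

```python
from enum import Enum


class DocState(Enum):
    SUBMITTED = "Submitted"
    APPROVED = "Approved"
    REJECTED = "Rejected"


class ApprovalStateMachine:
    """Hypothetical model of the ApproverResponseReceived handler."""

    def __init__(self, approvers):
        self.pending = set(approvers)   # stands in for createdApproversList
        self.state = DocState.SUBMITTED

    def on_response(self, approver, response):
        if self.state is not DocState.SUBMITTED:
            return self.state           # already decided; ignore late responses
        if response in ("Rejected", "Delayed"):
            self.state = DocState.REJECTED
        elif response == "Approved":
            self.pending.discard(approver)
            if not self.pending:        # last approver has answered
                self.state = DocState.APPROVED
        return self.state


doc = ApprovalStateMachine(["legal", "finance"])
after_first = doc.on_response("legal", "Approved")
after_second = doc.on_response("finance", "Approved")
print(after_first.value, after_second.value)
```

The document stays in Submitted until the list of pending approvers is empty, and a single Rejected or Delayed response moves it straight to Rejected.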

    Benefits:

    This solution has the benefit of mapping directly to how you thought of your problem.  The workflow will be in a state corresponding to the document state.  Additionally, if you want to allow resubmission of the document from the Rejected state then it would be a simple task to add a new event handler to Rejected which transitions the workflow back to the Submitted state.

    Drawbacks:

    This solution does not, however, have nice handling of the approvers.  Imagine that we wanted to do something more complex with the approvers such as allow them to delegate the task, remove themselves from the list, or have specific flow based on their inputs.  It could all be modeled with event handlers in the state machine which all pass the approver's name as a parameter but this is not as nice as the Sequential Solution which we will show below.

    Sequential Solution:

    We'll start in the same place with a CodeActivity which generates the list of approvers.  This will then be fed to a Replicator which, this time, contains a more complex child which will run instances in parallel.

    The Replicator's child will be responsible for handling the entire interaction with one approver.  In our case this is simply to notify the approver of the waiting task and then wait for the approver's response.  We will make use of the ExternalDataExchangeService correlation features to map an inbound event to the correct HandleExternalEventActivity.  This is necessary now that we are running multiple instances of the same event in parallel.

    The skeleton workflow:

    <Sequence>
     <Code Name="CreateApproversList" />
     <Replicator ExecutionType=Parallel InitialChildData=ActivityBind("createdApproversList")>
       <Sequence>
         <CallExternalMethodActivity InterfaceType="IApproverTask" MethodName="CreateApproverTask" CorrelationToken="token1"/>
         <HandleExternalEventActivity InterfaceType="IApproverTask" EventName="TaskCompleted" CorrelationToken="token1"/>
       </Sequence>
     </Replicator>
    </Sequence>

    The IApproverTask interface might look something like:

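    using System;
    using System.Workflow.Activities;

    // Local service contract; events are correlated to activities by approverName.
    [ExternalDataExchange]
    [CorrelationParameter("approverName")]
    public interface IApproverTask
    {
     // Calling this method initializes the correlation token for one approver.
     [CorrelationInitializer]
     void CreateApproverTask(string approverName, TaskData data);
     // Maps the correlation parameter to a property on the event args.
     [CorrelationAlias("approverName", "e.ApproverName")]
     event EventHandler<TaskCompletedEventArgs> TaskCompleted;
    }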

    I'm going to assume that the ExternalDataExchangeService and its related attributes, activities, and correlation features are understood.  If this is not the case you can find more data on these in the help files and samples.

    Last, but not least, let's mention the Replicator's various events and its UntilCondition.  We'll specify a ChildInitialized handler which will be responsible for setting the ApproverName on the CallExternalMethodActivity.  We'll also specify a ChildCompleted handler which, in the case of a rejection or delay, will set the workflow's output variable to be Rejected.  Finally, the UntilCondition will be set to return replicatorInstance.AllChildrenComplete || result == Rejected.
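The short-circuit behavior of that UntilCondition can be sketched outside WF. The Python function below is a hypothetical model, not the Replicator API: run_replicator, its parameters, and the approver names are invented for illustration, and responses are simply fed in arrival order.

```python
def run_replicator(approvers, responses):
    """Model a parallel Replicator with one child per approver.

    responses is an iterable of (approver, answer) pairs in arrival order.
    Returns (result, outstanding); result is None while still waiting.
    """
    outstanding = set(approvers)
    result = None
    for approver, answer in responses:
        if approver not in outstanding:
            continue                      # no child is waiting on this event
        outstanding.discard(approver)     # plays the role of ChildCompleted
        if answer in ("Rejected", "Delayed"):
            result = "Rejected"
        # UntilCondition: all children complete, or the result is Rejected.
        if not outstanding or result == "Rejected":
            break
    if result is None and not outstanding:
        result = "Approved"
    return result, outstanding


outcome, waiting = run_replicator(
    ["ann", "bob", "cho"],
    [("ann", "Approved"), ("bob", "Delayed"), ("cho", "Approved")],
)
print(outcome, sorted(waiting))
```

In this run the second response ends the loop, so the third approver's answer is never examined. That is the short circuit the UntilCondition is meant to provide.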

    Benefits:

    If the interaction with an approver is made more complex it will be a simple matter to model the interaction in the replicator's child sequence.  Note that you can now think of the problem as the interaction with a single approver instead of as a generic interaction with many approvers.  From a flow point of view, the logic is much easier to follow: create an approvers list, for each approver notify them of the task and wait for completion, and short circuit the logic and complete if a Rejected/Delayed response is returned.

    Drawbacks:

    Unless you want to implement your own correlation your hand is forced in the choice of communications: ExternalDataExchangeService is the way to go here.  And while modeling the interaction is simpler, providing the "loop" will be a little less natural - we'd have a while loop which checked the result property for Approved.  Finally, while we are modeling flow in a more readable fashion we are no longer modeling the state of our document like we were with the State Machine.

    Conclusion:

    These are just two solutions and there are likely to be countless others.  These have also been somewhat simplified for the purposes of illustration - we've gone into detail of the workflows but haven't gone into the details of the system existing just outside the workflow.  Hopefully this will get you started with some ways to approach the problem.  If you have any questions, check the docs and samples first and then feel free to ask.

  • Anonymous
    April 25, 2006
    ntalbert, thanks for your very detailed answer. I had thought about this for a long time, but I never noticed the Replicator activity. I will now model this workflow as a state machine workflow. As for the sequential workflow, because a rejected document will be re-edited or resubmitted, I feel it is a little difficult to model using a sequential workflow. Thanks again from my heart.

  • Anonymous
    February 06, 2007
    Recently, following a consulting engagement on WF, I was able to verify some aspects in detail

  • Anonymous
    July 03, 2007
    The very first time I saw Windows Workflow Foundation (WF), I slapped together

  • Anonymous
    July 29, 2007
    PingBack from http://www.kcdholdings.com/blog/?p=75

  • Anonymous
    August 09, 2007
    Is it possible to create this bug-tracking workflow - Sequential workflow on the fly? My workflow designer is hosted on Web.

  • Anonymous
    September 12, 2007
    You might think that the ParallelActivity in Windows Workflow Foundation is misnamed. At the very least,
