Visual Studio 2010 and .NET Framework 4 Training Kit - May Preview
I would also like to talk about the content in this training kit, starting with concurrency.
There are various approaches to writing concurrent code. We will use the training kit to explore these various options.
Method 1 - Ordinary Threads | This is the old-school option: you have the ultimate power, but also the most work developing and debugging. There's a huge API to help you do almost anything. This is the lowest-level, most detailed approach. |
Method 2 - Using the ThreadPool | A little better than Method 1, because a pool of threads is already available and the creation and destruction of threads are managed for you. Less difficult but more limited than Method 1. |
Method 3 - Using Tasks (Task Parallel Library) | Tries to give you the best of both worlds. The API is very expressive, and you are helped by the new "concurrent scheduler," which is easier to use and more capable than working with the ThreadPool directly as in Method 2. Parent/child relationships can be set up between tasks. |
Method 4 - Parallel Static Class | Parallel.For runs the iterations of a for loop concurrently, which can make a loop with independent iterations finish more quickly. |
Once you install the kit, it will look like this. Click “Demos.htm” below.
Notice there are a number of apps here that we can run through to get a better understanding.
The idea behind this sample is to compare a PLINQ query to a regular non-parallel query:
You can see that the PLINQ query finished the work in roughly half the time.
Startup Code
The code for each of the two queries can be found below. Notice that line 2 declares the field for our sequential query, using the generic interface IEnumerable<PopName>. The parallel version on line 5 uses ParallelQuery<PopName> instead.
The code below uses PLINQ:
- Parallel LINQ (PLINQ) is a parallel implementation of LINQ to Objects.
- PLINQ implements the full set of LINQ standard query operators as extension methods for the ParallelQuery<TSource> type and has additional operators for parallel operations.
- PLINQ combines the simplicity and readability of LINQ syntax with the power of parallel programming.
- Just like code that targets the Task Parallel Library, PLINQ queries scale in the degree of concurrency based on the capabilities of the host computer.
Notice line 18. PLINQ queries use some special operators such as:
- AsParallel
- WithDegreeOfParallelism
Because AsParallel returns a ParallelQuery, the query operators that follow bind to the Parallel LINQ extension methods and execute in parallel.
1: // Sequential version of the query
2: private static IEnumerable<PopName> seqQuery;
3:
4: // Parallel version of the query
5: private static ParallelQuery<PopName> parQuery;
6:
7:
8:
9: private void InitializeQueries()
10: {
11:     seqQuery = from n in names
12:                where n.Name.Equals(queryInfo.Name, StringComparison.InvariantCultureIgnoreCase) &&
13:                      n.State == queryInfo.State &&
14:                      n.Year >= yearStart && n.Year <= yearEnd
15:                orderby n.Year ascending
16:                select n;
17:
18:     parQuery = from n in names.AsParallel().WithDegreeOfParallelism(ProcessorsToUse.Value)
19:                where n.Name.Equals(queryInfo.Name, StringComparison.InvariantCultureIgnoreCase) &&
20:                      n.State == queryInfo.State &&
21:                      n.Year >= yearStart && n.Year <= yearEnd
22:                orderby n.Year ascending
23:                select n;
24: }
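The same query shape can be tried without the sample's data file. What follows is only a minimal sketch: the `Record` class, the `Find` helper, and the in-memory data are hypothetical stand-ins for `PopName`, `InitializeQueries`, and `names`, and the degree of parallelism is hard-coded to 2 rather than read from `ProcessorsToUse`. Only `AsParallel`, `WithDegreeOfParallelism`, and the query operators come from PLINQ itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the sample's PopName type.
public class Record
{
    public string Name;
    public string State;
    public int Year;
}

public static class DegreeOfParallelismSketch
{
    // Filters and sorts the records the same way the sample's parQuery does.
    public static List<Record> Find(List<Record> records, string name, string state,
                                    int yearStart, int yearEnd, int processorsToUse)
    {
        ParallelQuery<Record> query =
            from r in records.AsParallel().WithDegreeOfParallelism(processorsToUse)
            where r.Name.Equals(name, StringComparison.InvariantCultureIgnoreCase) &&
                  r.State == state &&
                  r.Year >= yearStart && r.Year <= yearEnd
            orderby r.Year ascending
            select r;

        // The query is deferred; ToList() is what actually runs it.
        return query.ToList();
    }

    public static void Main()
    {
        // Made-up data, deliberately in descending year order so the sort has work to do.
        var records = new List<Record>();
        for (int year = 2005; year >= 1960; year--)
        {
            records.Add(new Record { Name = "Mary", State = "AK", Year = year });
            records.Add(new Record { Name = "David", State = "AK", Year = year });
        }

        // Ask PLINQ to use at most two concurrent workers for this query.
        List<Record> hits = Find(records, "mary", "AK", 1970, 1975, 2);

        foreach (Record r in hits)
            Console.WriteLine("{0} {1} {2}", r.Name, r.State, r.Year);
    }
}
```

Because the query ends with an orderby, the results come back sorted by year even though the filtering ran on several threads.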
This is the declaration for “names,” which will hold thousands of names to be used to test the performance of various queries.
private List<PopName> names = new List<PopName>();
Here's the code that actually populates the “names” list with data. This is the list that will be used to test the performance of our various queries.
Console.Write("Loading XML names...");
XDocument doc = XDocument.Load("popnames.xml");
XElement root = doc.Root;
foreach (XElement child in root.Elements())
{
    PopName name = new PopName();
    name.Name = child.Attribute("Name").Value;
    name.Gender = (NameGender)Enum.Parse(typeof(NameGender), child.Attribute("Gender").Value);
    name.State = child.Attribute("State").Value;
    name.Year = int.Parse(child.Attribute("Year").Value);
    name.Rank = int.Parse(child.Attribute("Rank").Value);
    name.Count = int.Parse(child.Attribute("Count").Value);
    names.Add(name);
    if (names.Count == count) break;
}
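The loading loop depends on types and a data file that the post does not show. As a hedged sketch only, the version below assumes plausible shapes for `PopName` and `NameGender` (both are guesses based on the attributes being read) and parses a small inline XML string with `XDocument.Parse` instead of loading popnames.xml. The `Load` helper and its `limit` parameter are hypothetical names playing the role of the loop and its `count` variable.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Assumed shapes: the post never shows these two types.
public enum NameGender { Male, Female }

public class PopName
{
    public string Name { get; set; }
    public NameGender Gender { get; set; }
    public string State { get; set; }
    public int Year { get; set; }
    public int Rank { get; set; }
    public int Count { get; set; }
}

public static class LoadNamesSketch
{
    // Turns each child element of the root into a PopName,
    // stopping once 'limit' records have been read.
    public static List<PopName> Load(string xml, int limit)
    {
        var names = new List<PopName>();
        XElement root = XDocument.Parse(xml).Root;
        foreach (XElement child in root.Elements())
        {
            names.Add(new PopName
            {
                Name = child.Attribute("Name").Value,
                Gender = (NameGender)Enum.Parse(typeof(NameGender), child.Attribute("Gender").Value),
                State = child.Attribute("State").Value,
                Year = int.Parse(child.Attribute("Year").Value),
                Rank = int.Parse(child.Attribute("Rank").Value),
                Count = int.Parse(child.Attribute("Count").Value)
            });
            if (names.Count == limit) break;
        }
        return names;
    }

    public static void Main()
    {
        // Two records in the shape the loop expects; the closing tag is
        // added here only so that the fragment is well-formed.
        string xml =
            "<PopularNames From=\"1960\" To=\"2005\">" +
            "<Name State=\"AK\" Year=\"1960\" Rank=\"1\" Gender=\"Male\" Name=\"David\" Count=\"152\" />" +
            "<Name State=\"AK\" Year=\"1960\" Rank=\"1\" Gender=\"Female\" Name=\"Mary\" Count=\"78\" />" +
            "</PopularNames>";

        foreach (PopName n in Load(xml, 1000))
            Console.WriteLine("{0} ({1}), {2} {3}: rank {4}, count {5}",
                n.Name, n.Gender, n.State, n.Year, n.Rank, n.Count);
    }
}
```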
Basically, the data we are searching looks like this:
<PopularNames From="1960" To="2005">
<Name State="AK" Year="1960" Rank="1" Gender="Male" Name="David" Count="152" />
<Name State="AK" Year="1960" Rank="1" Gender="Female" Name="Mary" Count="78" />
<Name State="AK" Year="1960" Rank="2" Gender="Male" Name="Michael" Count="139" />
<Name State="AK" Year="1960" Rank="2" Gender="Female" Name="Linda" Count="56" />
<Name State="AK" Year="1960" Rank="3" Gender="Male" Name="Robert" Count="135" />
<Name State="AK" Year="1960" Rank="3" Gender="Female" Name="Karen" Count="53" />
<Name State="AK" Year="1960" Rank="4" Gender="Male" Name="John" Count="126" />
<Name State="AK" Year="1960" Rank="4" Gender="Female" Name="Debra" Count="50" />
<Name State="AK" Year="1960" Rank="5" Gender="Male" Name="James" Count="123" />
<Name State="AK" Year="1960" Rank="5" Gender="Female" Name="Susan" Count="50" />
<Name State="AK" Year="1960" Rank="6" Gender="Male" Name="Mark" Count="91" />
<Name State="AK" Year="1960" Rank="6" Gender="Female" Name="Elizabeth" Count="47" />
As you can see, the raw records are not organized by name, so answering a query means filtering and sorting them.
Incidentally, you will need the following “using” statements:
using System.Linq;
using System.Xml.Linq;
This concludes the review of the code in “BabyNames.”
More performance issues with LINQ can be found here:
This is a nice, straightforward example that compares two LINQ queries. One uses the parallel syntax and the other is sequential. As with the other performance numbers, you will notice that the query that uses parallelism can execute in about half the time.
Notice the parallelized version runs in almost half the time. This little table compares the two approaches.
The NonParallel Method
static void NonParallelMethod()
{
    for (int i = 0; i < 16; i++)
    {
        Console.WriteLine("TID={0}, i={1}",
            Thread.CurrentThread.ManagedThreadId,
            i);

        SimulateProcessing();
    }
}
The Parallel Method
static void ParallelMethod()
{
    Parallel.For(0, 16, i =>
    {
        Console.WriteLine("TID={0}, i={1}",
            Thread.CurrentThread.ManagedThreadId,
            i);

        SimulateProcessing();
    });
}
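To reproduce the comparison yourself, the two loops can be wrapped in a Stopwatch. This is a rough sketch rather than the kit's own harness: `SimulateProcessing` is not shown in the post, so its body here (a short spin) is an assumption, and `TimeSequential` and `TimeParallel` are hypothetical helper names.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class LoopTimingSketch
{
    // Assumption: the sample's SimulateProcessing is not shown in the post,
    // so a short spin stands in for CPU-bound work.
    static void SimulateProcessing()
    {
        Thread.SpinWait(1000000);
    }

    // Runs the work items one after another and returns the elapsed milliseconds.
    public static long TimeSequential(int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            SimulateProcessing();
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Runs the same work items through Parallel.For and returns the elapsed milliseconds.
    public static long TimeParallel(int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        Parallel.For(0, iterations, i => SimulateProcessing());
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        Console.WriteLine("Sequential: {0} ms", TimeSequential(16));
        Console.WriteLine("Parallel:   {0} ms", TimeParallel(16));
    }
}
```

How much faster the parallel loop runs depends on the number of cores available; on a single-core machine it should not be expected to win.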
A simpler example
You can test this code yourself very easily. Run it the first time as is, out of the box; the query runs concurrently by default. Then remove the call to “AsParallel” and watch the query run sequentially and take roughly twice as long as the concurrent version of the same query.
This example below runs a PLINQ query. Notice that line 5 contains the AsParallel construct.
1: static void Main(string[] args)
2: {
3:     IEnumerable<int> numbers = Enumerable.Range(1, 1000);
4:
5:     ParallelQuery<int> results = from n in numbers.AsParallel()
6:                                  where IsDivisibleByFive(n)
7:                                  select n;
8:
9:     Stopwatch sw = Stopwatch.StartNew();
10:     IList<int> resultsList = results.ToList();
11:     Console.WriteLine("{0} items", resultsList.Count());
12:     sw.Stop();
13:
14:     Console.WriteLine("It Took {0} ms", sw.ElapsedMilliseconds);
15:
16:     Console.WriteLine("\nFinished...");
17:     Console.ReadKey(true);
18: }
19:
20: static bool IsDivisibleByFive(int i)
21: {
22:     Thread.SpinWait(2000000);
23:
24:     return i % 5 == 0;
25: }
Here are the performance results of that query.
The non-parallel version takes almost twice as long to execute. Notice that line 1 does not have the AsParallel() construct and therefore runs on a single thread.
1: IEnumerable<int> results = from n in numbers
2:                            where IsDivisibleByFive(n)
3:                            select n;
4:
Here are the performance results for it.
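The two variants can also be run back to back in one program, which makes it easy to confirm that they return the same items. The sketch below is an illustration under a few assumptions: the helper names `RunSequential` and `RunParallel` are hypothetical, the spin count and range are smaller than in the sample so it finishes quickly, and `AsOrdered()` is added (it is not in the sample) so that the parallel results keep the source order and can be compared element by element.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;

public static class SideBySideSketch
{
    static bool IsDivisibleByFive(int i)
    {
        Thread.SpinWait(100000);   // smaller spin than the sample's 2000000
        return i % 5 == 0;
    }

    // Plain LINQ to Objects: runs on the calling thread.
    public static List<int> RunSequential(IEnumerable<int> numbers)
    {
        return (from n in numbers
                where IsDivisibleByFive(n)
                select n).ToList();
    }

    // PLINQ: AsOrdered() asks for results in source order so the two lists match.
    public static List<int> RunParallel(IEnumerable<int> numbers)
    {
        return (from n in numbers.AsParallel().AsOrdered()
                where IsDivisibleByFive(n)
                select n).ToList();
    }

    public static void Main()
    {
        IEnumerable<int> numbers = Enumerable.Range(1, 200);

        Stopwatch sw = Stopwatch.StartNew();
        List<int> sequential = RunSequential(numbers);
        Console.WriteLine("Sequential: {0} items in {1} ms", sequential.Count, sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        List<int> parallel = RunParallel(numbers);
        Console.WriteLine("Parallel:   {0} items in {1} ms", parallel.Count, sw.ElapsedMilliseconds);

        Console.WriteLine("Same results: {0}", sequential.SequenceEqual(parallel));
    }
}
```

Without `AsOrdered()`, PLINQ is free to return the matching items in a different order, so only the counts would be guaranteed to agree.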
This next code sample compares traditional threads to the task-based approach that will be part of Visual Studio 2010 and .NET Framework 4.
Actually, there are three types of threading examples in this project:
Traditional threads
The code below represents the traditional approach to thread programming, which is both powerful and complex. There's more you can do with this rich API, but it comes at the cost of complexity and difficulty in debugging. This is the code you would use if you wished to create your own abstraction layer for multithreaded programming.
static void RunThreads()
{
    Stopwatch sw = Stopwatch.StartNew();

    Console.WriteLine("Running Threads...");
    Thread t1 = new Thread(DoSomeWork); t1.Start(3);
    Thread t2 = new Thread(DoSomeWork); t2.Start(6);
    Thread t3 = new Thread(DoSomeWork); t3.Start(9);
    Thread t4 = new Thread(DoSomeWork); t4.Start(12);
    Thread t5 = new Thread(DoSomeWork); t5.Start(15);
    Thread t6 = new Thread(DoSomeWork); t6.Start(18);
    Thread t7 = new Thread(DoSomeWork); t7.Start(21);
    Thread t8 = new Thread(DoSomeWork); t8.Start(24);

    t1.Join();
    t2.Join();
    t3.Join();
    t4.Join();
    t5.Join();
    t6.Join();
    t7.Join();
    t8.Join();

    sw.Stop();
    Console.WriteLine("It Took {0} ms", sw.ElapsedMilliseconds);
}
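The post does not show `DoSomeWork`. Since it is passed to `new Thread(...)` and started with an argument, it has to accept a single `object` parameter and return void. Here is a hedged sketch of what it might look like, together with a loop-based variant of the thread listing. The body of `DoSomeWork` (spinning in proportion to the argument and bumping a counter) is purely an assumption for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class ThreadArraySketch
{
    static int completed;

    // Assumed body: treat the argument as an amount of work and burn CPU accordingly.
    // Only the signature (object in, void out) is implied by the post's listings.
    static void DoSomeWork(object state)
    {
        int units = (int)state;
        Thread.SpinWait(units * 100000);
        Interlocked.Increment(ref completed);
    }

    // Starts eight threads with arguments 3, 6, ..., 24, waits for all of them,
    // and returns how many work items finished.
    public static int RunThreads()
    {
        completed = 0;
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(DoSomeWork);
            threads[i].Start((i + 1) * 3);
        }

        foreach (Thread t in threads)
        {
            t.Join();
        }
        return completed;
    }

    public static void Main()
    {
        Stopwatch sw = Stopwatch.StartNew();
        int finished = RunThreads();
        sw.Stop();
        Console.WriteLine("{0} work items finished in {1} ms", finished, sw.ElapsedMilliseconds);
    }
}
```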
Thread pool threads
Using a thread pool simplifies the writing of concurrent applications. The creation and destruction of the threads is done for the developer by the thread pool manager. This approach is appropriate for many scenarios, but lacks the flexibility of traditional threads, as seen above.
static void RunPool()
{
    // No *EASY* way to measure
    Console.WriteLine("Running Pool...");

    ThreadPool.QueueUserWorkItem(DoSomeWork, 3);
    ThreadPool.QueueUserWorkItem(DoSomeWork, 6);
    ThreadPool.QueueUserWorkItem(DoSomeWork, 9);
    ThreadPool.QueueUserWorkItem(DoSomeWork, 12);
    ThreadPool.QueueUserWorkItem(DoSomeWork, 15);
    ThreadPool.QueueUserWorkItem(DoSomeWork, 18);
    ThreadPool.QueueUserWorkItem(DoSomeWork, 21);
    ThreadPool.QueueUserWorkItem(DoSomeWork, 24);
}
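The "no easy way to measure" comment reflects that `QueueUserWorkItem` hands back nothing to wait on. One possible workaround, sketched here and not taken from the training kit, is to have each work item signal a `CountdownEvent` (a type in System.Threading as of .NET Framework 4) and wait on that before stopping the clock. `RunPoolTimed`, `Completed`, and the body of `DoSomeWork` are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class PoolTimingSketch
{
    static int completed;

    // Number of work items that finished in the most recent run.
    public static int Completed { get { return completed; } }

    // Assumed body; the post does not show DoSomeWork.
    static void DoSomeWork(object state)
    {
        Thread.SpinWait((int)state * 100000);
        Interlocked.Increment(ref completed);
    }

    // Queues one pool work item per workload, waits until every item has
    // signaled the countdown, and returns the elapsed milliseconds.
    public static long RunPoolTimed(int[] workloads)
    {
        completed = 0;
        Stopwatch sw = Stopwatch.StartNew();

        using (CountdownEvent done = new CountdownEvent(workloads.Length))
        {
            foreach (int load in workloads)
            {
                ThreadPool.QueueUserWorkItem(state =>
                {
                    try { DoSomeWork(state); }
                    finally { done.Signal(); }   // signal even if the work throws
                }, load);
            }

            done.Wait();   // blocks until the count reaches zero
        }

        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        Console.WriteLine("Running Pool...");
        long ms = RunPoolTimed(new[] { 3, 6, 9, 12, 15, 18, 21, 24 });
        Console.WriteLine("It Took {0} ms", ms);
    }
}
```

The extra plumbing is exactly what the task-based version avoids, because each task is an object you can wait on directly.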
Task based concurrency
The new task-based approach to threading represents the best of both worlds. It gives you the power and flexibility of traditional threads, along with the simplicity of not having to create, manage, and destroy threads yourself. There is a fairly robust API that allows you to do such things as establishing a parent/child relationship between two tasks. This approach also leverages the new concurrent scheduler that will be part of .NET Framework 4.
static void RunTasks()
{
    Stopwatch sw = Stopwatch.StartNew();

    Console.WriteLine("Running Tasks...");
    Task t1 = Task.Factory.StartNew(DoSomeWork, 3);
    Task t2 = Task.Factory.StartNew(DoSomeWork, 6);
    Task t3 = Task.Factory.StartNew(DoSomeWork, 9);
    Task t4 = Task.Factory.StartNew(DoSomeWork, 12);
    Task t5 = Task.Factory.StartNew(DoSomeWork, 15);
    Task t6 = Task.Factory.StartNew(DoSomeWork, 18);
    Task t7 = Task.Factory.StartNew(DoSomeWork, 21);
    Task t8 = Task.Factory.StartNew(DoSomeWork, 24);

    t1.Wait();
    t2.Wait();
    t3.Wait();
    t4.Wait();
    t5.Wait();
    t6.Wait();
    t7.Wait();
    t8.Wait();

    sw.Stop();
    Console.WriteLine("It Took {0} ms", sw.ElapsedMilliseconds);
}
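The parent/child relationship mentioned above can be sketched as follows. This is an illustration written against the API as it shipped in .NET Framework 4 (`TaskCreationOptions.AttachedToParent`); the preview build may spell this differently, and `RunParentWithChildren` is a hypothetical name. The point it tries to show is that waiting on the parent also waits for every attached child, so there is no need to track each child separately.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParentChildSketch
{
    // Starts a parent task that spawns 'childCount' attached children,
    // waits on the parent only, and returns how many children finished.
    public static int RunParentWithChildren(int childCount)
    {
        int finished = 0;

        Task parent = Task.Factory.StartNew(() =>
        {
            for (int i = 0; i < childCount; i++)
            {
                // AttachedToParent ties the child's completion to the parent's.
                Task.Factory.StartNew(() =>
                {
                    Thread.SpinWait(1000000);   // stand-in for real work
                    Interlocked.Increment(ref finished);
                }, TaskCreationOptions.AttachedToParent);
            }
        });

        // Returns only after the parent body and all attached children complete.
        parent.Wait();
        return finished;
    }

    public static void Main()
    {
        Console.WriteLine("Children finished: {0}", RunParentWithChildren(8));
    }
}
```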
Thanks for reading. I’d love to hear from you. Bruno – bterkaly@microsoft.com