I recently read a great blog entry by Scott Hanselman discussing the parallel dilemma that I’m sure we’ll see folks face in the future with the (old/new) Parallel classes. I wanted to add a few things to this discussion, since he focused on the mechanics of the parallel requests but not necessarily the potential effects they could have on the macro view of your application. This was originally written as an e-mail I sent to my team, but I thought others might find it interesting.
There will be an inclination for people to use the new Parallel functionality in .NET 4.0 to easily spawn operations onto numerous background threads. That will generally be okay for console/WinForms/WPF apps, but it could be potentially bad for ASP.NET apps, because the spawned threads can take away from the processing power and threads available to process new web page requests. I’ll explain more on that later.
For example, by default, when you do something like Parallel.ForEach(…) or some such, the parallel library starts firing Tasks at the thread pool so that it can best utilize the processing power available on your machine (an oversimplification, but you get the idea). The downside is that the thread pool contains a finite number of worker threads available to a process. Granted, you get about 100 worker threads per logical processor in .NET 4, but it’s worth noting.
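If you’re curious what that ceiling looks like on your own machine, the thread pool will tell you directly. Here’s a minimal console sketch of my own (not from Scott’s post) using ThreadPool.GetMaxThreads:

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        int maxWorker, maxIo;

        // Returns the process-wide ceilings for worker threads and
        // I/O completion threads in the managed thread pool.
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);

        Console.WriteLine("Max worker threads: " + maxWorker);
        Console.WriteLine("Max I/O completion threads: " + maxIo);
    }
}
```

The exact numbers you see will vary by runtime version, host (ASP.NET configures its own limits via machine.config), and processor count.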
While Scott’s entry talks about the new way to implement the Async pattern, I’ve already seen a bunch of folks use the “Parallel” class because it abstracts away some of the plumbing of the Async operations, and that ease of use could become problematic.
For example, consider this code:
string[] myStrings = { "hello", "world", "you", "crazy", "pfes", "out", "there" };

Parallel.ForEach(myStrings, myString =>
{
    System.Console.WriteLine(DateTime.Now + ":" + myString +
                             " - From Thread #" +
                             Thread.CurrentThread.ManagedThreadId);
    Thread.Sleep(new Random().Next(1000, 5000));
});
This is a very simple implementation of parallelizing a foreach that just writes some string output with an artificial delay. The output would be something like:
11/16/2010 2:40:05 AM:hello - From Thread #10
11/16/2010 2:40:05 AM:crazy - From Thread #11
11/16/2010 2:40:05 AM:there - From Thread #12
11/16/2010 2:40:06 AM:world - From Thread #13
11/16/2010 2:40:06 AM:pfes - From Thread #14
11/16/2010 2:40:06 AM:you - From Thread #12
11/16/2010 2:40:07 AM:out - From Thread #11
Note the multiple thread IDs, and extrapolate that out to a server with more than just my paltry 2 CPUs. This can be problematic for ASP.NET applications because you have a finite number of worker threads available in your worker process, and they must be shared across not just one user but hundreds (or even thousands). So spawning an operation across tons of threads can potentially reduce the scalability of your site.
Fortunately, there is a ParallelOptions class where you can set the degree of parallelism. The updated code is as follows:
string[] myStrings = { "hello", "world", "you", "crazy", "pfes", "out", "there" };

ParallelOptions options = new ParallelOptions();
options.MaxDegreeOfParallelism = 1;

Parallel.ForEach(myStrings, options, myString =>
{
    // Nothing changes here
    ...
});
This would then output something like:
11/16/2010 2:40:11 AM:hello - From Thread #10
11/16/2010 2:40:12 AM:world - From Thread #10
11/16/2010 2:40:16 AM:you - From Thread #10
11/16/2010 2:40:20 AM:crazy - From Thread #10
11/16/2010 2:40:23 AM:pfes - From Thread #10
11/16/2010 2:40:26 AM:out - From Thread #10
11/16/2010 2:40:29 AM:there - From Thread #10
Since I set MaxDegreeOfParallelism to 1, we see that it just uses the same thread over and over. Within reason, that setting *should* correspond to the maximum number of threads used to handle the work.
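If single-threaded feels too conservative and unbounded feels too greedy, one middle ground (my own sketch, not part of the test below) is to cap the loop at the machine’s logical processor count via Environment.ProcessorCount:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        string[] myStrings = { "hello", "world", "you", "crazy", "pfes", "out", "there" };

        ParallelOptions options = new ParallelOptions();

        // Cap the loop at one task per logical processor rather than
        // letting the scheduler fan out across the whole thread pool.
        options.MaxDegreeOfParallelism = Environment.ProcessorCount;

        Parallel.ForEach(myStrings, options, myString =>
        {
            Console.WriteLine(myString + " - From Thread #" +
                              Thread.CurrentThread.ManagedThreadId);
        });
    }
}
```

Whether the processor count is the right cap for your workload is something you’d want to measure; the point is just that the knob exists and defaults matter.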
Applying this to a website
So, let’s apply the code from above to a simple website and compare the full parallel implementation against the non-parallel implementation. The test ran for 10 minutes with a consistent load of 20 users against a dual-core machine running IIS 7.
In all of the images below, the blue line (or baseline) represents the single-threaded implementation and the purple line (or compared) represents the parallel implementation.
We’ll start with the request execution time. As we’d expect, the time to complete the request decreases significantly with the parallel implementation.
But what is the cost from a thread perspective? For that, we’ll look at the number of physical threads:
As we’d also expect, there is a significant increase in the number of threads used in the process. We go from ~20 threads to a peak of almost 200 threads during the test. Since this was run on a dual-core machine, we have a maximum of 200 worker threads available in the thread pool. Once those threads are depleted, you often see requests start getting queued, waiting for a thread to become available. So, what happened in our simple test? We’ll look at the Requests Queued value for that:
We did, in fact, start to see a small number of requests become queued throughout the test. This indicates that some requests began to pile up while waiting for a thread to become available.
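If you’d rather spot this condition from inside your own code than from the performance counters, the pool’s headroom is cheap to read at runtime. A hypothetical monitoring snippet (my assumption about how you might log it; it was not part of the test above):

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        int maxWorker, maxIo, availWorker, availIo;

        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);
        ThreadPool.GetAvailableThreads(out availWorker, out availIo);

        // Threads currently checked out of the pool; as this approaches
        // maxWorker, new requests will start to queue.
        int busyWorker = maxWorker - availWorker;

        Console.WriteLine("Worker threads in use: " + busyWorker + " of " + maxWorker);
    }
}
```

In a real site you’d log this periodically or expose it on a health page rather than writing to the console.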
Please note that I’m NOT saying you should not use Parallel operations in your website. You saw in the first image that request execution time decreased significantly from the non-parallel implementation to the parallel one. But nothing is free: while parallelizing your work can and will improve the performance of a single request, that gain should be weighed against the potential performance of your site overall.
Until next time.