18 January 2008

When documentation is a wee bit off

Humans are never perfect, and since humans write documentation, it isn't always 100% accurate.

Take the documentation on MSDN regarding Managed Thread Pools, in the "Maximum Number of Thread Pool Threads" section (emphasis mine):

The number of operations that can be queued to the thread pool is limited only by available memory; however, the thread pool limits the number of threads that can be active in the process simultaneously. By default, the limit is 25 worker threads per CPU and 1,000 I/O completion threads.

25 threads per CPU? Not bad, pretty good. Except it is a little bit off. In fact, it is missing a 0. It isn't 25 threads per CPU, it is 250 threads per CPU. A tenfold error.

This has reared its head a few times with Mail Gateway, the Community Server add-on. It uses the thread pool to process email messages as they come in and send them to the CS site. The problem shows up when there is a backlog of messages. If you have trouble connecting to the mail server, or a configuration issue, messages build up in the mailbox (a couple thousand, or for some clients 50k+). Once things get resolved, it tries to process the whole backlog at once, and hits the site so hard that messages fail and get retried, so it ends up working through the backlog very slowly.

When Mail Gateway was written, I had read about the "25 threads per CPU" and figured CS could surely handle a maxed out thread pool no problem... essentially 25 users on a site? That's not bad. And if they had a dual CPU machine, that's 50... maybe pushing it, but it should still work OK.

However, we first started using the managed thread pool about 2 years ago. Since then, multi-core machines have become the norm.

So now, look at the issue of hammering a site with Mail Gateway. 250 threads? All hammering on a site? Ohh yeah, I could see that one causing a problem. But what if the client has a multi-core machine? Say it is a dual quad-core CPU box... you're looking at 8 CPUs as far as the thread pool is concerned. You're talking 2,000 threads. Think 2,000 messages being submitted concurrently to a site would bring it to its knees? Yeah, I think so.
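To make the arithmetic concrete, here is a small Python sketch (not .NET code, and not part of Mail Gateway) that multiplies the per-CPU limit by the number of logical CPUs. The function name `max_worker_threads` is made up for illustration; the per-CPU figures are the documented 25 and the observed 250 from this post.

```python
def max_worker_threads(logical_cpus, per_cpu_limit):
    """Return the worker thread ceiling for a pool that scales per CPU.

    Hypothetical helper for illustration only: it just multiplies the
    per-CPU limit by the number of logical CPUs.
    """
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return logical_cpus * per_cpu_limit


if __name__ == "__main__":
    # Compare the documented limit (25) with the observed limit (250)
    # on a single CPU, a dual CPU, and a dual quad-core box.
    for cpus in (1, 2, 8):
        documented = max_worker_threads(cpus, 25)
        observed = max_worker_threads(cpus, 250)
        print(f"{cpus} CPU(s): documented {documented}, observed {observed}")
```

On the dual quad-core box that is 200 threads by the documentation versus 2,000 in practice.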

The 25 vs 250 thing is pretty big, especially when you take into account the "per CPU" bit.

So the solution? Override the number of threads available by calling ThreadPool.SetMaxThreads() when the application starts up. This also avoids the per-CPU scaling issue... we don't want to send more messages just because there is another CPU, we want a fixed number. .NET 1.1 (which we originally wrote the code against) didn't have a method to set the number of threads, but .NET 2.0 does. Expect an updated build of the Mail Gateway service soon with the change. It will default to only 15 threads, and the number will be configurable in the mailgateway.config file.
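The idea of a fixed cap that ignores the CPU count can be sketched in Python with `concurrent.futures.ThreadPoolExecutor`. This is only an analogue of the approach, not the Mail Gateway code and not the .NET API: `process_backlog` and `submit_to_site` are hypothetical names, and the sleep stands in for posting one message to the site.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 15  # fixed cap from the post; deliberately not scaled by CPU count


def process_backlog(messages, max_workers=MAX_WORKERS):
    """Process every message with at most max_workers running at once.

    Returns (processed_count, peak_concurrency) so the cap can be checked.
    """
    lock = threading.Lock()
    state = {"active": 0, "peak": 0, "done": 0}

    def submit_to_site(message):
        # Hypothetical stand-in for posting one email to the site.
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.001)  # simulate the network round trip
        with lock:
            state["active"] -= 1
            state["done"] += 1

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() drains the iterator so any worker exception surfaces here.
        list(pool.map(submit_to_site, messages))
    return state["done"], state["peak"]


if __name__ == "__main__":
    done, peak = process_backlog(range(200))
    print(f"processed {done} messages, peak concurrency {peak}")
```

However large the backlog is, the site never sees more than 15 submissions in flight, and the rest simply wait in the queue.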

Posted 4:13 pm | Software

Comments

Dave Donaldson on 1/19/2008 at 6:32 am

So how did you determine that it's 250 and not 25? I'm curious because that particular number is extremely important to the .NET Framework, so if it's wrong in the documentation, Microsoft needs to know about it. Thread pool counts are serious business.

Dave Donaldson on 1/19/2008 at 6:37 am

And actually, a default size of 250 seems pretty high, so I'd love to hear about how you determined that.

Ken on 1/19/2008 at 6:46 am

I wrote a little test console app that called ThreadPool.GetAvailableThreads, and it returned 250 worker threads and 1000 IO threads. This was on a single-CPU VM. Then I shut it down, switched the VM to 2 CPUs, fired it back up, and it returned 500. Then I took the app over to my dual core laptop, fired it up and got 500 as well.

VMware Fusion can only do 2 CPUs in a virtual instance, but I'm going to try running it on my server, which is a dual quad-core system.

It looks like the IO thread count is accurate, but the worker thread count is missing a zero.

Dave Donaldson on 1/19/2008 at 9:26 am

Interesting. Seems like a pretty big oversight on Microsoft's part to leave off that zero.

Ken on 1/19/2008 at 10:51 am

Yeah, think I'll email The Gu and see if he could report it to someone.

Thomas Freudenberg on 1/22/2008 at 12:08 am

The default number of threads *used* to be 25, but was increased to 250 in .NET 2.0 SP1. See www.bluebytesoftware.com/.../PermaLink,guid, for the details.

Thomas Freudenberg on 1/24/2008 at 12:44 am

Oops, the link is broken. Try this one: www.bluebytesoftware.com/.../CommentView,gui