Web Server Threading Models

July 18

Being very performance-minded, I'd like to take a tangent into server performance. Server performance is tightly connected to the choice of threading model. In particular I'm referring to the threading model behind dynamic web pages: PHP, ASP, JSP, Rails, etc. About the worst thing you can do for performance is to have a thread sit idle waiting. Just throwing more threads at the problem does not resolve it, though it can help. But preventing any waits at all is by far the superior solution.

At a high level there are three threading-model choices in server design:

1) A single thread reads the request, calls the script that generates the page, and waits for the script to complete, after which any final writes are performed. The thread stays with the request from the first byte read to the last byte written. There are ways to make this model even less efficient, but let's leave it at that.

This approach can leave the thread waiting a lot: waiting for request buffers to arrive when reading, and waiting for ACKs when writing response buffers.

There is also an inherent issue in that all the data about the request must be collected up front before the script is called. This means waiting for all the request buffers to be delivered and, in the case of large requests (file uploads), caching the request on disk.

There are two problems with that:

a) memory (RAM or disk) is filled up by request data, and often by response data too

b) data that lands on disk may be vulnerable, since it will typically not be encrypted
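To make the single-thread model concrete, here is a minimal sketch in JavaScript. The names (handleRequestBlocking, the chunk list, the uppercase "script") are all illustrative, not any real server's API; JavaScript cannot truly block, so a plain synchronous loop stands in for the blocking reads. One worker walks through reads, script, and writes in strict sequence, and nothing else can run on it in the meantime:

```javascript
// Sketch of model 1: one worker stays bound to the request from the
// first byte read to the last byte written. All names are illustrative.
const log = [];

function handleRequestBlocking(chunks) {
  // 1. "Block" until every request buffer has arrived
  //    (each loop iteration stands in for a blocking read).
  const body = [];
  for (const chunk of chunks) {
    log.push('read');
    body.push(chunk);
  }
  // 2. Call the script and wait for the complete response.
  log.push('script');
  const response = body.join('').toUpperCase(); // stand-in for page generation
  // 3. Block again while writing the response (waiting on ACKs).
  log.push('write');
  return response;
}

const out = handleRequestBlocking(['hel', 'lo']);
// log is now ['read', 'read', 'script', 'write']: one thread, no gaps.
```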

2) Another approach, called async I/O, keeps a pool of threads that is used to read requests one buffer at a time. A thread is active only long enough to process that one buffer and then goes back into the pool. Once the request is completely assembled, a script is called and the server waits for the script to complete.

This approach gets past the waiting on reads and writes (when response buffers are used) but still has the memory issues, the security issue, and the wait on the script. Another way to state that: the script *must* return a complete response, and the server dedicates at least one thread to the script for the entire duration of response generation.

An exception is when response buffering is not used; however, in that case the async I/O benefits are lost for the output.

Note that the server's thread timeline now has holes in it, a good thing because it frees resources.

3) Another approach, called full async, uses a pool of async I/O threads for reading from and writing to the client. However, it also allows the script generating the response to be fully async.

Note that another hole appears during the run of the script, because the script has itself performed an async operation. This capability is very important now that we use web services so often; it makes no sense to block a thread while calling a web service.
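A minimal sketch of that hole, using an async JavaScript function (fakeWebService is an illustrative stand-in for a remote call, not a real API): the worker is released at the await and can do other work before the script resumes:

```javascript
// Sketch of model 3: the script itself is async, so the worker is free
// while a downstream call is in flight. fakeWebService is illustrative.
function fakeWebService(query) {
  return Promise.resolve(`result(${query})`); // stands in for a remote call
}

const trace = [];

async function fullAsyncScript(request) {
  trace.push('script start');
  const data = await fakeWebService(request); // worker released here
  trace.push('script resume');
  return `page with ${data}`;
}

const pending = fullAsyncScript('q');
trace.push('worker free'); // this line runs before the script resumes
```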

The most evolved examples of full async are perhaps better named streaming async. Streaming async gets around the problem of caching files for upload or download while still retaining the async I/O model.

I myself have developed a couple of streaming (full) async servers at work; unfortunately they are not open source. At some point I'll discuss streaming async in more detail.

The full async server concept is gaining ground. One example is the newcomer Node.js, which does not send a response until you call 'end' on the response object, meaning the script's thread can exit without the response being sent. This is not at all like PHP, ASP, or other standard web servers.

Another example is the upcoming Ruby on Rails version 3; check out the async_sinatra and async_rails projects. They go so far as to make even calls to the database async, a great idea! Keep in mind my articles about Nginx when checking out Thin.

Yet another example is the Kayak HTTP server's responder interface. Kayak is a .NET-based web server.

In the world of Java, the soon-to-be-released Jetty 7 has extended the Continuations API (previously used mainly for Comet) to all web scripts, so Servlet 3.0 servlets can be fully async as well.

From these examples we can see that only in the last year has the full async model become trendy, and for good reason: it is very important for the scalability of a web server.

About Andy White

Author of 37 articles on this blog.

For nearly twenty years I have enjoyed and studied post-relational (aka NoSQL) databases. I also study application framework architectures and do a fair bit of web development. For more information, view my profile on LinkedIn.


Posted by Andy White on 2010-Jul-07 in Beginner, Elastic Architecture
