Asynchronous Internet Programming

Programming an Internet application requires handling several kinds of asynchronous events, such as network events, file I/O, timers and user interactions. In this article I walk through some existing design patterns for asynchronous programming.

Most programmers think synchronously, i.e., in a single control flow. For example, in a web server implementation, a socket is opened for listening, and when an incoming connection arrives the server accepts the connection, receives the request, and responds to the client. Several steps in this flow can block, e.g., waiting for an incoming connection or an incoming request. If a resource such as a thread or process blocks, it makes the overall system design inefficient. For example, if one thread is blocked serving a request from one client, another thread needs to be created to serve a request from another client. A thread-per-request design causes excessive thread creation and deletion as the request rate increases.

Event handlers:
The first approach is to break your program into smaller chunks such that each chunk is non-blocking, and then use an event queue to schedule these chunks. Early Windows programming (the WndProc message loop) falls under this category. The main loop just listens for events on an event queue. When a user input such as a mouse movement, click or key press is received, an event is dispatched to the event queue. The event handlers installed in the program handle the events, and may post additional events to the queue. External events such as socket input can also be linked to this event queue. One problem with this design is that several event handlers need to share state (hence global variables), which results in complex software.

while not queue.empty():
    msg = queue.get()   # dequeue the next event
    dispatch(msg)       # calls the event handler for msg
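The loop above can be fleshed out into a small runnable Python sketch; the handler registry, event names and the log list below are hypothetical, just to make the dispatch mechanics concrete:

```python
import queue

handlers = {}   # event name -> handler function (hypothetical registry)
log = []        # records what each handler did, for illustration

def handle_click(msg):
    log.append(('click', msg['pos']))

def handle_quit(msg):
    log.append(('quit', None))

handlers['click'] = handle_click
handlers['quit'] = handle_quit

def dispatch(msg):
    handlers[msg['type']](msg)   # call the event handler for msg

events = queue.Queue()
events.put({'type': 'click', 'pos': (10, 20)})
events.put({'type': 'quit'})

while not events.empty():
    dispatch(events.get())       # remove the next message and handle it
```

Note how the two handlers communicate only through the shared log list, illustrating the shared-state problem mentioned above.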

Some event-driven programming languages such as Flash ActionScript enforce this event handler design practice, coupled with object-oriented design. Typically, in ActionScript, each object has its own event queue and can dispatch events, instead of exposing a global event queue to the programmer. An object can listen for events from another object, e.g., a controller can listen for the click event on a button and handle it appropriately. Thus, there can be some data hiding, information separation, and better software design using event handlers. In other languages such as C++, libraries such as Poco facilitate event-driven object-oriented programming. Nevertheless, one problem with this design is that logically related source code needs to be split across different functions and methods, and sometimes different classes. For example, a timer event handler is defined as a separate function from the timer initialization code.

button.addEventListener('click', clickHandler);
...
function clickHandler(event:Event):void {
    ...
}
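The per-object dispatch that ActionScript provides built-in can be mimicked in Python; the EventDispatcher class below is a hypothetical sketch of that idea, not a real library API:

```python
class EventDispatcher:
    """Minimal per-object event dispatcher (sketch of the ActionScript idea)."""
    def __init__(self):
        self._listeners = {}   # event name -> list of callbacks

    def addEventListener(self, name, callback):
        self._listeners.setdefault(name, []).append(callback)

    def dispatchEvent(self, name, *args):
        for callback in self._listeners.get(name, []):
            callback(*args)   # deliver the event to each listener

# a controller object listens for clicks on a button object
button = EventDispatcher()
clicks = []
button.addEventListener('click', lambda: clicks.append('handled'))
button.dispatchEvent('click')   # the controller's handler runs
```

Because listeners are attached to individual objects, state can stay inside the objects instead of in global variables.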


Inline Closures:
Both ActionScript and Java (via anonymous inner classes) solve this problem using inline closures. In particular, you can define event handlers inline within a method definition, thus keeping the logically related code together in your program file. However, this is not always possible, e.g., if the same timer needs to be started from several places in the code. In that case, it does make sense to keep the timer handler as a separate method.

timer.addEventListener('timer', function(event:TimerEvent):void {
    ... // process timer event
});
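In Python the same inline style falls out naturally from nested functions; in this sketch, threading.Timer stands in for the ActionScript Timer:

```python
import threading

done = threading.Event()

def start_timer(delay):
    # the handler is defined inline, right next to the code that starts the timer
    def on_timer():
        done.set()   # process timer event
    threading.Timer(delay, on_timer).start()

start_timer(0.01)
done.wait(1.0)   # block briefly so the example completes
```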


Deferred Object:
The Twisted framework solves this problem using the Deferred Object pattern. The idea is as follows: instead of breaking the control flow every time a blocking operation occurs, the source code is broken only when the result of the blocking operation is needed. This makes a lot of difference in programming and makes the software cleaner. For example, earlier, when a socket connection was initiated we needed to install event handlers for the success and failure results and perform the rest of the processing in those event handlers. Using the Deferred Object pattern, socket.connect returns a deferred object, which is used in the same control flow as if it were the result of the connect method. The result object can be passed around like a regular object. Only when we need to do something on completion of the connection do we wait on the result object. This can be done either synchronously using result.wait() or asynchronously by installing success and error callbacks. There are also provisions to allow one deferred object to wait on the result of another deferred object before proceeding. An example follows:

result = socket.connect(...)
result.wait()
if result:
    ... # connection successful
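A stripped-down deferred can be sketched in a few lines of Python. This is a toy version of the pattern, not Twisted's real Deferred class; the pending list and connect function simulate an event loop completing a connection later:

```python
pending = []   # stands in for the event loop's list of pending operations

class Deferred:
    """Toy deferred: holds callbacks until a result arrives (sketch only)."""
    def __init__(self):
        self._callbacks = []
        self._fired = False
        self._result = None

    def addCallbacks(self, callback, errback=None):
        # errback handling is omitted in this sketch
        if self._fired:
            callback(self._result)        # result already known: run now
        else:
            self._callbacks.append(callback)

    def callback(self, result):
        self._fired, self._result = True, result
        for cb in self._callbacks:        # deliver the result to waiters
            cb(result)

def connect():
    d = Deferred()
    pending.append(d)   # the "event loop" will fire it later
    return d            # returns immediately, without blocking

results = []
d = connect()                        # no blocking here
d.addCallbacks(results.append)       # run this on completion
pending.pop().callback('connected')  # event loop signals completion
```

The key point is that connect returns immediately, and the callback is attached at the point in the code where the result is actually needed.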


Co-operative multitasking:
Python's multitask module allows cleaner source code by taking advantage of co-operative multitasking. The idea is to build application-level threads of control which co-operate at blocking points using the built-in yield keyword. A global scheduler runs in the main application. I find this to be the cleanest design in my programming. An example follows.

data, addr = yield multitask.recvfrom(socket, ...)
yield multitask.sendto('response', addr)
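The multitask module is built on Python generators; a bare-bones round-robin scheduler along the same lines (not the real multitask API, which also integrates I/O readiness) looks like this:

```python
from collections import deque

def scheduler(tasks):
    """Run generator-based tasks round-robin; each yield is a co-operation point."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)           # run until the task yields (i.e. would block)
            ready.append(task)   # re-queue it behind the other tasks
        except StopIteration:
            pass                 # task finished

trace = []

def worker(name, steps):
    for i in range(steps):
        trace.append((name, i))
        yield                    # give other tasks a turn

scheduler([worker('a', 2), worker('b', 2)])
```

Each task keeps its control flow in one place, and the yield points mark exactly where it gives up the processor.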


One major problem with these approaches is that, because of the inherent single event queue in the underlying software architecture, we cannot easily take advantage of multi-processor CPU architectures. To do so, we must use a multi-threaded design in our software. Most of the earlier designs can be extended to multi-threaded applications with some care. For example, one can run multiple event loops listening on a single shared and locked event queue, or one event queue per thread where the dispatcher schedules each event to the appropriate thread's queue. Clearly, the first one is more efficient from a queuing theory perspective. Care must be taken to lock the critical sections in event handlers, or any control flow that may be accessed from different threads at the same time.
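The single-shared-queue variant is straightforward with Python's thread-safe queue.Queue; the event values and handler below are illustrative, and the explicit lock shows the critical-section discipline mentioned above:

```python
import queue
import threading

events = queue.Queue()      # one shared, locked event queue
handled = []
lock = threading.Lock()     # protects shared state touched by handlers

def worker():
    while True:
        msg = events.get()
        if msg is None:     # sentinel: shut this worker down
            break
        with lock:          # critical section in the event handler
            handled.append(msg)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for i in range(100):
    events.put(i)           # dispatch events to the shared queue
for _ in threads:
    events.put(None)        # one shutdown sentinel per worker
for t in threads:
    t.join()
```

All workers compete for the same queue, so an idle thread picks up the next event regardless of which thread handled the previous one.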

Consequently, we get several multi-threaded designs: thread-per-request, thread-pool, two-stage thread-pool, etc. A performance comparison of these designs in the context of a SIP server is presented in my paper titled Failover, Load Sharing and Server Architecture in SIP Telephony, Section 6.
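The thread-pool design maps directly onto Python's concurrent.futures module; serve_request here is a stand-in for real request handling:

```python
from concurrent.futures import ThreadPoolExecutor

def serve_request(request_id):
    # stand-in for parsing the request and building a response
    return 'response to %d' % request_id

# a fixed pool of worker threads serves all requests, avoiding the
# thread creation/deletion overhead of the thread-per-request design
with ThreadPoolExecutor(max_workers=8) as pool:
    responses = list(pool.map(serve_request, range(20)))
```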
