Software API Design

In this article I present my view on the design of software APIs. Several other people have written extensively about the importance of API design and best practices, mostly in the Java community, but still I find so many poorly designed software APIs.

What is a Good API?

At the high level only a few design principles: complete, easy-to-use and minimum, are enough to design good API. The API should be complete, without any missing element, to achieve all the necessary functions of the software. At the same time, the API should be minimum, without much redundancy. For example, if the same thing can be done using two method, pick the best, and use that. Most importantly the API should be easy to use.

Consider the Unix socket API. It provides the core functions such as connect, sendto, bind, using explicit methods whereas additional rarely used functions are supported using setsockopt or ioctl. The best feature of the API is that it treats a socket similar to a file descriptor, so you could indeed use the file related functions on a socket.
s = socket(...);
write(s, ...); /* use it like a file */

Asynchrounous I/O

The annoying part in Unix socket is the select and poll set of function for doing asynchronous I/O. The primary reason is that there is no standardized way for asynchronous API in Unix. On the other hand Windows invented the WSAAsyncSelect function which is more difficult to use. An event-based API for asynchronous I/O is clean and easy to use, provided the event is dispatched by the relevant object.
s = new Socket();
s.addEventListener("socketData", handlerFunction);

Return vs exception

Exception is better for indicating failure cases, if the failure is less likely to happen. On the other hand, return value of failure is suitable if the failure is part of the result. For example, a findUser() function can return the User object on success, and null on failure such as not-found. What if there is other kind of failure? Should it return null or throw an exception?

Exceptions make more cleaner code, at least in Python. This avoids several if-else constructs to check the return values, and you can accumulate all error handling at one place, either in this function or in the caller stack frame.

Value vs reference

You would have studied pass-by-value and pass-by-reference in your programming class. This topic goes beyond that. For example, if a Camera object represents a default-camera for your machine, should the Camera object automatically switch to the new camera if the user changes the default camera from the control panel independent of your software?
cam = Camera.getCamera("default")
# will cam be the current snapshot of default-camera or will it automatically
# change when default-camera changes?
In your software, if you need to represent the local logged in user as a LocalUser object, should you create a new object when the login changes or should the same object update its state to reflect the new logged in user? If you use a local structure to represent the local listening IP address of your network application, should this structure automatically change its IP value whenever your machine's IP address changes?

The value object is easy to implement and understand. However, if an application needs to detect any change in the value they need periodic polling or event dispatching on change. The reference object is more convenient and clean to the application developer, but needs additional work in the implementation of the API. One option is to have generic reference wrapper, which can be used to convert any value object to reference object, as long as the change detection is uniform.

Synchronous vs non-blocking

Synchronous APIs are easy to understand and use whereas non-blocking event-based APIs are difficult to use. Non-blocking calls have advantage that your thread is not held-up waiting for the response. This is particularly a problem for single threaded software systems. In some platforms, e.g., ActionScript on (single threaded) Flash Player, there is no choice, and you must do non-blocking calls. Python provides constructs such as yield, which allows you to write synchronous co-operative multitasking software. Thus, even though your application code looks like synchronous, it actually yields to other task behind the scene. Clearly critical section and shared resources must be protected appropriately for read and write access as needed.
task1: data = yield multitask.recv(sock, timeout=10)
task2: yield multitask.send(sock, somedata)

Generic vs specific

While most API designers tend to write methods that are specific for one task, the arguments should be generic. In C++ or Java, templates allow you to specify generic algorithm. The question on generic vs specific spans more use cases: should your supply a String as your method name, or should you define an Enum and use that? If you already have a function getUser(String name), should you define another function getUserById(int id) or should you overload using argument type?

Generic API is more extensible but care must be take to handle various edge cases and throw appropriate exceptions.

Document and testing

How should the API be specified? Should there be a set of documents that describe each and every function in various use cases, or should there be built-in test cases that show how to use the API? or both? With document alone, how do you make sure that the document is updated every time someone updates the implementation? Python's doctest module allows you to integrate the unit testing in the API document itself. This cleanly makes sure that your unit test will fail if your API implementation changed.

Analogy

The single feature which improves the usability of an API is using an analogy with some existing thing. For example, a web interface modeled after Unix file I/O is very easy to understand and use. On the other hand, a completely new paradigm for your API will make it hard to understand. Other examples of existing paradigms are the listener-provider, event dispatcher, property getter setter, read-only vs read-write property, attribute vs container access, dictionary or hash map collection. For example, if you want to implement a new P2P storage module, consider exposing it as a dictionary where user can put and get values using keys, and can replace his existing code that uses local dict or HashMap with your new distribute dictionary.

Conclusion

Software APIs tend to last longer than expected. Careful thoughts in the design process can help your API withstand the test of time. When you design an API, weigh your options with respect to these properties: synchronous vs asynchronous, return vs exception, value vs reference, generic vs specific, and incorporate analogy with existing paradigm, and automatic documentation and testing tools.

Systems Software Research

A very interesting talk by Rob Pike on Systems Software Research is Irrelevant".

Some quotes from the slides (by Rob Pike):

"We see a thriving software industry that largely ignored research, and a research community that writes papers rather than software".

"Java is to C++ as Windows is to Machintosh: an industrial response to an interesting but technically flawed piece of systems software."

"Linux's cleverness is not in the software, but in the development model, hardly a triumph of academic CS (software engineering) by any measure."

"It (systems research) is just a lot of measurement: a misinterpretation and misapplication of the scientific method. Invention has been replaced by observation."

"If it didn't run on a PC, it didn't matter because the average, mean, median, and mode computer was a PC."

"To be a viable computer system, one must honor a huge list of large, and often changing, standards: TCP/IP, HTTP, HTML, XML, CORBA, Unicode, POSIX, NFS, SMB, MIME, POP, IMAP, X, ... With so many externally imposed structure, there is little left for novelty."

"Commercial companies that 'own' standards deliberately make standards hard to comply with, to frustrate competition. Academic is a casualty."

"New employees in our lab now bring their world (Unix, X, Emacs, Tex) with them, or expect it to be there when they arrive... Narrowness of experience leads to narrowness of imagination."

"In science, we reserve our highest honors for those who prove we were wrong. But in computer science..."

"How can operating systems research be relevant when the resulting operating systems are all indistinguishable? (Unix is) a victim of its own success: portability led to ubiquity. That meant architecture didn't matter, so there's only one."

"Government funded and corporate research is directed at very fast 'return on investment'... The metric of merit is wrong."

"Measure success by ideas, not just papers and money. Make the industry want your work."

"The future is distributed computation, but the language community has done very little to address that possibility."

My take on the lessons learned, again in the form of quotes:

"Keep the ideas flowing, even if the implementation is not feasible (using existing systems)."

"When thinking of distributed systems -- think beyond web, Browser and Flash Player"

"Something is popular, does not mean it is correct or best way to do that thing."

"Do not publish papers that fake measurement as research."

"Do not take a job that you are not truly motivated about."

"Writing software in Java is like writing detailed machine instructions. Learn Python instead."

REST, Flash Player, Flex

In doing some experiments with Flex and RESTful architecture, I realized that there seems to be a whole lot of problems. Flash Player was designed to support traditional HTTP web access such as form posting or resource retrieval using GET and POST. So a number of features that are used in RESTful design are not supported by Flash Player. People have written additional client-side or on server-side kludges to work around the problems with HTTP support in Flash Player. Most of the server side changes are hacks, and the client-side changes are in external third-party libraries, which are sometimes missing crucial features like cookies, TLS, etc. Even in the best possible scenario, you still need to provide crossdomain policy file even if the Flash application is accessing resources on the same server. One of the reasons I suspect is that Flash Player relies on the browser for HTTP support, and hence supports only the lowest common set of features, which are not enough for REST architecture.

What is the solution? Depends on what you want to do.
  1. If you have control over server-side of the system, you need to incorporate certain kludges to support Flash restrictions. For example, provide crossdomain policy-file, map some header or URL to method PUT and DELETE, return 200 success response with message body containing actual error code (e.g., 404 vs 405 vs 501) or headers. Any of these techniques looks like a hack at best.
  2. If you have control over the client, you can use an existing third-party RESTful client library in ActionScript. However, be prepared to provide a crossdomain policy-file or incorporate a proxy. Alternatively, you can also perform some Flash Player-specific translations on the fly in your proxy.
  3. Use Flash Player's ExternalInterface mechanism and incorporate your RESTful client code in JavaScript. This is sometimes not easy, error-free or feasible. Moreover, now your Flash application depends on your Javascript.
What are the problems with these solutions?
  1. You cannot build a general purpose RESTful web application (server-side) and expect Flash Player application to use them. You will need to be Flash Player-aware.
  2. You cannot build a general purpose Flash application (client-side) to use an existing RESTful web service without additional dependencies and network elements (proxies).
What is the real solution?
  1. It looks like HTTP+REST won't really be able to solve all the problems in ActionScript for Flash Player without Adobe's blessings, e.g., incorporate full HTTP stack in the Flash Player or provide a different way other than crossdomain to do authentication of scripts.
Some additional references [1, 2, 3, 4]