API Design Best Practices. How to attain API Awesomeness.
APIs can be fundamentally important for an organization. Contrary to popular belief, they are not just for external clients & developers; they can also play an important role in building system-wide applications and in moving towards an API-driven architectural style (more on that in a separate blog).
So what are the good qualities to think about while designing APIs? In other words, I want to discuss how to design APIs that are well designed and scalable (in their use), along with some thoughts on related matters.
Qualities of good APIs
- Less is more
- The smaller the API set, the better.
- There are some obvious benefits here. Less means less to build, support and maintain. It’s also arguably easier for API consumers and developers to understand, and easier to build on.
- Sometimes there is a need for more powerful, custom APIs. In that case, my genius idea is to build another “set” of custom APIs on “top”.
- OK, so what does that mean? Let’s take an example. I have 3 simple APIs to offer to the rest of the world and to developers within my company. Now I offer another set of APIs, built on “top” of these, with additional hooks and customizations. These APIs are designed for the power users, if you will, and are documented as such in a separate section. What are the benefits of such an approach?
- Simple: isolation. It allows us to “evolve” one set of APIs separately from the others. This might not seem obvious, but believe me when I say it’s tremendously powerful. I can add or deprecate a set of APIs, or add/remove params, without impacting my entire user base. This is awesome.
- I don’t know of any organization, large or small, that formally follows this. They might have such a pattern, but it is often an artifact of iterations rather than something planned. Simply following this rule will make you a better API designer, architect and developer.
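The layering idea above can be sketched roughly as follows. This is a minimal illustration, not a prescribed design: `CoreApi`, `PowerApi`, the `name_prefix` hook and the in-memory data are all hypothetical names invented for the example. The point is only that the power-user set calls the core set and nothing else.

```python
from dataclasses import dataclass


@dataclass
class User:
    user_id: int
    name: str


# Hypothetical in-memory data standing in for a real backend.
_USERS = {1: User(1, "Ada"), 2: User(2, "Linus")}
_FOLLOWS = {1: [2], 2: []}


class CoreApi:
    """The small, stable set of APIs offered to everyone."""

    def get_user(self, user_id: int) -> User:
        return _USERS[user_id]

    def get_following_ids(self, user_id: int) -> list:
        return list(_FOLLOWS[user_id])


class PowerApi:
    """The custom set for power users, built only on top of CoreApi."""

    def __init__(self, core: CoreApi) -> None:
        self._core = core

    def get_following_profiles(self, user_id: int, name_prefix: str = "") -> list:
        # The extra hook (name_prefix) lives in this layer, so it can be
        # changed or deprecated without touching the core set.
        ids = self._core.get_following_ids(user_id)
        users = [self._core.get_user(i) for i in ids]
        return [u for u in users if u.name.startswith(name_prefix)]
```

Because `PowerApi` depends only on the public methods of `CoreApi`, either layer can evolve without forcing changes on users of the other.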
- Loosely coupled to clients, possibly RESTful, platform agnostic
- Architects have learned the hard way the cost of building APIs that are tightly coupled to clients. And I understand why they did that. APIs were initially needed to solve a problem and offer services to a set of clients, so it’s natural for API designers to follow those clients’ needs. Wrong 8/10 times. Clients come & go. That’s a fact of life. Both internal & external clients change, because they are often product focused and no product is constant, at least not the successful ones. DON’T design your APIs for a single client or a small set of clients. You will be amazed at the possibilities of beautifully designed APIs. Systems are better built with loose dependencies on APIs as opposed to tight binary dependencies.
- REST is a very powerful pattern that the web/HTTP builds upon, and you cannot go wrong building RESTful APIs, but there are some gotchas. Don’t try to follow REST 100%, as chances are you will not have web-scale routers and caching infrastructure. There is often something more needed. Also, some of the REST philosophy is hard to absorb. It’s OK. Start small and simple. Iterate from that point on. Learn from the giants: Twitter, Facebook, LinkedIn, Google.
- Performant
- APIs are no good if they are not fast, simply put. Clients, often external, depend on them. Their performance and their user experience rest on your shoulders. And if there are 1000s of clients, that’s a lot of responsibility. Don’t sweat it. In the beginning, only expose APIs that you know follow good algorithms, for example ones that do not involve a sweep of the entire database. APIs should mostly use keys to look up data, cache hard-to-calculate data, and avoid serving complex user-specific data without designing & planning for it. Caches are an API’s best friend. Take advantage of them. Assume clients will try to abuse your APIs. You have to be smart, accountable & resilient.
- Keep responses small. Large responses are one of the biggest reasons some APIs are slow. Support pagination if there is more data. Simple. (See more tips in the response format tips below!)
- Consider supporting binary response formats for extra performance. MessagePack (msgpack) and Protocol Buffers are excellent for compacting your API response while also supporting fast data parsing and loading. A great win!
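As a rough sketch of the key-lookup, caching and pagination advice above: the names `total_score`, `paginate`, `MAX_PAGE_SIZE` and the sample data are assumptions made up for illustration, and the in-process `functools.lru_cache` merely stands in for whatever cache you actually run.

```python
from functools import lru_cache

MAX_PAGE_SIZE = 100  # assumed server-side cap, not a value from this post

# Hypothetical key-value data standing in for a real store.
_SCORES_BY_KEY = {"user:1": [3, 5, 8], "user:2": [1, 2]}


@lru_cache(maxsize=1024)
def total_score(key: str) -> int:
    """Key lookup plus a cached 'hard to calculate' value."""
    return sum(_SCORES_BY_KEY[key])


def paginate(items: list, offset: int = 0, limit: int = 20) -> dict:
    """Return one page and the offset of the next page, if any."""
    limit = max(1, min(limit, MAX_PAGE_SIZE))  # clamp the requested page size
    offset = max(0, offset)
    page = items[offset:offset + limit]
    has_more = offset + limit < len(items)
    return {"items": page, "next_offset": offset + limit if has_more else None}
```

A client that asks for `limit=1000` still gets at most `MAX_PAGE_SIZE` items and follows `next_offset` for the rest, which keeps each response small.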
- Don’t trust your clients
- This is a mistake novice designers & architects often make: trusting the clients to do the right thing. Wrong again. Never trust your clients, even if they are internal clients. Even if the intentions are good, a small client application bug can suffocate your APIs, which in turn has downstream impacts.
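One common way to protect an API from a buggy or abusive client is a per-client request limit. The post does not prescribe a mechanism, so the fixed-window limiter below is just one possible sketch; `FixedWindowLimiter` and its parameters are hypothetical names, and the state is kept in a single process for simplicity.

```python
import time


class FixedWindowLimiter:
    """Allow at most max_requests per client in each time window."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._windows = {}   # client_id -> (window_start, request_count)

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        entry = self._windows.get(client_id)
        if entry is None or now - entry[0] >= self.window_seconds:
            entry = (now, 0)  # start a fresh window for this client
        start, count = entry
        if count >= self.max_requests:
            return False  # over the limit: reject instead of trusting the client
        self._windows[client_id] = (start, count + 1)
        return True
```

A looping client is cut off after `max_requests` calls in a window, while other clients are unaffected because the counters are kept per client.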
- API servers are API servers
- API servers are often aggregators. They don’t have a lot of business logic, but instead depend on other systems, often a variety of back-ends, to pull the data and present it to the clients. If you are not doing this, in other words, if you are designing or building APIs on your app server, well, good luck. I will say it’s a terrible idea. Even if your entire stack is hosted on one machine, try to isolate the API server as a separate module, with its own dependencies and code base. It should have only a binary/library/API dependency on other modules/system components. You will thank me for this.
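To make the aggregator idea concrete, here is a small sketch under stated assumptions: `make_profile_endpoint`, `fetch_user` and `fetch_post_count` are hypothetical names, and the two fetchers stand in for calls to separate back-end systems. The API layer only composes their results.

```python
from typing import Callable


def make_profile_endpoint(
    fetch_user: Callable[[int], dict],
    fetch_post_count: Callable[[int], int],
) -> Callable[[int], dict]:
    """Build an endpoint that aggregates two back-ends into one response."""

    def get_profile(user_id: int) -> dict:
        # No business logic here: pull from each back-end and present the data.
        user = fetch_user(user_id)
        return {
            "id": user_id,
            "name": user["name"],
            "post_count": fetch_post_count(user_id),
        }

    return get_profile
```

Since the back-ends are passed in, the API module depends only on their interfaces, which is what allows it to live in its own code base.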
- Param Design
- Keep your param list small, very small for that matter. Give params good names, so that 9/10 times I get it without reading a lot of documentation. These are for obvious reasons. As a rule of thumb, always choose simplicity when possible. For API designers, this is the hardest aspect. As computer scientists, we are trained to solve complicated problems, but we are not trained enough in simplicity, the power of less, and all that. Just look at Apple products & user experiences.
- Clearly mark off optional parameters and document them separately, because many developers might not even be interested in that much power. Also, on the flip side, clearly note your default values for optional params.
- If you are designing HTTP APIs, follow the proper verbs. GET for data out. PUT/POST for data in or updates. DELETE for deleting data (use caution here, especially around authentication).
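A small sketch of the parameter advice above: a short required list, optional params with their defaults declared in one place, and unknown params rejected. The param names (`user_id`, `limit`, `order`) and the default values are assumptions chosen for illustration only.

```python
REQUIRED_PARAMS = ("user_id",)
# Optional params and their documented defaults, kept in one place.
OPTIONAL_DEFAULTS = {"limit": 20, "order": "desc"}


def parse_params(raw: dict) -> dict:
    """Validate raw request params and fill in defaults for optional ones."""
    unknown = set(raw) - set(REQUIRED_PARAMS) - set(OPTIONAL_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown params: {sorted(unknown)}")
    missing = [name for name in REQUIRED_PARAMS if name not in raw]
    if missing:
        raise ValueError(f"missing required params: {missing}")
    params = {**OPTIONAL_DEFAULTS, **raw}
    params["limit"] = int(params["limit"])  # raises ValueError on bad input
    if params["order"] not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    return params
```

Keeping the defaults in a single table also makes it easy to generate the "optional parameters" section of the documentation from the same source.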
- Response design & formats
- Support only one response format to the extent possible. Don’t try to be cool and support multiple formats unless absolutely required. JSON these days offers a good balance of simplicity and compatibility across the gamut of clients out there. I don’t prefer XML, but it works.
- If you need to support more than one response format, try to isolate the view (the response template) from the data. This is in line with the MVC design pattern. It makes multiple formats very easy to support & maintain, and it’s easier to debug issues as well. Please note this!
- Keep responses small. Don’t include everything in the response.
- I have a great tip that many don’t know. Use params to drive what to include in the response. This is a great way to give control to the clients and let them decide the tradeoff between performance and quantity. You should make this clear in the API documentation. This can also be designed as a privileged service, meaning only privileged (trusted, etc.) clients can choose to include certain items in the response, as they add to the response size and/or cost more to calculate/generate.
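The two response tips above (params that drive what is included, and a view kept separate from the data) might look roughly like this. The `include` param, the field names and the `privileged` flag are hypothetical; how a client earns privileged status is out of scope here.

```python
import json

DEFAULT_FIELDS = ("id", "name")
PRIVILEGED_FIELDS = {"follower_graph"}  # assumed to be costly to generate


def select_fields(record: dict, include: str = "", privileged: bool = False) -> dict:
    """Data step: pick the default fields plus any extras the client asked for."""
    fields = list(DEFAULT_FIELDS)
    for name in (part.strip() for part in include.split(",")):
        if not name or name in fields or name not in record:
            continue
        if name in PRIVILEGED_FIELDS and not privileged:
            continue  # silently drop extras this client may not request
        fields.append(name)
    return {name: record[name] for name in fields}


def render_json(data: dict) -> str:
    """View step: the only place that knows about the wire format."""
    return json.dumps(data, sort_keys=True)
```

Adding a second format would mean adding another `render_*` function; `select_fields` would stay untouched.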
- Real-Time Processing for Statistics & Monitoring
- To the extent possible, avoid real-time processing for monitoring or calculating stats. Oftentimes it’s not simple to do that in a fashion that does not compromise API performance, the integrity of the statistical data, or the other actions you need to take.
- I have a much better idea to share for that problem. Do the processing offline! What I mean is that you should log all the requests your systems get. Periodically process that data and calculate aggregate stats that the APIs can use to monitor and accept/deny/etc. activities from clients. This is amazingly powerful because it is very scalable. Offline systems can use Hadoop/MapReduce on streamed Scribe data, and there you go: you can calculate anything from the simplest statistic, like API counts from a client or partner, to the most sophisticated. Imagine doing that in real time, especially when you have 20 API servers and any one of them can serve an API request. You would have no choice but to use a distributed storage system and/or cache locally and periodically sync with other peers.
- Instead, doing it offline and updating the live API servers with that stat, or making it available through constant-time lookups (key lookups from caching systems or KV data stores, e.g. Redis), does wonders. Sure, there is a downside. You will lose a window of opportunity when the stat is not updated and you are still serving clients. But you can run the offline batch processing as often as you like, e.g. every 15 minutes. That way, the worst you lose is a 30 minute window. And besides, you should be investigating DoS attacks among other things at the site level, not just for APIs, and those systems can help under critical attack or abuse situations.
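The offline approach can be sketched in miniature as follows. Everything here is an assumption for illustration: the log line layout (`timestamp client_id path`), the `aggregate_counts` batch step, and `StatsStore`, which is a plain dict standing in for a KV store such as Redis. A real pipeline would run the batch step on a cluster rather than in-process.

```python
from collections import Counter


def aggregate_counts(log_lines) -> dict:
    """Offline batch step: count requests per client.

    Assumes each log line looks like 'timestamp client_id path'.
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) >= 2:  # skip malformed lines
            counts[parts[1]] += 1
    return dict(counts)


class StatsStore:
    """Stand-in for a KV store that the live API servers read from."""

    def __init__(self) -> None:
        self._counts = {}

    def publish(self, counts: dict) -> None:
        # Called by the batch job after each run; replaces the old snapshot.
        self._counts = dict(counts)

    def over_quota(self, client_id: str, quota: int) -> bool:
        # Constant-time lookup on the request path; no aggregation here.
        return self._counts.get(client_id, 0) > quota
```

The request path only ever does the dictionary lookup in `over_quota`; the cost of counting is paid by the periodic batch run, at the price of the stats being slightly stale.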
- Authentication
- Coming soon!