Thursday, December 10, 2009

Document Stores: Please Give Me A Standard API

Although I'm a long-standing RDBMS guy (having worked on Broadbase, Kidar, and Eigenbase/LucidDB [indirectly through Kidar and Broadbase]), I'm quite excited by the emerging document-oriented database movement. While I'm not a pure "SQL is bad and old-economy and you should throw it away" guy, I do think that document stores, like their hierarchical database predecessors, have their uses in modern architectures. In particular, the practical (as opposed to theoretical) aspects of a hierarchical/document model make it easier to apply scalability and performance optimizations in modern scale-out architectures.

That being said, even though we're at an early stage, I think the major proponents of the technique need to learn from the RDBMS guys in one important respect: unified APIs are key to wide-scale adoption.

If I'm writing an application that's going to be backed by a relational database, and I'm in a sensible programming language, I've got a standard API that I can code against: JDBC, ODBC, ADO.NET, et al. It doesn't shield me entirely from differences in the underlying database implementation (or else there would be no opportunities for product differentiation), but it keeps those differences minimal and relatively easy for a software developer to abstract away.

Ditto for message-oriented middleware: I can use JMS at the code layer, or AMQP at the network layer (and thus at the code layer as well). Different MOM implementations do have underlying differences, which are particularly obvious if I want to push the technology to its limits, but those product-specific differences show up at the edges rather than in everyday use.

This isn't the case right now for document stores. I know that MongoDB and Riak and CouchDB and SDB and others (which I'm sure commenters will point out below) are pretty darn similar in their functionality. I know that the conceptual models are relatively similar. I know this logically. But I still have to write custom code against each one in my own application.

With multiple implementations out there, and with users (e.g. me) looking at the different systems and seeing them as logically similar, it's probably time for the teams to start working together and come up with a code-level API that I can code against, much like JDBC or JMS. While this might seem early for such an effort, trust me, it'll increase adoption, because the perceived cost of evaluating different implementations will be greatly reduced.
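To make the request concrete, here's a rough sketch of the shape I have in mind, written in Python for brevity. Everything in it is hypothetical: the `DocumentStore` interface, its method names, the `InMemoryStore` stand-in driver, and the `register_user` helper are my own illustration, not an existing standard and not how any of the products above actually expose their APIs.

```python
import copy
import uuid
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, Optional

Document = Dict[str, Any]


class DocumentStore(ABC):
    """Hypothetical vendor-neutral interface (an assumption, not a real standard)."""

    @abstractmethod
    def put(self, collection: str, doc: Document, doc_id: Optional[str] = None) -> str:
        """Store a document and return its id."""

    @abstractmethod
    def get(self, collection: str, doc_id: str) -> Optional[Document]:
        """Return the document with this id, or None if it is absent."""

    @abstractmethod
    def delete(self, collection: str, doc_id: str) -> bool:
        """Remove a document; return True if something was removed."""

    @abstractmethod
    def find(self, collection: str, criteria: Document) -> Iterator[Document]:
        """Yield documents whose top-level fields equal every entry in criteria."""


class InMemoryStore(DocumentStore):
    """Stand-in 'driver' backed by a dict. A real driver would wrap a specific
    product's client library behind the same four methods."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, Document]] = {}

    def put(self, collection, doc, doc_id=None):
        doc_id = doc_id or uuid.uuid4().hex
        # Copy on the way in so later caller mutations don't leak into the store.
        self._data.setdefault(collection, {})[doc_id] = copy.deepcopy(doc)
        return doc_id

    def get(self, collection, doc_id):
        doc = self._data.get(collection, {}).get(doc_id)
        return copy.deepcopy(doc) if doc is not None else None

    def delete(self, collection, doc_id):
        return self._data.get(collection, {}).pop(doc_id, None) is not None

    def find(self, collection, criteria):
        for doc in self._data.get(collection, {}).values():
            if all(k in doc and doc[k] == v for k, v in criteria.items()):
                yield copy.deepcopy(doc)


def register_user(store: DocumentStore, name: str, city: str) -> str:
    """Application code: depends only on the interface, never on a vendor."""
    return store.put("users", {"name": name, "city": city})
```

The point is `register_user`: it takes any `DocumentStore`, so evaluating a second or third product would mean swapping the driver object, not rewriting the application. Vendor-specific features could still live on the concrete driver classes, much as they do behind JDBC.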

It doesn't mean that you can't differentiate; it doesn't mean that you can't be superior or inferior to other implementations. But it does mean that I, as a consumer of these systems, can more easily support multiple implementations. And if I can do that, I'm more likely to move to one in the first place.
