
Rationale for a common query language for "NoSql Databases"

Many projects could benefit from NoSQL, but need a simple query language.

NoSQL database systems originated in very specialized applications (e.g. large-scale web communities), but their advantages over the relational model are now bringing them into more general business use.

Relational databases have benefited from the standardized query language SQL. SQL has always had its weaknesses, yet it has held on for 40 years.
It has proven very successful because it created an ecosystem of supporting tools, which is important for business users.

NoSQL databases lack a common query language that could provide the basis for a vendor-independent tool ecosystem. I believe such a tool ecosystem is important to enable more widespread acceptance of this technology. It would allow tools such as ETL or reporting to be built on this common language.

The first question that arises is whether such a common query language is feasible at all and, if so, why it should not simply be SQL.

SQL cannot be used, because NoSQL databases have richer data models than relational databases, and because SQL depends on the join mechanism, which NoSQL databases (distributed ones in particular) try to avoid. Leaving SQL behind offers the prospect of significant cost reductions through both reduced developer effort and reduced server power.
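
To make the point about joins concrete, here is a minimal sketch in plain Python. The order data, the field names and the two helper functions are invented for illustration and do not come from any particular database. It shows the same question ("which products did a customer order?") answered once over two flat tables that must be joined, and once over a nested document that already contains its order lines.

```python
# Hypothetical sample data, for illustration only.

# Relational shape: two flat tables linked by order_id.
orders = [
    {"order_id": 1, "customer": "Acme"},
    {"order_id": 2, "customer": "Globex"},
]
order_lines = [
    {"order_id": 1, "product": "bolt", "qty": 100},
    {"order_id": 1, "product": "nut", "qty": 250},
    {"order_id": 2, "product": "bolt", "qty": 10},
]


def products_by_join(customer):
    """Emulate a join of orders and order_lines on order_id."""
    ids = {o["order_id"] for o in orders if o["customer"] == customer}
    return [line["product"] for line in order_lines if line["order_id"] in ids]


# Document shape: each order carries its own lines, so no join is needed.
order_docs = [
    {
        "order_id": 1,
        "customer": "Acme",
        "lines": [
            {"product": "bolt", "qty": 100},
            {"product": "nut", "qty": 250},
        ],
    },
    {
        "order_id": 2,
        "customer": "Globex",
        "lines": [{"product": "bolt", "qty": 10}],
    },
]


def products_by_nesting(customer):
    """Read the nested lines directly from each matching document."""
    return [
        line["product"]
        for doc in order_docs
        if doc["customer"] == customer
        for line in doc["lines"]
    ]


print(products_by_join("Acme"))
print(products_by_nesting("Acme"))
```

Both functions return the same products, but the second one never has to match rows across collections. A common query language would need a way to express this kind of navigation into nested structures.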

At the moment some of the new database systems have no query language at all (depending completely on the highly specialized map/reduce approach), while others (e.g. MongoDB or Versant) have rudimentary (and proprietary) query languages.

The feasibility of such a language depends less on the storage model of the database system and more on the structure of the data to be stored.
That is, if we store typical business models in a NoSQL database, we need a query language that is adapted to these models. Since most data these days is modeled using objects (e.g. in UML, XML or Java/C#), I envision a query language that is capable of querying such models.

Languages like OQL, XQuery and LINQ have sufficient expressive power, but they are either more complex (and harder to read) than necessary or lack certain functions.

If you are interested in such a language, follow this blog to see it emerge and to comment on its strengths and weaknesses.

Or do you have a candidate language that could be used?

Re: Rationale for a common query language for "NoSql Databases"

While I generally agree that a common language would be good for the NoSQL space, I'd say it is too early for something like that. I've posted more thoughts on this subject in the post "document database query language".

Re: Rationale for a common query language for "NoSql Databases"

The TinkerPop guys are doing some interesting things in this area. Look, for example, at the Gremlin graph language and at Blueprints for generic graphdb and docdb data models and their implementations.

Gremlin as a candidate language

Hi Anders,

thank you for the link. Yes, I am aware of Gremlin.

I think it is good as a graph query language, or if you are a programmer.

However, I doubt that their language is usable as a general-purpose query language. It is based on XPath, which makes it hard to read. It also focuses on graph walking, which I believe is not typical for business applications.


Common language requires common model

There is not one NoSQL database model, but at least four: (1) key/value stores, (2) graph databases, (3) document databases, and (4) column databases. For (1) you don't need a complex language - something like "POSIX light" would be nice, as key/value stores are much like flat file systems. For (2) there is Gremlin, which is fine. What is missing is a common language for (3) and (4), but this requires a common database model first. A simple start is a "tree model", which makes XPath or JSONPath a good candidate to start from.
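
As a rough sketch of what querying such a "tree model" could look like, the Python snippet below walks a nested document along a slash-separated path. The `select` function, its path syntax and the sample document are all made up for this illustration. It is a toy subset in the spirit of XPath/JSONPath, not an implementation of either.

```python
def select(node, path):
    """Return all values reached by following a '/'-separated path.

    A step is a dict key, a list index, or '*' for every child.
    Toy syntax for illustration only, not real XPath or JSONPath.
    """
    nodes = [node]
    for step in path.strip("/").split("/"):
        next_nodes = []
        for n in nodes:
            if isinstance(n, dict):
                if step == "*":
                    next_nodes.extend(n.values())
                elif step in n:
                    next_nodes.append(n[step])
            elif isinstance(n, list):
                if step == "*":
                    next_nodes.extend(n)
                elif step.isdigit() and int(step) < len(n):
                    next_nodes.append(n[int(step)])
        nodes = next_nodes
    return nodes


# Hypothetical document, as a document database might store it.
doc = {
    "customer": {
        "name": "Acme",
        "orders": [{"total": 10}, {"total": 25}],
    }
}

print(select(doc, "customer/orders/*/total"))
print(select(doc, "customer/name"))
```

The same path expression works regardless of how the tree is stored, which is the sense in which a common model has to come before a common language.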
