I just reviewed the first 70 Google results for the question “Why use a NoSQL database?”. In this post I show you the results for both camps: for and against.
The results are quite interesting. Firstly though I’ll tell you how I got them. I discounted any obviously biased sites, e.g. vendor websites. I also ignored duplicate story posts. I then read through each page’s content to see why people recommend NoSQL databases. It should be noted that these results are what people on the web believe, rather than what may or may not be true today. This is what makes it interesting! I counted each reason at most once per article, however many times it was mentioned. I also discounted definitions, e.g. listing a graph database as a type of NoSQL database does not count as a hit for ‘Because NoSQL can handle graphs’.
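For anyone wanting to repeat the exercise, the counting rule above can be sketched in a few lines of Python. This is only an illustration of the method, not the tool I used: the function name `tally_reasons` and the sample articles are made up for the example.

```python
from collections import Counter


def tally_reasons(articles):
    """Count how many articles mention each reason.

    `articles` is a list of lists; each inner list holds the reasons
    noted while reading one page. A reason mentioned several times in
    the same article is only counted once for that article.
    """
    counts = Counter()
    for reasons in articles:
        # set() drops repeat mentions within a single article
        counts.update(set(reasons))
    return counts


# Hypothetical notes from three pages (not the real survey data)
sample_articles = [
    ["scale", "schema less", "scale"],
    ["scale", "simpler data model"],
    ["schema less"],
]

tally = tally_reasons(sample_articles)
for reason, hits in tally.most_common():
    print(hits, reason)
```

Vendor pages, duplicate posts and pure definitions would be filtered out before the notes ever reach this function.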
Reasons to use NoSQL databases
11 Scale (horizontal)
7 Simpler data model (fewer joins)
6 Schema less (no modelling or prototyping)
4 Rapid Development/coder friendly
3 Flexibility / semi-structured / unstructured / structured
3 Cheaper than relational / commodity
2 Creating a caching Layer
2 Wide data type variety
2 Large binary objects
2 Environment data / logs
1 Bulk upload
1 Lower administration / fewer DBAs
1 Distributed storage
1 Real time analysis
This is interesting. If you group together the entries that deal with ‘ease of application development’ (‘simpler data model’, ‘schema less’ and ‘rapid development’), they come to a total of 17. This far outweighs any other single result; the next highest is ‘scale’ with 11. It would seem that most people use NoSQL databases because they’re easier to develop applications on.
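The grouping is simple arithmetic, but here it is as a small Python sketch using the counts from the list above. The dictionary keys are shortened labels of my own choosing, and which entries belong in the ‘ease of development’ group is my judgement rather than anything the articles stated.

```python
# Hit counts copied from the 'Reasons to use NoSQL databases' list
reasons = {
    "scale": 11,
    "simpler data model": 7,
    "schema less": 6,
    "rapid development": 4,
    "flexibility": 3,
    "cheaper": 3,
    "caching layer": 2,
    "data type variety": 2,
    "large binary objects": 2,
    "environment data / logs": 2,
    "bulk upload": 1,
    "lower administration": 1,
    "distributed storage": 1,
    "real time analysis": 1,
}

# Entries treated here as 'ease of application development'
ease_group = ["simpler data model", "schema less", "rapid development"]

ease_total = sum(reasons[name] for name in ease_group)

# Largest single entry outside the group, for comparison
largest_other = max(
    hits for name, hits in reasons.items() if name not in ease_group
)

print(ease_total, largest_other)  # 17 11
```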
This likely accounts for why you come across Open Source NoSQL databases on the web, but not in large organisations doing serious work – for a large and complex system ease of development is not a primary concern. These organisations are instead concerned with ‘doing things properly’ – i.e. Enterprise class solutions.
Why use relational instead? / challenges against NoSQL
3 Can’t do SQL (But SQL can do XML)
3 Analytics / BI / Reporting
3 Can’t do search
1 data loss
1 Remove duplicate data (Normalisation)
1 Referential integrity
1 Expertise availability
This section was interesting for me because I expected ‘Integration woes’ to come higher. I suppose this is because integrating a search engine with an RDBMS is as much of a pain as it is for NoSQL databases that don’t provide search out of the box. It could be argued that this is reflected in ‘Maturity’ and ‘can’t do search’.
Also of interest is the belief that these Open Source tools have neither a large pool of expertise nor a solid ecosystem of tools, addons, or support. This is probably true. There is a lot of information out there on the web for these NoSQL databases, but very little of it is in the same place, in guides that are task orientated and easy to find and understand.
I was surprised that ‘ease of application development’ is a core reason for looking at a NoSQL database. I was expecting to see a list of domain problems, rather than reasons like ‘it’s simpler’ or ‘the application generally has high volumes’. It seems the reasons for using a NoSQL database are as general as the reasons to use an RDBMS.
Happily, it seems I’m working at the right place here at MarkLogic. Our NoSQL database already provides all of the ‘NoSQL needs’ listed, except for Graphs. We also answer more of the ‘Why NOT’ questions ourselves, thanks to the number of Enterprise features and customers we have. Sure, we could be easier to use and get started with, but our API team are working hard on this, as are consultants like myself with the likes of the MLPHP and MLDB wrappers for our fantastic REST API in MarkLogic Server V6.