The relational database model began to emerge in the 1970s and quickly gained traction because of its ability to store and manipulate data. The RDBMS is still the predominant technology for storing structured data in web and other business applications.
Everything was going great, and then all of a sudden we needed to handle large amounts of structured and unstructured data brought on by the digital explosion. The biggest question to ask now is: “Can relational database systems scale horizontally without compromising on performance and high availability?”
The answer is fairly obvious, yet we don’t advocate dismissing relational databases; instead, we see value in using the right data store for the job. We have always been taught to use the right tool for the right job, haven’t we?
But what is the real need for a system that can scale horizontally? For example, a website may feed the same answers to hundreds, thousands, or even millions of users at the same time. You may already be thinking of Facebook, Twitter, and the like. If we were to perform this stunt with a traditional RDBMS, we would be taxing the database to recompute the same thing over and over again; you can imagine the resulting complexity and the sleepless nights for developers and DBAs. Most NoSQL data stores provide a solution that can handle these situations with ease.
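To make the "recompute the same thing over and over" problem concrete, here is a minimal sketch of the compute-once, serve-many pattern that key-value stores are often used for. It rests on assumptions: the plain Python dict stands in for a real networked key-value store, and `AnswerCache` and `expensive_trending_topics` are hypothetical names invented for illustration.

```python
class AnswerCache:
    """Serve precomputed answers by key; a dict stands in for a key-value store."""

    def __init__(self, compute):
        self._compute = compute   # the expensive function (hypothetical)
        self._store = {}          # key -> previously computed answer
        self.computations = 0     # how often the expensive path actually ran

    def get(self, key):
        # Only fall back to the expensive computation on a cache miss.
        if key not in self._store:
            self.computations += 1
            self._store[key] = self._compute(key)
        return self._store[key]


def expensive_trending_topics(region):
    # Placeholder for a costly aggregation query against a relational database.
    return f"top topics for {region}"


cache = AnswerCache(expensive_trending_topics)

# A thousand users ask the same question; the expensive work happens once.
answers = [cache.get("us") for _ in range(1000)]
```

A production setup would also need expiry and invalidation so that stored answers do not go stale, which this sketch leaves out.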
Even though RDBMSs offer a rich query language, are easy to use, integrate with rich tool sets, and scale vertically, they are considered expensive. We should also not forget that RDBMSs are built on the ACID philosophy (Atomicity, Consistency, Isolation, and Durability). Lastly, their read/write throughput is often considered poor in this modern world of exploding digital data.
Vs
The NoSQL approach offers simplicity of design, horizontal scaling, and finer control over availability without compromising on performance. NoSQL systems are based on the CAP theorem, as shown below.
That said, reviewing NoSQL database offerings is a more difficult task than comparing and contrasting relational systems, because there is more than one type of NoSQL database and a large number of individual NoSQL DBMSs.
There are four main NoSQL categories: document-oriented, key-value, column-oriented, and graph databases. Let us consider the column-oriented category and analyze the types of applications that can be built with it.
Column-oriented: use when you need faster access to bigger sets of data, such as data warehouses and data marts.
Relational databases are severely taxed on performance when searching large volumes of data. Historically, database developers and DBAs have built data models and written complex SQL queries to find a few rows of data, which involved thinning out data sets in the most efficient way. The bigger the result set, the more expensive the queries become, and data warehouses typically require aggregating large amounts of data for reporting, dashboards, scorecards, and so on. This continues to be a persistent problem, and after all that effort, the wins in query performance are minimal.
In this context, NoSQL column-oriented data stores are becoming widely adopted, and enterprises are testing them for building new data warehouse applications; this process is often referred to these days as modernizing the data warehouse. Lastly, this type of setup will help you move toward “less rigid consistency requirements”.
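To illustrate why a column-wise layout suits the aggregation-heavy workloads described above, here is a small sketch comparing the same data stored by row and by column. It shows only the general idea, under assumptions: the sample orders and the `to_columns` helper are made up for illustration, and this is not how Cassandra or HBase actually lay out data on disk.

```python
# Row-oriented layout: all fields of one record are stored together.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 80.0},
    {"order_id": 3, "region": "east", "amount": 50.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: summing one field still walks every complete record.
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: the aggregate reads only the "amount" column.
total_from_columns = sum(columns["amount"])
```

Both totals are the same; the difference is how much data has to be touched to get there, which matters once a table has many columns and millions of rows.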
NoSQL is considered an alternative to traditional relational databases because certain requirements that are inherent to relational databases no longer match the needs of modern enterprises.
Most enterprises are discovering that certain data requirements do not demand the rigid ACID model of an RDBMS, which usually comes with worse performance. Instead, they can meet their needs using eventual consistency, which tends to come with better performance.
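As a rough illustration of what eventual consistency means in practice, the toy simulation below acknowledges a write after updating one replica and propagates it to the others later. Everything here is an assumption made for the sake of the example: the `EventuallyConsistentStore` class, its `sync` method, and the sample key are hypothetical and do not reflect the replication protocol of any particular NoSQL product.

```python
class EventuallyConsistentStore:
    """Toy model: writes land on one replica first and reach the rest on sync()."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # (replica_index, key, value) updates not yet applied

    def write(self, key, value):
        # Acknowledge as soon as the first replica has the value (fast path).
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica_index):
        # A read may hit any replica, including one that has not caught up.
        return self.replicas[replica_index].get(key)

    def sync(self):
        # Apply queued updates in order so every replica converges.
        for index, key, value in self.pending:
            self.replicas[index][key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("cart:42", ["book"])

stale = store.read("cart:42", replica_index=2)  # replica 2 has not caught up yet
store.sync()
fresh = store.read("cart:42", replica_index=2)  # all replicas now agree
```

The write returns quickly because it does not wait for every replica, at the cost of a window in which some readers see old data. Whether that window is acceptable depends on the data in question.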
Several NoSQL data stores are available; the following are the most popular:
- Cassandra: Column-oriented database that provides scalability and high availability without compromising performance.
- MongoDB: Open-source document database.
- CouchDB: Database that uses JSON for documents, JavaScript for MapReduce queries, and regular HTTP for an API.
- HBase: Hadoop’s column-oriented database, a distributed and scalable big data store.
- Neo4j: Open-source, high-performance, enterprise-grade graph database.
Finally, most NoSQL data stores have enterprise-class distributions from companies such as DataStax and MongoDB. So why not give NoSQL a shot for your next data adventure?