Although NoSQL databases have been gaining popularity over the years, the idea behind them isn’t really new. What’s changed is the availability of new solutions and their improved reliability and performance, leading to expanded use from a niche audience to a broader one.
Nowadays, especially thanks to this broad range of tools for developing robust database systems, we shouldn’t stubbornly stick to our preferred solution simply for subjective reasons, but instead make use of the best tools available for a specific task or project requirement. Sometimes, your best bet is a combination of relational and NoSQL databases because one often complements the other.
The Difference Between the Two
Relational databases require you to structure a database into tables and then each table into columns according to data types. The relational part comes in with defining certain columns in a table as foreign keys of another table. That way you create links between entities. This is a great approach, but it doesn’t allow you to restructure your data on the fly, to remove some columns without losing data, or to add columns without migrating all the previous entries to a new schema.
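To make the foreign key idea concrete, here is a minimal sketch using SQLite through Python’s standard sqlite3 module. The customers and orders tables and their columns are assumptions made up for illustration; they don’t come from any particular system.

```python
import sqlite3

# In-memory SQLite database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE orders (
           id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers(id),
           total REAL NOT NULL
       )"""
)

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 19.99)")

# The foreign key links the two entities, so a join can follow it.
rows = conn.execute(
    "SELECT c.name, o.total FROM orders o JOIN customers c ON c.id = o.customer_id"
).fetchall()

# An order pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders (customer_id, total) VALUES (999, 5.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Changing the structure is an explicit schema change applied to the whole table;
# rows that existed before the change simply have no value for the new column.
conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")
emails = conn.execute("SELECT email FROM customers").fetchall()
```

The database itself guards the link between the two tables, which is exactly the kind of guarantee you give up when structure is left to the application.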
NoSQL databases, however, don’t require you to stick to any predefined structure. Just enter data in a form and it’s stored. This suits a lot of use cases, some of which we’ll go over later, but it also presents a potential problem. The freedom to add any type of data to the database means that some developers don’t actually design the database. Instead, they “add stuff” and worry about that stuff later. Except later, when they need to review or analyze data from their databases, they end up with problems. For example, they’ve renamed a field five times and now they have to check each entry for five fields. An additional downside is that NoSQL databases often require more storage than relational databases, and you need to manage all the relations between records yourself.
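The renamed-field problem is easy to picture in code. The sketch below uses plain Python dictionaries as stand-ins for stored documents; the field names and the get_phone helper are hypothetical, chosen only to show what reading such data looks like.

```python
# Hypothetical documents written at different times; the same value was
# stored under a different field name after each rename.
documents = [
    {"_id": 1, "phone": "555-0100"},
    {"_id": 2, "phone_number": "555-0101"},
    {"_id": 3, "phoneNumber": "555-0102"},
    {"_id": 4, "contact_phone": "555-0103"},
    {"_id": 5, "tel": "555-0104"},
    {"_id": 6, "name": "no phone recorded"},
]

# Every name the field has ever had, newest first.
PHONE_FIELD_HISTORY = ["tel", "contact_phone", "phoneNumber", "phone_number", "phone"]


def get_phone(doc):
    """Return the phone value regardless of which historical name holds it."""
    for field in PHONE_FIELD_HISTORY:
        if field in doc:
            return doc[field]
    return None  # the document never had the field at all


phones = [get_phone(doc) for doc in documents]
```

Every reader of the data has to carry this list of historical names around, which is the cost of skipping the design step.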
Hybrid Use Cases
There are certain scenarios where it would be wise to either add NoSQL databases to an existing relational database or vice versa. The reasons vary from just wanting to cache the reads to enhancing your queries to scaling your data more easily across servers. Here are three use cases for a hybrid setup.
Use Case: Document Database
ERP solutions have historically been a stronghold for relational databases, but they lack the flexibility to let users customize entry forms without updating the database schema. By adding a NoSQL document database to the system, users can create and edit forms quickly, as needed. The data will be stored as documents, and it will be future-proofed against any later changes to form parameters. Some relational database vendors have recognized the need for such a blended solution and implemented something similar to a document database inside their relational database. Microsoft SQL Server 2016, for example, offers support for storing JSON documents inside cells, which eases the workflow but also makes updating that data more complicated than updating data in a normal table.
Document databases store everything in the form of a “document,” usually a JSON object. Because they don’t require any structure, you can add different fields to each JSON object, while keeping in mind that it’s up to you to make sense of that data when retrieving it. Popular document databases are MongoDB and Couchbase.
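As a rough illustration of the blended approach, the sketch below stores JSON documents in a text column of an ordinary relational table. It uses SQLite and Python’s json module rather than SQL Server or a real document database, and the form_entries table, its columns, and the sample fields are all assumptions made for the example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE form_entries ("
    "id INTEGER PRIMARY KEY, form_name TEXT NOT NULL, payload TEXT NOT NULL)"
)

# Two entries for the same form; the second has a field the first lacks.
entries = [
    ("supplier", {"company": "Acme", "vat_id": "X123"}),
    ("supplier", {"company": "Globex", "vat_id": "Y456", "preferred_carrier": "DHL"}),
]
for form_name, payload in entries:
    conn.execute(
        "INSERT INTO form_entries (form_name, payload) VALUES (?, ?)",
        (form_name, json.dumps(payload)),
    )


def load_entries(form_name):
    """Return the stored documents for one form as Python dictionaries."""
    rows = conn.execute(
        "SELECT payload FROM form_entries WHERE form_name = ? ORDER BY id",
        (form_name,),
    )
    return [json.loads(payload) for (payload,) in rows]


suppliers = load_entries("supplier")

# The reader has to cope with fields that are missing from older documents.
carriers = [doc.get("preferred_carrier") for doc in suppliers]

# Updating one field means reading, changing, and rewriting the whole document.
first = suppliers[0]
first["vat_id"] = "X999"
conn.execute("UPDATE form_entries SET payload = ? WHERE id = 1", (json.dumps(first),))
updated = load_entries("supplier")[0]
```

No schema change was needed to add preferred_carrier, but the update at the end shows why changing data inside a stored document takes more work than updating a normal column.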
Use Case: In-Memory Database and Graph Database
The success of e-commerce sites relies heavily on their ability to recommend something that might interest you in particular. How do they do this? They analyze your previous purchases and track the items you’ve viewed but didn’t purchase. They’ll do the same for your friends and for other users in your region, and they’ll then correlate all this data with what’s trendy. The challenge is that this data analysis must happen quickly for each page load and each user, an impossible feat if you need to query your relational database and join together multiple tables in order to get results.
A solution could be to add an in-memory database in front of your relational database to cache all the data needed to perform queries in memory, instead of going to the disk each time. An even better solution would be to also add a graph database to keep track of each user’s relationships: their preferences, who their friends are, and those friends’ preferences.
In-memory databases mostly run in your RAM, but some of them have the ability to persist data to the hard drive and offer replication support, snapshots and transaction logging. Memcached and Redis are the best-known in-memory databases. Graph databases store their data in graph structures, and they’re optimized for fast querying and lookups. That’s done by adding a pointer on each entry to its “connected” entries. Check out Neo4j and InfiniteGraph.
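To see why following pointers between connected entries is fast, consider this toy model of the friends-and-preferences case. It uses Python dictionaries and sets in place of a real graph database, and the users, items, and recommend function are invented for the example.

```python
# Each user points directly at connected entries: friends and purchased items.
friends = {
    "ana": {"ben", "cy"},
    "ben": {"ana"},
    "cy": {"ana"},
}
purchased = {
    "ana": {"tent"},
    "ben": {"tent", "stove"},
    "cy": {"stove", "lantern"},
}


def recommend(user):
    """Rank items bought by the user's friends that the user doesn't own yet."""
    owned = purchased.get(user, set())
    counts = {}
    for friend in friends.get(user, ()):
        for item in purchased.get(friend, ()):
            if item not in owned:
                counts[item] = counts.get(item, 0) + 1
    # Most popular among friends first; ties broken alphabetically.
    return sorted(counts, key=lambda item: (-counts[item], item))


suggestions = recommend("ana")
```

The lookup only touches the entries reachable from one user, so no table-wide join is involved. A graph database applies the same principle to far larger data sets.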
Use Case: Fraud Detection
Whether you’re running an online shop or a brick-and-mortar retail store, it’s important to continuously be on the lookout for fraud attempts. To do that you need to rapidly log a lot of data from various parts of your system. Naturally, because the data is coming from many different places (think of your web servers, your file servers, or credit card payment gateways) and it’s not structured in the same way for each, it would be very difficult to design a relational database for this purpose. Also, there’s a chance that over time you’ll start or stop logging some parameters somewhere in the system, and you need a database that can handle that. Column databases were designed with this purpose in mind, and they provide you with fast writes, but you need to take care while designing one to make sure it fits your needs.
Column databases are designed for read and write performance, large volumes of data, and high availability. They are intended to run on clusters of servers, so if your data is small enough to fit on a single server, you should consider using another type of database. Bigtable and Cassandra are the most popular column databases.
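The logging scenario can be pictured with a toy model in which each logged row carries its own set of columns. This is only a sketch of the data shape, written with Python’s standard library; it is not how Bigtable or Cassandra are implemented, and the source names, column names, and helper functions are assumptions.

```python
from collections import defaultdict

# Source name -> append-only list of (timestamp, columns) rows.
events = defaultdict(list)


def log_event(source, timestamp, **columns):
    """Append a row; each row may carry a different set of columns."""
    events[source].append((timestamp, columns))


log_event("web", 1, ip="203.0.113.7", path="/checkout")
log_event("payments", 2, card_country="DE", amount=950.0)
# Later on, the web servers start logging an extra parameter.
log_event("web", 3, ip="203.0.113.7", path="/checkout", user_agent="curl")


def column_values(source, column):
    """Collect one column across a source's rows, skipping rows that lack it."""
    return [cols[column] for _, cols in events[source] if column in cols]


web_ips = column_values("web", "ip")
web_agents = column_values("web", "user_agent")
```

Writes are simple appends, and a parameter that appears or disappears over time needs no schema change.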
Scaling the System
Scaling relational databases across multiple machines requires you to shard your database, which isn’t a trivial undertaking, but it’s a well-trodden path. Scaling NoSQL databases is another task entirely. They’ve been designed to scale easily across multiple machines, and sharding data is something they do without a problem. Getting the two to scale together is somewhat of a challenge for DB admins and for sys admins, especially when you need to add search engines and similar “add-ons.” The required process depends heavily on your environment and requirements.
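One common way to decide which machine holds which record is to hash a key and map the result onto the list of servers. The sketch below shows that idea with Python’s hashlib; the server names and the shard_for function are hypothetical, and real systems typically use more elaborate schemes.

```python
import hashlib

SHARDS = ["db-0", "db-1", "db-2"]  # hypothetical server names


def shard_for(key):
    """Map a record key to one shard using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]


# The same key always lands on the same shard, so reads know where to look.
placements = {key: shard_for(key) for key in ("user:1", "user:2", "user:3", "user:42")}
```

Note that with simple modulo hashing like this, changing the number of shards moves most keys to a different server, which is part of why sharding a relational database by hand isn’t a trivial undertaking.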
OrientDB is a multi-model open source NoSQL DBMS that combines the power of graphs with documents, key-value, object-oriented and geospatial models into one scalable, high-performance operational database. Because it’s designed to be a distributed solution, and because it “packs” in it various types of NoSQL databases, you cover all your NoSQL needs with just one solution. The only thing that’s missing in OrientDB is a relational database.
In addition, when discussing the performance of RDBMSs in the Amazon cloud, we covered the great benefit of elasticity. When it comes to infrastructure support for database scalability, you can scale up the size of a single machine to support the growing demand on your database.
Final Note
Even relational database vendors are beginning to see the benefits of NoSQL databases, and they’re trying to find a feasible way to incorporate them into their relational databases so that customers can have everything available out-of-the-box. The main thing to remember is that NoSQL solutions are production-ready and useful in certain situations. That said, relational databases are still relevant, and developers won’t be abandoning them so quickly. So give them both a chance.