What is a NoSQL database and what are they used for? – CloudSavvy IT

A NoSQL database is any type of database that breaks from the traditional relational model used by SQL databases. NoSQL databases like the document-based MongoDB have become more popular in recent years. Why all the hype?

SQL limits: scalability

SQL has been around for a long time, about 45 years. It holds up surprisingly well, and modern SQL implementations are very fast. But as the web has grown, so has the need for databases powerful enough to meet the demand.

The easiest way to scale an SQL database is to run it on a more powerful computer. SQL databases can be replicated to reduce regional load on an individual instance, but splitting a table across servers (often called sharding) is much more difficult for SQL.

Document-based NoSQL databases solve this problem by design. Each document is independent of the other documents in the collection, so collections can be distributed much more easily across multiple servers. Many document databases include built-in tools to shard data across different servers.
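
To make the idea concrete, here is a minimal sketch of hash-based sharding in Python. It is an illustration only, not how MongoDB or any particular database places documents; the shard names and the choice of the username as the shard key are assumptions made for the example.

```python
import hashlib

# Hypothetical shard names; a real cluster would use server addresses.
SHARDS = ["shard-a", "shard-b", "shard-c"]


def shard_for(key: str) -> str:
    """Pick a shard from a stable hash of the document's key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]


def distribute(documents: list) -> dict:
    """Group documents by the shard their username hashes to."""
    placement = {name: [] for name in SHARDS}
    for doc in documents:
        placement[shard_for(doc["username"])].append(doc)
    return placement


users = [{"username": "Jon1996"}, {"username": "Jane"}, {"username": "George"}]
placement = distribute(users)
```

Because each document carries everything needed to place it, no document has to know where any other document lives. That independence is what makes this kind of split simpler than splitting related SQL tables.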

But the scalability issue isn’t much of a problem until you have a lot of data. You can easily run an SQL database with hundreds of thousands of users and have no issues, assuming your structure is solid and your queries are fast.

Both MySQL and MongoDB will likely do the job for your application, so choosing between the two depends on your preferred structure and syntax. Ease of development is important, and you might find that MongoDB’s much newer document model and syntax are easier to use than SQL.

NoSQL vs. SQL structure

Traditional SQL databases are often referred to as relational databases because of the way they are structured. In an SQL database, you will have multiple tables, each containing multiple rows (called records), which themselves have several different columns, or attributes. Tables are linked to one another through keys: a primary key identifies each record in one table, and other tables refer to it, which forms a relationship.

For example, imagine you have a posts table with each record representing a post made by a user. The key here is the username, which can be used to link the posts table to the users table. If you wanted to find the e-mail of the person who made a post, you would need to search for “Jon1996” in the users table and select the “email” field.
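
The lookup described above can be sketched with Python’s built-in sqlite3 module. The table and column names here are assumptions chosen for illustration; only the username and e-mail address come from the example in this article.

```python
import sqlite3

# An in-memory database, so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        username TEXT PRIMARY KEY,
        email    TEXT NOT NULL
    );
    CREATE TABLE posts (
        id       INTEGER PRIMARY KEY,
        title    TEXT NOT NULL,
        username TEXT NOT NULL REFERENCES users(username)
    );
    INSERT INTO users VALUES ('Jon1996', 'jon1996@gmail.com');
    INSERT INTO posts VALUES (1, 'First Post', 'Jon1996');
""")

# Follow the username stored on the post to the matching user record.
row = conn.execute(
    """
    SELECT users.email
    FROM posts
    JOIN users ON users.username = posts.username
    WHERE posts.id = ?
    """,
    (1,),
).fetchone()
email = row[0]
```

The JOIN is the relationship in action: the post itself does not store the e-mail, only the username that points to the record holding it.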

But this data structure may not work for everyone. SQL databases have a rigidly defined schema, which can get in your way if you need to make changes or just prefer to have a different layout. With complex data sets, the relationships between everything can become more complicated than the data itself.

The main type of NoSQL database is a JSON document database, like MongoDB. Instead of storing rows and columns, all data is stored in individual documents. These documents are stored in collections (for example, a “user” document would be stored in an “all users” collection) and do not necessarily have to have the same structure as the other documents in the collection.

For example, a “user” document might look like this:

{
  "username":"Jon1996",
  "email":"jon1996@gmail.com",
  "posts": [
    {"id":1},
    {"id":2},
    {"id":3}
  ]
}

The username and email fields are just key-value pairs, similar to columns in SQL, but the “posts” field contains an array, which a column in a traditional SQL table would not normally hold. Now suppose we have a collection of posts with documents like:

{
  "id":1,
  "title":"First Post",
  "content":"Hello, World!",
  "madeby":"Jon1996"
}

Now when someone visits Jon’s page, your app can fetch the three posts with IDs 1, 2, and 3, which is usually a quick request. Compare that to SQL, where you might need to fetch all posts that match Jon’s user ID. That is still pretty fast, but the MongoDB request is more direct and arguably makes more sense.
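
The difference between the two access patterns can be sketched in plain Python. The dictionaries below only stand in for two collections; this is not MongoDB client code, and the second and third posts are made-up placeholder data.

```python
# Plain dictionaries stand in for the "users" and "posts" collections.
users = {
    "Jon1996": {
        "username": "Jon1996",
        "email": "jon1996@gmail.com",
        "posts": [{"id": 1}, {"id": 2}, {"id": 3}],
    }
}
posts = {
    1: {"id": 1, "title": "First Post", "content": "Hello, World!", "madeby": "Jon1996"},
    2: {"id": 2, "title": "Second Post", "content": "(placeholder)", "madeby": "Jon1996"},
    3: {"id": 3, "title": "Third Post", "content": "(placeholder)", "madeby": "Jon1996"},
}


def posts_for(username: str) -> list:
    """Read the user's document, then fetch each referenced post by ID."""
    user = users[username]
    return [posts[ref["id"]] for ref in user["posts"]]


def posts_by_author(username: str) -> list:
    """The alternative: match every post against the author field."""
    return [post for post in posts.values() if post["madeby"] == username]


direct = posts_for("Jon1996")
matched = posts_by_author("Jon1996")
```

Both functions return the same posts. The first goes straight to known IDs, while the second has to match on the author field, which a real database would normally speed up with an index rather than a literal scan.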

What are NoSQL databases used for?

NoSQL is a broad category and includes many types of databases built with different purposes. Every database is a tool, and your job may require a specific type of tool, or even several different tools.

SQL databases like MySQL, Oracle, and PostgreSQL have been around for decades. They are very stable, have a lot of support, and can generally do the job for most people. If your data is valuable to you and you want an established and consistent solution, stick with SQL.

JSON document databases, like MongoDB and Couchbase, are popular for web applications with changing data models and for storing complex documents. For example, a site like Amazon may often need to modify the data model to store the products on the site, so a document-based database can work well for them.

Document databases are meant to be the generic replacement for SQL and are probably what you think of when you hear “NoSQL”. They are also more intuitive to learn than SQL, since you won’t have to deal with relationships between tables or complex queries.

RethinkDB is a JSON document database designed for real-time applications. In a database like MongoDB, you have to poll for updates every few seconds or implement an additional API to track updates in real time, which quickly gets cumbersome. RethinkDB addresses this issue by automatically pushing updates to WebSocket streams that clients can connect to.

Redis is an extremely powerful key-value database that stores small keys and strings entirely in RAM, which is much faster to read and write than even the fastest SSDs. It is often used alongside other databases as an in-memory cache for small pieces of data that are written and read frequently. For example, a messaging app might want to use Redis to store users’ messages (and even send real-time updates with its Pub/Sub methods). Storing many small messages in this manner can cause performance problems with other types of databases.
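
The caching role can be sketched as follows. A plain Python dictionary stands in for the in-memory key-value store, and the slow lookup function and key format are hypothetical; nothing here is Redis client code.

```python
# A dictionary stands in for the in-memory key-value store.
cache = {}
slow_reads = 0


def load_from_primary(key: str) -> str:
    """Hypothetical slow lookup in the main database."""
    global slow_reads
    slow_reads += 1
    return f"value-for-{key}"


def get(key: str) -> str:
    """Return the cached value, filling the cache on a miss."""
    if key not in cache:
        cache[key] = load_from_primary(key)
    return cache[key]


first = get("user:Jon1996:status")   # miss: reads the main database
second = get("user:Jon1996:status")  # hit: served from memory
```

Only the first call reaches the main database. Every later read of the same key is answered from memory, which is the main reason to put a key-value store in front of a slower database.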

Graph databases are designed to store connections between data. A common use case is social media, where users are connected to each other and interact with other data, such as the posts they have made.

For example, suppose George is friends with two people, Jon and Jane. If any other type of database wanted to figure out George’s connection to Sarah, it would have to query all of Jon’s friends and all of Jane’s friends. But graph databases understand this connection intuitively; for a friends-of-friends query, the popular graph database Neo4j is 60% faster than MySQL. For friends of friends of friends (3 levels deep), Neo4j is 180 times faster.
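
A friends-of-friends lookup is a graph traversal, which can be sketched as a breadth-first search in Python. The adjacency map below is illustrative, and placing Sarah as one of Jon’s friends is an assumption; this shows the shape of the query, not how Neo4j executes it.

```python
from collections import deque
from typing import Optional

# Who is friends with whom. Sarah being Jon's friend is an assumption.
friends = {
    "George": {"Jon", "Jane"},
    "Jon": {"George", "Sarah"},
    "Jane": {"George"},
    "Sarah": {"Jon"},
}


def degrees_apart(start: str, target: str) -> Optional[int]:
    """Breadth-first search: friendship hops from start to target, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if person == target:
            return hops
        for friend in friends.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None


hops_to_sarah = degrees_apart("George", "Sarah")
```

Each extra level of depth is one more step along stored connections. In a relational database the same question typically needs another join per level, which is where the gap at three levels deep comes from.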

Wide-column databases, like Cassandra and HBase, are used to store massive amounts of data. They’re designed for datasets so large that you need multiple computers to store it all, and they can be faster than SQL and other NoSQL databases when spread across multiple nodes.
