It's no secret that we are more dependent on technology than ever. Smartphones aside, we have a handful of devices -- laptops, wearables, smart home appliances, even the thermostat (that's you, Nest) -- that are connected to the internet. And when I say connected to the internet, I mean they are all connected to the same service (say, a Google account). Now, think of a problem where we need to store users' activities, the kinds of things they like, what they searched for in the previous 24 hours, or 48 hours, or even last month. How can we store such versatile data? It sure is overwhelming, and I can assure you SQL (a relational database) is not the answer here.

Typical SQL databases work like this -- they need a fixed schema, say one with columns for user device id and activity name. But where does the ever-changing timestamp go? What happens when we see a new type of user data altogether? How will we adapt to that with a traditional relational DB?

Well, the answer is we can't really. We can try to limit the application's behavior to fit our schema, but is that how current applications work, or rather should work? Not really. We need a better solution than a relational database -- a schema-free database, a NoSQL database.

A NoSQL database is a schema-free database which, unlike traditional databases, stores data entries in documents instead of tables. So there are no tables to work with, no lengthy joins, no rows or columns; instead we have documents, each responsible for holding all of the data for a particular unique id. Also, NoSQL means Not-Only-SQL, so expect behavior similar to SQL but better in terms of performance, scalability and more.
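To make that concrete, here is a minimal sketch of what one such document could look like for the user-activity scenario above (all field names and values here are made up for illustration),

{
    deviceId: 'nest-thermostat-01',                 // hypothetical device identifier
    activity: 'temperature-adjusted',               // name of the recorded activity
    searches: ['energy saving tips', 'away mode'],  // arrays fit naturally inside a document
    timestamp: new Date('2017-03-01T10:15:00Z')     // and so do dates
}

A second document in the same collection could carry completely different fields, which is exactly the flexibility a fixed schema can't give us.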

There are a few good NoSQL databases available today, like DocumentDB from Microsoft, DynamoDB from Amazon (AWS) and MongoDB from, well, MongoDB Inc. Out of the three (and others not listed), MongoDB has been making strides lately thanks to growing adoption from companies like Facebook (Parse), Expedia, Bosch and more. I am sure the answer to why they are adopting NoSQL won't be surprising to anyone. MongoDB makes complex data structures easier to store and query, and with Node.js, it flies. Let's see how to get started with MongoDB and how to perform different operations from Node.js.

0. Installing MongoDB

When working with any database, there are two steps involved: 1. get the database installed on the system, and 2. connect to the local database using a driver and perform operations on it.

Let's first see how to install MongoDB on our desired system. We will safely skip the macOS installation since macOS isn't typically used on servers.

MongoDB Inc. offers two editions of MongoDB -- Enterprise and Community. Read about what the Enterprise edition offers here and what the Community edition offers here, and decide which one to install before proceeding. We will install the Community edition, so make changes if you choose to go with the other one. Also, keep in mind we will be going with the complete MongoDB package (mongodb-org), so if you need any other, specify that package name in the commands below.

On an Ubuntu system, we need to run the following 4 commands to get MongoDB 3.4 installed.

--  Importing public key

     Ubuntu's APT package manager needs the GPG key that the vendor used to sign its packages. The following command imports MongoDB's GPG key,

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 0C49F3730359A14518585931BC711F9BA15703C6

-- Creating source list

    We need to create a source list for the MongoDB repository before we can ask APT to install it. We will create a list file named mongodb-org-3.4.list to hold the repository address for the APT package manager to access. On Ubuntu 16.04 the command to do that is,

echo "deb [ arch=amd64,arm64 ] http://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/3.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.4.list

On any other version, just replace xenial with your release's code name, like trusty for Ubuntu 14.04.

-- Updating Package Manager Local Database

    After creating the list, we need our package manager to know about the new repository source. We do that by updating it using,

sudo apt-get update

-- Doing apt-get install

The following command installs the default package mongodb-org, which contains all 4 sub-packages: mongodb-org-server, mongodb-org-mongos, mongodb-org-shell and mongodb-org-tools.

sudo apt-get install mongodb-org -y

Here, -y represents our approval of any prompt that comes up during installation. Okay, we have our database installed; now all we need to do is set the data storage path and start the daemon service. Doing this will start our MongoDB daemon,

mongod --dbpath /path/data

The daemon will start listening on port 27017, with the value of --dbpath as its data storage directory.
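If you want to verify that the daemon is actually up, one quick optional check is to ping it from the MongoDB shell in another terminal; a reply containing "ok" : 1 means the server is reachable,

mongo --eval "db.runCommand({ ping: 1 })"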

For any other version of Linux, go here for complete instructions, and for Windows, go here.

1. Installing MongoDB driver for Node.js

Once done with the local installation, we need our Node.js application to connect to and access the databases. We will assume you have at least a good understanding of the basics of Node.js. Like with any other database system, we need to have the MongoDB driver installed and configured in Node.js to access the databases. Lucky for us, MongoDB offers drivers for a handful of languages, and Node.js is one of them.

In order to install the MongoDB driver, we will need NPM (Node Package Manager). If you're running Node.js version v0.6.3 or above, you already have NPM installed. On versions before that, you need to run the following command (for Ubuntu),

sudo apt-get install npm

On Windows, there is no extra command needed since the Node.js installer includes NPM by default.

Okay, now that we have our package manager installed, we can install the MongoDB driver by running the following command,

sudo npm i mongodb --save

On Windows, just remove sudo from the command above; the rest remains the same on both. We need sudo on Ubuntu systems to give the NPM installer the access it requires (write access, for example). On a Windows system, everything should run just fine, or just run the command prompt as administrator.

The rest of the command is the actual NPM instruction, where we are directing NPM to install (i is short for install) the required mongodb module for us. The last part of the command, --save, tells NPM to add the installed module's name to our project's package.json file for future re-use. You can see the module installed in the node_modules folder.
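After the install finishes, package.json should contain an entry along these lines (the version shown is only a placeholder -- the exact number depends on what NPM resolves at install time),

"dependencies": {
    "mongodb": "^2.2.0"
}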

2. Creating a Database for Use

Before connecting our Node.js application to MongoDB, let's first create a database manually, say MusicDB, using the MongoDB shell. To open the shell, just run,

mongo

on Ubuntu. Now, to create our MusicDB database, just type,

use MusicDB

and hit enter. MongoDB handles both switching the default database and creating a new one with use. If it fails to find an existing database with the name provided, it simply creates a new database of that name and sets it as the default database for use. However, you must understand that a database won't show up in the shell until it has some data in it. You can use

show dbs 

or 

show databases

to list all the databases present on the local disk.

So, now that we have our DB initialized, we need to create a collection, say albums, and insert our album data into it. Think of collections as the NoSQL counterpart of tables -- rows are represented by data entries and columns by entity properties. Let's add the following entry as our album data for Alan Walker's Faded EP,

db.albums.insert({name: 'Faded-EP', artist: 'Alan Walker', releaseDate: '25 Nov 2015', genre: 'Dance'})

Look closely at the code. There are 5 parts that work together to put our data entry into the collection and hence, the database.

First, the db variable. It represents the default database that's currently in use and can be used to manipulate or access data in its collections. Here, we are putting data into the albums collection (the second part), which we access through db. The next part, insert, is a method provided by the collection interface. It simply takes a data entry as a JSON object and stores it in the collection it is called on. That's how the Faded EP album data entry got stored in the albums collection: by passing the different properties of the album as a single JSON object.

The fourth and fifth parts are the JSON properties' names and values. It is important to have a good understanding of the type of data being stored in the collection -- is it a Number, a String, a Date or anything else? -- and to keep the data type consistent for a single attribute, since MongoDB will happily accept multiple data types for the same property. For example, we could pass a Date instance instead of a String for the releaseDate property in a second data entry for some other album and MongoDB would be fine with it. Make sure you check for such inconsistencies, but only if your application needs that consistency.
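Here is a quick illustration of that pitfall from the shell, using a second, made-up album entry -- note how releaseDate is now a Date object rather than a string, and MongoDB accepts it without complaint,

db.albums.insert({name: 'Another-EP', artist: 'Alan Walker', releaseDate: new Date('2016-12-02'), genre: 'Dance'})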

3. Using Node.js driver to connect to Database

In order to connect to our database, MusicDB, we first need to get the MongoClient class, which is provided by the mongodb package. Doing the following will get us our MongoClient,

const MongoClient = require('mongodb').MongoClient

This MongoClient is going to help us connect to MongoDB running on localhost (our local computer) on the default port 27017 and access a database. We will use MongoClient's connect method, which expects two parameters -- the database URL and a callback that receives our DB instance -- to connect to MusicDB. The code below takes in the URL, and the callback receives an error and a database instance. We proceed to access the database only if the error instance is null; otherwise, we throw the error.

MongoClient.connect('mongodb://127.0.0.1:27017/MusicDB', (err, db) => {
    if (err) throw err
    // fetch every document in the albums collection
    db.collection('albums').find({}).toArray((e, albums) => {
        db.close() // close the connection once we have the results
        if (e) throw e
        console.log(albums)
    })
})

You must have noticed something unusual in the URL above -- there's a mongodb protocol in place of the regular HTTP or HTTPS. This is because the MongoDB client handles the connection and the data transfer over TCP/IP itself, so that we don't have to, and it needs its own scheme to describe everything, from creating a connection to converting data into a readable format.

The next part of the URL is the address where the mongod process is running. Here, 127.0.0.1 represents the local installation and 27017 is the default port that the mongod daemon (service) listens on. The path in the URL passed to connect(...) above is MusicDB -- the database we need to access.
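The same pattern extends to remote or authenticated setups. A purely hypothetical example (the username, password and host below are placeholders) would look like,

mongodb://someUser:somePassword@db.example.com:27017/MusicDB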

Once we have connected to our database, we would want to query the albums collection, and to do that we could use,

db.collection('albums').find({})

In the code above, the collection method accepts a collection name and returns the collection interface we need. This interface is responsible for providing us a handful of methods for querying, updating, removing (and more) data in the collection. Here we are querying the collection with the find(...) method, which takes an empty query object and thus returns all of the data entries in the collection as a cursor. We then apply toArray(...) to convert the cursor into an array of entries, albums.
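Filtering works the same way -- pass a query object instead of the empty {}. As a small sketch (reusing the artist value from our earlier insert), this would return only the matching documents,

db.collection('albums').find({ artist: 'Alan Walker' }).toArray((e, albums) => {
    if (e) throw e
    console.log(albums) // only the entries whose artist is 'Alan Walker'
})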

Now, to check the connection, just put all of the Node.js code above together in a file named app.js, and run the following command to execute it with Node.js (no sudo needed this time),

node app.js

If you get back the data entry inserted earlier, then everything went well; otherwise, there is an issue and you need to go back and check where you made the mistake. The most common mistakes involve an incorrect address, port or database name.

One more thing. If you call connect without a callback, the MongoDB driver returns a promise instead. Either way, please make sure you close the database once the job with it is done, to avoid leaking open connections (and the memory that goes with them).
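As a minimal sketch of that promise form (assuming the 2.x driver used throughout this article, where connect resolves to a db instance),

MongoClient.connect('mongodb://127.0.0.1:27017/MusicDB')
    .then((db) => {
        // query first, then always close the connection when we're done
        return db.collection('albums').find({}).toArray()
            .then((albums) => {
                console.log(albums)
                return db.close()
            })
    })
    .catch((err) => console.error(err))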

4. What's next?

Once you're done with the connection, there are a lot of things to do with Node.js and MongoDB together: create APIs for CRUD operations, save data with MongoDB for Big Data operations (with its MapReduce framework), create a socket with Node.js and store user chat data in MongoDB, save users' hashed passwords, generate access tokens to verify API access, or build a pipeline (a series of queries) with its aggregation framework for complex situations.

For Node.js, we also have the excellent Mongoose client (from Automattic) that offers schema support among many other benefits. Mongoose takes care of collection structure for us by creating separate collection-backed models from a Schema. You can read more about Mongoose here.
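For a taste of what that looks like, here is a minimal, hypothetical Mongoose schema for our albums data (we will walk through Mongoose properly in its own post),

const mongoose = require('mongoose')

// a schema fixes the shape and types of documents in the collection
const albumSchema = new mongoose.Schema({
    name: String,
    artist: String,
    releaseDate: Date,   // Mongoose enforces the type for us
    genre: String
})

// the model gives us a collection-backed interface for inserts and queries
const Album = mongoose.model('Album', albumSchema)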

We will cover Mongoose in detail in some other post. We will also be covering different MongoDB operations, performance tweaks and MapReduce feature for Big Data over the coming weeks. More specifically, the next article in the series will be on different ways to query (find) a MongoDB collection.

If you have any questions or issues, please be sure to drop a comment below. You can choose to stay connected with us by subscribing to our weekly newsletter. Also, if you liked this article, you can spread the love by sharing it with others.