Storing Images in MongoDB using GridFS

Posted by: Mahesh Sabnis , on 2/21/2018, in Category Node.js
Abstract: MongoDB GridFS is a good specification for storing large files in MongoDB. It makes sure that the file is divided into chunks and stored into a database. This article explains the mechanism of storing and retrieving binary files to and from MongoDB.

To store and retrieve large binary files such as images, videos and audio files, MongoDB uses the GridFS specification.

GridFS is a kind of file system that is used to store files within MongoDB collections.

GridFS - Introduction

How are files stored in GridFS? Instead of storing a file in a single document, GridFS divides the file into parts, a.k.a. chunks.

Each chunk is stored in a separate document. By default, each chunk has a maximum size of 255 KB, except for the last chunk, which is only as large as required. If the file size is less than the chunk size, there will be only one chunk, of the same size as the file.
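The chunk arithmetic can be sketched in a few lines of JavaScript. This is only an illustration of the calculation, not code taken from GridFS itself; the function name chunkLayout and the sample file sizes are hypothetical.

```javascript
// Default GridFS chunk size: 255 KB expressed in bytes.
const DEFAULT_CHUNK_SIZE = 255 * 1024;

// Returns how many chunks a file of `fileSize` bytes occupies
// and how many bytes end up in the final chunk.
function chunkLayout(fileSize, chunkSize = DEFAULT_CHUNK_SIZE) {
    if (fileSize <= 0) {
        return { chunkCount: 0, lastChunkSize: 0 };
    }
    const chunkCount = Math.ceil(fileSize / chunkSize);
    const lastChunkSize = fileSize - (chunkCount - 1) * chunkSize;
    return { chunkCount, lastChunkSize };
}

// Hypothetical sizes, chosen only for illustration.
console.log(chunkLayout(1000000)); // { chunkCount: 4, lastChunkSize: 216640 }
console.log(chunkLayout(100000));  // { chunkCount: 1, lastChunkSize: 100000 }
```

A file of 1,000,000 bytes therefore needs three full chunks plus a smaller fourth one, while a 100,000 byte file fits into a single chunk.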

GridFS uses two collections to store files.

These collections are fs.files and fs.chunks. fs.files stores each file's metadata, and fs.chunks stores the binary chunks.

Each document in fs.files contains the _id and filename fields and represents one stored file. The _id value from the fs.files document is stored in the files_id field of every document in fs.chunks that belongs to that file. This is how the fs.files and fs.chunks collections are linked together.
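The link between the two collections can be imitated in memory with plain JavaScript objects. The following is a rough sketch and not the driver's implementation: the helper names toGridDocuments and reassemble are hypothetical, a string stands in for the ObjectId that MongoDB would generate, and a tiny chunk size of 5 bytes is assumed so the output stays readable. The n field holds the position of each chunk within the file.

```javascript
// Splits a Buffer into GridFS-style documents: one metadata document
// (as in fs.files) and an ordered list of chunk documents (as in fs.chunks).
function toGridDocuments(fileId, filename, data, chunkSize) {
    const fileDoc = { _id: fileId, filename: filename, length: data.length, chunkSize: chunkSize };
    const chunkDocs = [];
    for (let offset = 0, n = 0; offset < data.length; offset += chunkSize, n++) {
        chunkDocs.push({
            files_id: fileId, // points back to fileDoc._id
            n: n,             // position of this chunk in the file
            data: data.subarray(offset, offset + chunkSize)
        });
    }
    return { fileDoc, chunkDocs };
}

// Rebuilds the file by selecting the chunks whose files_id matches
// the metadata document and joining them in order of n.
function reassemble(fileDoc, allChunks) {
    const parts = allChunks
        .filter((chunk) => chunk.files_id === fileDoc._id)
        .sort((a, b) => a.n - b.n)
        .map((chunk) => chunk.data);
    return Buffer.concat(parts, fileDoc.length);
}

// Hypothetical file content and id, for illustration only.
const original = Buffer.from("hello gridfs");
const { fileDoc, chunkDocs } = toGridDocuments("file-1", "hello.txt", original, 5);
console.log(chunkDocs.length);                          // 3
console.log(reassemble(fileDoc, chunkDocs).toString()); // hello gridfs
```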

GridFS can be used to store files larger than 16 MB, which is the maximum size of a single BSON document, and provides the following benefits:

  • GridFS can be used to store as many files as needed.
  • Since these files are stored in collections as chunks, a specific part of the file can be accessed without loading the whole file into memory.

As described above, the real advantage of this approach is that a portion of the file can be read without loading the entire file into memory.
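To see why a partial read is cheap, consider which chunks actually have to be fetched for a given byte range. The sketch below only does the arithmetic; the function name chunksForRange and the requested range are hypothetical, and the default chunk size of 255 KB is assumed.

```javascript
// Returns the chunk numbers (the n field in fs.chunks) that hold
// the bytes from `start` (inclusive) to `end` (exclusive).
function chunksForRange(start, end, chunkSize = 255 * 1024) {
    if (start < 0 || end <= start) {
        return [];
    }
    const first = Math.floor(start / chunkSize);
    const last = Math.floor((end - 1) / chunkSize);
    const numbers = [];
    for (let n = first; n <= last; n++) {
        numbers.push(n);
    }
    return numbers;
}

// Hypothetical request: bytes 300000 up to 600000 of a large file.
console.log(chunksForRange(300000, 600000)); // [ 1, 2 ]
```

Only two chunk documents are needed for this range, no matter how large the whole file is.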

Creating an App using GridFS, MongoDB, Node.js and VSCode

To implement the application described in this article, the development machine must have the following software installed:

1. Node.js, this can be installed from https://www.nodejs.org

2. Visual Studio Code (VSCode) https://code.visualstudio.com

3. MongoDB with MongoDB Compass https://www.mongodb.com

Step 1: Create a folder of the name mongodb_gridfs. Open this folder in the VSCode editor. This folder will become a workspace containing all the code files in it.

Step 2: In this workspace, add two folders named filestoread and filestowrite. The filestoread folder contains the files which will be read and stored in the database. The filestowrite folder is used to store the files read back from the database.

Step 3: Open the terminal window of the VSCode using View > Integrated terminal option or Ctrl+` shortcut key. Run the following command:

npm init -y

This command will add the package.json file to the workspace with some default sections. Since both packages are loaded by the application at run time, modify the package.json file by defining them in the dependencies section:

"dependencies": {
  "gridfs-stream": "^1.1.1",
  "mongoose": "^4.13.6"
}

Then run npm install from the terminal window to download the packages.

The gridfs-stream package is used to stream files easily to and from MongoDB GridFS. The mongoose package is the MongoDB object modeling tool designed to work in an asynchronous environment for performing operations with the MongoDB database.

Step 4: The following figure shows the project folder structure.


Figure 1: The project structure

Add a couple of image, video or audio files to the filestoread folder. These files will be used for performing write and read operations. In the current example, a sample bird.png file is used.

Step 5: Once Node.js, MongoDB and MongoDB Compass are downloaded and installed, open the Node.js command prompt and navigate to the MongoDB installation folder. Run the following command from the MongoDB path to start the MongoDB instance:

mongod --dbpath d:\MongoDBData\data

Note: On my dev machine, the MongoDBData folder is created with a ‘data’ sub-folder in it.

Open the MongoDB Compass and connect to the MongoDB Database as shown in Figure 2.


Figure 2: The MongoDB Compass

The MongoDB Compass provides an option to create the Database and a collection in it. Create a database of the name filesDB and a new collection named files in it.

Writing a file using GridFS

Step 6: In the workspace folder, add a new JavaScript file and name it writefile.js. Add the following code to this file:

//1. Load the mongoose driver
var mongooseDrv = require("mongoose");
//2. Connect to MongoDB and its database
mongooseDrv.connect('mongodb://localhost/filesDB', { useMongoClient: true });
//3. The Connection Object
var connection = mongooseDrv.connection;
if (typeof connection !== "undefined") {
    //readyState is numeric: 0 = disconnected, 1 = connected,
    //2 = connecting, 3 = disconnecting
    console.log(connection.readyState.toString());
    //4. The Path object
    var path = require("path");
    //5. The gridfs-stream module
    var Grid = require("gridfs-stream");
    //6. The File-System module
    var fs = require("fs");
    //7. Build the path of the video/image file in the filestoread folder
    var filesrc = path.join(__dirname, "./filestoread/bird.png");
    //8. Establish connection between Mongo and GridFS
    Grid.mongo = mongooseDrv.mongo;
    //9. Once the connection is open, write the file
    connection.once("open", () => {
        console.log("Connection Open");
        var gridfs = Grid(connection.db);
        if (gridfs) {
            //9a. create a stream, this will be
            //used to store file in database
            var streamwrite = gridfs.createWriteStream({
                //the file will be stored with the name
                filename: "bird.png"
            });
            //9b. create a readstream to read the file
            //from the filestoread folder
            //and pipe into the database
            fs.createReadStream(filesrc).pipe(streamwrite);
            //9c. The close event fires once the write has completed
            streamwrite.on("close", function (file) {
                console.log("File written successfully in database");
            });
        } else {
            console.log("Sorry No Grid FS Object");
        }
    });
} else {
    console.log('Sorry not connected');
}
//This line runs immediately, before the asynchronous write has completed
console.log("done");

The above code has the following specifications. (Note: The numbers below match the numbered comments in the above code.)

1. Load the mongoose driver. This will be used for connecting with the MongoDB database.

2. Connect to the MongoDB filesDB database using the connect() function of the mongoose driver.

3. Get the connection object. If the connection object is not undefined, then steps 4-9 will be executed.

4. Load the path module. This will be used for building the full path of the file in the workspace folder.

5. Load the gridfs-stream module. This module will be used to easily stream files to and from MongoDB GridFS.

6. Load the fs module. This is the file system module. This will be used for creating a read stream for the file located using the path module.

7. Build the full path of the binary file (video/image) in the workspace folder using the path module.

8. Using Grid.mongo = mongooseDrv.mongo; the connection between the MongoDB and GridFS will be established.

9. This step registers a callback that runs once the connection with MongoDB is open, so that a file can be written into it. The line var gridfs = Grid(connection.db); accesses the GridFS object over a MongoDB connection.

a. If the GridFS object is available, create the GridFS write stream so that the file can be written to the database.

b. In this step, the file from the filestoread folder is passed as a parameter to the createReadStream() function of the fs module. The pipe() function accepts the write-stream created using the gridfs object. This stream is created for the image file.

c. The close event of the streamwrite object is fired once the file has been completely written to MongoDB GridFS.
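As an alternative to gridfs-stream, the same write can be sketched with the GridFSBucket class of the official mongodb driver. Treat this as a sketch under assumptions rather than part of the application above: it assumes the mongodb package is available to require, it assumes Node.js 15 or later for the stream/promises module, and uploadFile is a hypothetical helper name. The pipeline() call resolves only after the upload stream has finished, so the success message is printed after the write and not before it.

```javascript
const fs = require("fs");
const { pipeline } = require("stream/promises");

// Streams a local file into GridFS and resolves once the upload stream
// has finished. `bucket` is assumed to be a GridFSBucket created from
// the official "mongodb" driver.
async function uploadFile(bucket, localPath, storedName) {
    const source = fs.createReadStream(localPath);
    const target = bucket.openUploadStream(storedName);
    await pipeline(source, target);
    return storedName;
}

// Possible usage (assumes the "mongodb" package is installed
// and a mongod instance is running locally):
//
// const { MongoClient, GridFSBucket } = require("mongodb");
// const client = new MongoClient("mongodb://localhost:27017");
// client.connect()
//     .then(() => uploadFile(new GridFSBucket(client.db("filesDB")),
//                            "./filestoread/bird.png", "bird.png"))
//     .then((name) => console.log(name + " written to GridFS"))
//     .finally(() => client.close());
```

Passing the bucket in as a parameter keeps the helper independent of how the connection is created.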

To run the application, run the following command from the terminal window

node writefile

The following result will be displayed:


Figure 3: Command to run the writefile code

Now revisit the MongoDB Compass and the data in filesDB will be displayed as shown in Figure 4.


Figure 4: The fs.chunks and the fs.files collections in the database containing image documents.

The fs.chunks and fs.files collections show one document each. Documents in these collections can be displayed by clicking on each collection.

The file metadata can be viewed in the fs.files collection as shown in Figure 5.


Figure 5: The Structure of the image file metadata

Likewise, the file chunks can be seen in the fs.chunks collection as shown in Figure 6.


Figure 6: The Structure of the image chunk

Reading the file from MongoDB GridFS

To read the file, use the createReadStream() function of the gridfs object. Add a new JavaScript file to the workspace (read.js, to match the command used below) with the following code for reading the file from MongoDB GridFS:

//Load the mongoose driver and connect to the filesDB database
var mongooseDrv = require("mongoose");
mongooseDrv.connect('mongodb://localhost/filesDB', { useMongoClient: true });
var connection = mongooseDrv.connection;

if (typeof connection !== "undefined") {
    console.log(connection.readyState.toString());
    var path = require("path");
    var Grid = require("gridfs-stream");
    var fs = require("fs");
    //Establish connection between Mongo and GridFS
    Grid.mongo = mongooseDrv.mongo;
    connection.once("open", () => {
        console.log("Connection Open");
        var gridfs = Grid(connection.db);
        if (gridfs) {
            //write stream for the target file in the filestowrite folder
            var fsstreamwrite = fs.createWriteStream(
                path.join(__dirname, "./filestowrite/bird.png")
            );
            //read stream for the file stored in GridFS
            var readstream = gridfs.createReadStream({
                filename: "bird.png"
            });
            //pipe the data from the database into the target file
            readstream.pipe(fsstreamwrite);
            readstream.on("close", function (file) {
                console.log("File Read successfully from database");
            });
        } else {
            console.log("Sorry No Grid FS Object");
        }
    });
} else {
    console.log('Sorry not connected');
}
//This line runs immediately, before the asynchronous read has completed
console.log("done");

Run the following command from the terminal command prompt

node read 

This will read the file from MongoDB GridFS and write it to the filestowrite folder as shown in Figure 7.


Figure 7: Image file is written in the filestowrite folder
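One way to confirm that the file written to the filestowrite folder is identical to the original is to compare checksums of the two files. The following is a small sketch using only Node.js core modules; the helper names fileDigest and sameContent are hypothetical. It reads each file fully into memory, which is fine for a small image but is not suited to very large files.

```javascript
const crypto = require("crypto");
const fs = require("fs");

// Returns the SHA-256 digest of a file as a hex string.
function fileDigest(filePath) {
    return crypto.createHash("sha256").update(fs.readFileSync(filePath)).digest("hex");
}

// True when both files contain exactly the same bytes.
function sameContent(pathA, pathB) {
    return fileDigest(pathA) === fileDigest(pathB);
}

// Possible usage with the folders from this article:
// console.log(sameContent("./filestoread/bird.png", "./filestowrite/bird.png"));
```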

 

Conclusion

MongoDB GridFS is a good specification for storing large files in MongoDB. It makes sure that the file is divided into chunks and stored in a database. The real advantage of this approach is that a portion of the file can be read without loading the entire file into memory.

This article was technically reviewed by Ravi Kiran.

Author
Mahesh Sabnis is a DotNetCurry author and Microsoft MVP having over 17 years of experience in IT education and development. He is a Microsoft Certified Trainer (MCT) since 2005 and has conducted various Corporate Training programs for .NET Technologies (all versions). Follow him on twitter @maheshdotnet

