GridFS is a MongoDB specification for storing files larger than 16 MB, such as audio files, large images, or video files.
It behaves like a file system, but the data is still stored in MongoDB collections.
The GridFS storage specification is mainly used for files that exceed the 16 MB BSON document size limit. GridFS is a simple file system abstraction on top of MongoDB. If you're familiar with Amazon S3, GridFS is a similar abstraction. So why does a document-oriented database like MongoDB provide a file layer abstraction? It turns out there are some very good reasons:
Storing user-generated file content
A large number of web applications allow users to upload files. Historically, when working with relational databases, these user-generated files were
stored on the file system, separate from the database. This creates a number of problems. How do you replicate the files to all of the servers that need them? How
do you delete every copy when a file is deleted? How do you back up the files for safety and disaster recovery?
GridFS solves these problems by storing the files in the database itself, so your database backup also backs up your files. Because of MongoDB replication, a copy of each file is stored on every member of the replica set. Deleting a file is as easy as deleting an object in the database.
Accessing portions of file content
When a file is uploaded to GridFS, it is split into chunks (255 kB each by default) and each chunk is stored as a separate document. So, when you need to read only a certain range of bytes, only the chunks covering that range are brought into memory, not the whole file. This is extremely useful when dealing with large media content that needs to be read selectively.
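To make the range-read idea concrete, here is a minimal sketch of the arithmetic involved. It assumes the default chunk size of 255 kB; the function name `chunks_for_range` is hypothetical and is not part of any MongoDB driver.

```python
DEFAULT_CHUNK_SIZE = 255 * 1024  # assumed default GridFS chunk size, in bytes


def chunks_for_range(start, end, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return the chunk numbers that cover the byte range [start, end)."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    first = start // chunk_size      # chunk holding the first requested byte
    last = (end - 1) // chunk_size   # chunk holding the last requested byte
    return range(first, last + 1)


# Reading 1 MiB that starts 10 MiB into a file touches only five chunks.
print(list(chunks_for_range(10 * 1024 * 1024, 11 * 1024 * 1024)))
# [40, 41, 42, 43, 44]
```

A driver only has to fetch those chunk documents, however large the whole file is.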
Storing documents greater than 16MB in MongoDB
MongoDB caps the size of a single BSON document at 16 MB. So, if you have data larger than 16 MB, you can store it using GridFS.
Overcoming file system limitations
If you're storing a large number of files, you'll need to consider file system limitations such as the maximum number of files per directory. With
GridFS, you don't need to worry about these file system limits. Also, with GridFS and MongoDB sharding, you can distribute your files across different
servers without significantly increasing the operational complexity.
GridFS uses two collections to store large data: one holds the file chunks and the other holds the file metadata. By default these collections are named fs.chunks and fs.files.
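The following sketch shows roughly what ends up in each collection for one file. It is a pure-Python simulation and does not talk to a server. The field names follow the GridFS specification, but the helper `to_gridfs_documents` is hypothetical, a string stands in for the ObjectId a real driver would generate, and the `_id` of each chunk document is omitted.

```python
import datetime
import uuid

CHUNK_SIZE = 255 * 1024  # assumed default GridFS chunk size, in bytes


def to_gridfs_documents(data, filename, chunk_size=CHUNK_SIZE):
    """Build one fs.files-style document and its fs.chunks-style documents."""
    file_id = uuid.uuid4().hex  # stand-in for a driver-generated ObjectId
    files_doc = {
        "_id": file_id,
        "length": len(data),       # total size in bytes
        "chunkSize": chunk_size,   # size of every chunk except possibly the last
        "uploadDate": datetime.datetime.now(datetime.timezone.utc),
        "filename": filename,
    }
    chunk_docs = [
        # files_id points back at the metadata document; n is the chunk order.
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


files_doc, chunk_docs = to_gridfs_documents(b"x" * 600_000, "demo.bin")
print(files_doc["length"], [c["n"] for c in chunk_docs])
# 600000 [0, 1, 2]
```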
When we retrieve a large file from the MongoDB server, the MongoDB driver reassembles the data from its chunks and returns it to the calling function.
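Conceptually, reassembly amounts to sorting a file's chunks by their sequence number and concatenating them. The sketch below illustrates that idea on plain dictionaries. The function `reassemble` is hypothetical, and real drivers stream chunks from the server rather than holding them all in a list.

```python
def reassemble(files_doc, chunk_docs):
    """Join the chunks belonging to files_doc back into the original bytes."""
    own = [c for c in chunk_docs if c["files_id"] == files_doc["_id"]]
    ordered = sorted(own, key=lambda c: c["n"])
    # Chunk numbers must run 0, 1, 2, ... with no gaps.
    if [c["n"] for c in ordered] != list(range(len(ordered))):
        raise ValueError("missing or duplicate chunk")
    data = b"".join(c["data"] for c in ordered)
    if len(data) != files_doc["length"]:
        raise ValueError("reassembled size does not match the metadata")
    return data


files_doc = {"_id": "demo", "length": 11}
chunk_docs = [
    {"files_id": "demo", "n": 1, "data": b" world"},  # stored out of order
    {"files_id": "demo", "n": 0, "data": b"hello"},
]
print(reassemble(files_doc, chunk_docs))
# b'hello world'
```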
The syntax for storing a file using GridFS is:
To find a stored file in the database, use the find() method. The syntax is as follows: