Java Login

About javalogin.com

Hello guys,
javalogin.com is for Java and J2EE developers. All examples are simple and easy to understand.

It is developed and maintained by Vaibhav Sharma. The views expressed on this website are his own and do not necessarily reflect the views of his former, current or future employers. I am a professional web developer and work for an IT company as a Senior Consultant. I write primarily about Spring, Hibernate and web services, and I try to present new technologies here.




MongoDB Indexes


Indexes are the typical way to speed up queries in a conventional database system, and MongoDB, as a document-based database system, is no different. MongoDB lets users create unique indexes, which prevent applications from inserting duplicate values into indexed fields, along with a number of other issues that can possibly arise. However, MongoDB cannot build a unique index on the specified index fields if your collection already contains data that would violate the unique index constraint. This article gives insight into indexes in MongoDB for query optimization.


Indexes in MongoDB:


Default Indexes
_id is an ObjectId object, a 12-byte BSON type that guarantees uniqueness within the collection. The ObjectId is generated based on a timestamp, machine ID, process ID, and a process-local incremental counter.
Single Field Indexes
For a single-field index and sort operations, the sort order (i.e. ascending or descending) of the index key does not matter because MongoDB can traverse the index in either direction. The value in the index specification gives the sort direction: 1 indicates ascending order and -1 specifies descending order.


>db.employees.createIndex( { "name" : 1 } )


For example, suppose you want the values of the name field in the employees collection to be unique, so that future inserts cannot create duplicates. Create the index as follows:


> db.employees.createIndex( { "name" : 1 }, { unique: true } )
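With the unique index in place, a second insert with an existing name value is rejected. The documents below are hypothetical; the second insert fails with MongoDB's usual E11000 duplicate key error:


> db.employees.insert( { "name" : "Alice" } )
> db.employees.insert( { "name" : "Alice" } )   // fails: E11000 duplicate key error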


Sparse Indexes
Sparse indexes skip documents that are missing the indexed field instead of indexing them as null, while unique indexes cannot hold duplicate values for a field. Combining the two remedies the problem of several documents that lack the field all being indexed as the same null value:


> db.employees.createIndex( { "name" : 1 }, { unique: true, sparse: true } )
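To see the difference, consider two hypothetical documents that lack the name field entirely. Under a plain unique index the second insert would fail, because both documents would be indexed as null; with unique plus sparse, both succeed, since documents without the field are simply left out of the index:


> db.employees.insert( { "dept" : "sales" } )
> db.employees.insert( { "dept" : "hr" } )   // both succeed: no name field, so neither is indexed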


Compound Field Indexes
The order of fields listed in a compound index has significance. For instance, if a compound index consists of { userid: 1, score: -1 }, the index sorts first by userid and then, within each userid value, sorts by score.


> db.products.createIndex( { "item": 1, "stock": 1 }, { unique: true } )
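Because field order matters, a compound index also serves queries on any prefix of its fields. A sketch using the { userid: 1, score: -1 } example from above (the scores collection name is hypothetical):


> db.scores.createIndex( { "userid" : 1, "score" : -1 } )
> db.scores.find( { "userid" : "u1" } )                           // can use the index (prefix)
> db.scores.find( { "userid" : "u1" } ).sort( { "score" : -1 } )  // index also provides the sort order
> db.scores.find( { "score" : { $gt : 50 } } )                    // no userid: cannot use the index efficiently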


Multikey Indexes
MongoDB uses multikey indexes to index the content of arrays: it creates a separate index entry for every element of the array. You do not need to create a multikey index explicitly; MongoDB creates one automatically whenever an indexed field contains an array.
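For instance, indexing an array-valued field (the tags field here is a hypothetical example) automatically produces a multikey index, and a query for a single element can use it:


> db.products.createIndex( { "tags" : 1 } )
> db.products.insert( { "item" : "pen", "tags" : [ "office", "stationery" ] } )
> db.products.find( { "tags" : "office" } )   // matched via the multikey index entry for "office"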
Text Indexes
Text indexes can be large: they contain one index entry for each unique post-stemmed word in each indexed field, for each document inserted.
Text indexes also impact insertion throughput, because MongoDB must add an index entry for each unique post-stemmed word in each indexed field of each new source document.


> db.reviews.createIndex( { comments: "text" } )
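Once the text index exists, queries use the $text operator with a $search string; the search term here is just an example:


> db.reviews.find( { $text: { $search: "coffee" } } )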


Hashed Indexes
A hashed index lets you query content by the hash of its value: the index stores the output of a hash function computed from the field value, which is designed to be distinct for distinct values. The advantage is speed: a hash lookup takes at most O(1), whereas a normal binary-search-tree lookup takes O(log N), so a hashed index is theoretically quicker for equality queries. The disadvantage is that a hashed index performs range searches extremely slowly compared to a normal index.
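Creating a hashed index uses the string "hashed" in place of a sort direction (the emp_id field is a hypothetical example); equality queries can use it, range queries cannot:


> db.employees.createIndex( { "emp_id" : "hashed" } )
> db.employees.find( { "emp_id" : 42 } )            // equality: can use the hashed index
> db.employees.find( { "emp_id" : { $gt : 42 } } )  // range: cannot use the hashed index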
Using explain
explain is an incredibly handy tool that gives you a lot of information about your queries. You can run it on any query by tacking it onto a cursor. Unlike most cursor methods, explain returns a document rather than the cursor itself:


> db.foo.find().explain()


explain will return information about the indexes used for the query (if any) and stats about timing and the number of documents scanned.
For a very simple query ({}) on a database with no indexes (other than the index on "_id") and 64 documents, the output for explain looks like this:


> db.people.find().explain()
{
    "cursor" : "BasicCursor",
    "indexBounds" : [ ],
    "nscanned" : 64,
    "nscannedObjects" : 64,
    "n" : 64,
    "millis" : 0,
    "allPlans" : [
        {
            "cursor" : "BasicCursor",
            "indexBounds" : [ ]
        }
    ]
}


The important parts of this result are as follows:
"cursor" : "BasicCursor"
This means that the query did not use an index (unsurprisingly, because there was no query criteria). We'll see what this value looks like for an indexed query in a moment.
"nscanned" : 64
This is the number of documents that the database looked through. You want to make sure this is as close to the number returned as possible.
"n" : 64
This is the number of documents returned. We're doing pretty well here, because the number of documents scanned exactly matches the number returned. Of course, given that we're returning the entire collection, it would be difficult to do otherwise.
"millis" : 0
The number of milliseconds it took the database to execute the query. 0 is a good time to shoot for.
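For comparison, here is a sketch of what explain reports for an indexed query on the people collection, assuming an index on name has been created; the cursor type changes to BtreeCursor and the exact counts shown are illustrative:


> db.people.createIndex( { "name" : 1 } )
> db.people.find( { "name" : "Bob" } ).explain()
{
    "cursor" : "BtreeCursor name_1",
    "nscanned" : 1,
    "nscannedObjects" : 1,
    "n" : 1,
    "millis" : 0
}


Here "nscanned" matches "n": the index let the database walk straight to the matching document instead of scanning all 64.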

Taking all these cases into account, I prepared some recommendations on how to improve MongoDB performance if you are weighing the RAM vs. I/O dilemma:

  • From the hardware point of view, increasing either RAM or I/O has a positive effect on the system, but it is difficult to say which will help more; it depends heavily on the MongoDB usage pattern. Still, if you have to choose between a couple more gigabytes of RAM and an SSD, the SSD will be more effective in most cases.
  • Multiple servers. When spreading data across multiple servers, it is better to keep small and medium-sized collections on a single server (not sharded across multiple servers) and shard only the big collections across multiple nodes. This reduces the number of indexes per server and the resource competition between servers.
  • It is better to have nothing except MongoDB on the database server, to avoid additional competition for resources. Verify your index model and make sure it covers the most frequently used queries.
