MongoDB Indexes: Performance Optimizer for MongoDB
A database index is a way to make certain queries faster at the expense of making updates somewhat slower. For example, suppose you have a Product model that has 10,000 documents. Without an index, MongoDB needs to scan all 10,000 documents every time you want to find a single product.
The term collection scan means a query where MongoDB needs to iterate through the entire collection. Whether a collection scan is a problem depends on the size of the collection and your use case. For example, if you create 100,000 product documents, a collection scan can take 40-45 milliseconds or more, depending on your hardware and how much load there is on the database.
Depending on your app, this may not be a problem. But if querying (searching) is a core feature of your app, you need to avoid collection scans.
Rule of Thumb: If your end users expect query results nearly instantaneously (within about 40-45 ms), your query should avoid performing a collection scan on models with more than 100,000 documents. If you have fewer than 10,000 documents, collection scans should not be a problem.
So how do you avoid a collection scan? The answer is to build an index on the most frequently searched properties, such as the product name or product category.
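The difference between a collection scan and an index lookup can be sketched in plain JavaScript, with no database involved. This is only a rough model: MongoDB indexes are B-trees rather than hash maps, and the names `products`, `findByScan`, and `findByIndex` are hypothetical, chosen for illustration.

```javascript
// Plain-JavaScript model of a collection scan versus an index lookup.
// `products`, `findByScan`, and `findByIndex` are hypothetical names.
const products = [];
for (let i = 0; i < 100000; ++i) {
  products.push({ _id: i, productName: `Product ${i}` });
}

// Collection scan: check each document in turn until one matches.
function findByScan(docs, productName) {
  let examined = 0;
  for (const doc of docs) {
    ++examined;
    if (doc.productName === productName) {
      return { doc, examined };
    }
  }
  return { doc: null, examined };
}

// "Index": a lookup structure keyed by the indexed property. Keeping it
// up to date is the extra cost that every write has to pay.
const productNameIndex = new Map();
for (const doc of products) {
  productNameIndex.set(doc.productName, doc);
}

function findByIndex(index, productName) {
  const doc = index.get(productName) ?? null;
  return { doc, examined: doc ? 1 : 0 };
}

const scanResult = findByScan(products, 'Product 99999');
const indexResult = findByIndex(productNameIndex, 'Product 99999');
console.log(scanResult.examined);  // 100000
console.log(indexResult.examined); // 1
```

Both lookups return the same document. The scan had to examine every document to find it, while the index went straight to it.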
Defining Indexes
Most apps define indexes in their Mongoose schemas. You can define indexes in your schema definition using index: true:
const { Schema } = require('mongoose');

// `schema` has 2 indexes: one on `productName`, and one on `productCategory`.
const schema = new Schema({
  productName: { type: String, index: true },
  productCategory: { type: String, index: true }
});
You can also define an index by using the Schema#index() function:
const schema = new Schema({ productName: String, productCategory: String });
// Add 2 separate indexes to `schema`
schema.index({ productName: 1 });
schema.index({ productCategory: 1 });
Building Indexes
You can define indexes in your schema, but indexes live on the MongoDB server. In order to actually use an index, you need to build the index on the MongoDB server.
Mongoose automatically builds all indexes defined in your schema when you create a model:
const schema = new Schema({ productName: String, productCategory: String });
schema.index({ productName: 1 });
schema.index({ productCategory: 1 });
// If the indexes already exist, Mongoose doesn't do anything.
const Model = mongoose.model('Test', schema);
When you create a new model, Mongoose automatically calls that model’s createIndexes() function. You can disable automatic index builds using the autoIndex schema option:
const opts = { autoIndex: false }; // Disable auto index build
const schema = Schema({ productName: { type: String, index: true } }, opts);
const Model = mongoose.model('Test', schema);
await Model.init();

// Does **not** have the index on `productName`
const indexes = await Model.listIndexes();
You can also call createIndexes() yourself to build all the schema's indexes.
const opts = { autoIndex: false };
const schema = Schema({ productName: { type: String, index: true } }, opts);
const Model = mongoose.model('Test', schema);
await Model.init();

let indexes = await Model.listIndexes();
indexes.length; // 1

await Model.createIndexes();
indexes = await Model.listIndexes();
indexes.length; // 2
Once an index is built, it remains on the MongoDB server forever, unless someone explicitly drops the index. Dropping a database or dropping a collection also drops all indexes. That means once your indexes are built, you usually don’t have to worry about them again.
Although Mongoose automatically builds all indexes defined in your schema, Mongoose does not drop any existing indexes that aren’t in your schema. You can use the syncIndexes() function to ensure that the indexes in MongoDB line up with the indexes in your schema:
let schema = Schema({ productName: { type: String, index: true } });
let Model = mongoose.model('Test', schema);
await Model.init();

// Now suppose you change the property `productName` to `productTitle`.
// By default, Mongoose won't drop the `productName` index
schema = Schema({ productTitle: { type: String, index: true } });
mongoose.deleteModel('Test');
Model = mongoose.model('Test', schema);
await Model.init();

// There are now 3 indexes in the database: one on `_id`, one
// on `productTitle`, and one on `productName`.
let indexes = await Model.listIndexes();
indexes.length; // 3
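To make the reconciliation concrete, here is a minimal plain-JavaScript sketch of the comparison that syncIndexes() conceptually performs: drop whatever is on the server but not in the schema (never the `_id` index), and build whatever is in the schema but not on the server. This is a simplified model, not Mongoose's implementation. `diffIndexKeys` is a hypothetical helper, and it compares key patterns only, ignoring index options such as `unique`.

```javascript
// Simplified model of index reconciliation; not Mongoose's implementation.
// `diffIndexKeys` is a hypothetical helper that compares key patterns only.
function diffIndexKeys(schemaKeys, serverKeys) {
  // JSON.stringify keeps property order, which matters for compound indexes.
  const serialize = key => JSON.stringify(key);
  const idKey = serialize({ _id: 1 });
  const inSchema = new Set(schemaKeys.map(serialize));
  const onServer = new Set(serverKeys.map(serialize));

  return {
    // On the server but not in the schema. The `_id` index is never dropped.
    toDrop: serverKeys.filter(key =>
      !inSchema.has(serialize(key)) && serialize(key) !== idKey),
    // In the schema but not yet built on the server.
    toCreate: schemaKeys.filter(key => !onServer.has(serialize(key)))
  };
}

// The state after renaming `productName` to `productTitle`.
const schemaKeys = [{ productTitle: 1 }];
const serverKeys = [{ _id: 1 }, { productName: 1 }, { productTitle: 1 }];

const diff = diffIndexKeys(schemaKeys, serverKeys);
console.log(diff.toDrop);   // [ { productName: 1 } ]
console.log(diff.toCreate); // []
```

In the renamed-property scenario above, only the stale `productName` index is flagged for removal.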
Compound Indexes
A compound index is an index on multiple properties. There are two ways to define indexes on multiple properties. First, using Schema#index():
let schema = Schema({ firstName: String, lastName: String });

// Define a compound index on { firstName, lastName }
schema.index({ firstName: 1, lastName: 1 });

const Model = mongoose.model('Test', schema);
await Model.init();

const indexes = await Model.listIndexes();
indexes.length; // 2
indexes[1].key; // { firstName: 1, lastName: 1 }
The alternative is to define 2 SchemaType paths in your schema with the same index name. Mongoose groups indexes with the same name into a single compound index.
let schema = Schema({
  firstName: {
    type: String,
    index: { name: 'firstNameLastName' }
  },
  lastName: {
    type: String,
    index: { name: 'firstNameLastName' }
  }
});

const indexes = schema.indexes();
indexes.length; // 1
indexes[0][0]; // { firstName: 1, lastName: 1 }
Why are compound indexes useful? Indexes aren’t useful unless they’re specific enough. The intuition is that you want to minimize the number of documents that MongoDB needs to look through to answer your query.
The example below shows that indexes don’t just magically make your queries fast. In degenerate cases, like the one below where every document has the same firstName, having a bad index can be worse than having no index at all.
let schema = Schema({ firstName: String, lastName: String });
// Querying by { firstName, lastName } will be slow, because
// there's only an index on `firstName` and every document
// has the same `firstName`.
schema.index({ firstName: 1 });

const User = mongoose.model('User', schema);
const docs = [];
for (let i = 0; i < 100000; ++i) {
  docs.push({ firstName: 'Agent', lastName: 'Smith' });
}
docs.push({ firstName: 'Agent', lastName: 'Brown' });
await User.insertMany(docs);

const start = Date.now();
let res = await User.find({ firstName: 'Agent', lastName: 'Brown' });
const elapsed = Date.now() - start;

// Approximately 315 on my laptop, 3x slower than if no index!
elapsed;
On the other hand, if you build a compound index on { firstName, lastName }, MongoDB can find the 'Agent Brown' document almost instantaneously.
let schema = Schema({ firstName: String, lastName: String });
schema.index({ firstName: 1, lastName: 1 });

const User = mongoose.model('User', schema);

const start = Date.now();
let res = await User.find({ firstName: 'Agent', lastName: 'Brown' });
const elapsed = Date.now() - start;

elapsed; // Approximately 10 on my laptop
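The timing gap comes down to how many candidate documents each index hands back for the same query. The sketch below models that in plain JavaScript, without MongoDB, by grouping documents under their indexed values. `buildIndex` and `countCandidates` are hypothetical helpers, and a real MongoDB index is a B-tree rather than a map of arrays.

```javascript
// Plain-JavaScript model of index specificity; no MongoDB involved.
// `buildIndex` and `countCandidates` are hypothetical helpers.
const docs = [];
for (let i = 0; i < 100000; ++i) {
  docs.push({ firstName: 'Agent', lastName: 'Smith' });
}
docs.push({ firstName: 'Agent', lastName: 'Brown' });

// Group documents by the values of the indexed fields.
function buildIndex(docs, fields) {
  const index = new Map();
  for (const doc of docs) {
    const key = JSON.stringify(fields.map(field => doc[field]));
    if (!index.has(key)) {
      index.set(key, []);
    }
    index.get(key).push(doc);
  }
  return index;
}

// How many documents the index returns as candidates for `query`. Each
// candidate still has to be checked against the rest of the filter.
function countCandidates(index, fields, query) {
  const key = JSON.stringify(fields.map(field => query[field]));
  return (index.get(key) || []).length;
}

const query = { firstName: 'Agent', lastName: 'Brown' };
const singleIndex = buildIndex(docs, ['firstName']);
const compoundIndex = buildIndex(docs, ['firstName', 'lastName']);

console.log(countCandidates(singleIndex, ['firstName'], query)); // 100001
console.log(countCandidates(compoundIndex, ['firstName', 'lastName'], query)); // 1
```

With only `firstName` indexed, the index narrows nothing: every document is still a candidate. The compound index narrows the search to the single matching document.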
Unique Indexes
A unique index means that MongoDB will throw an error if there are multiple documents with the same value for the indexed property. One benefit of unique indexes is that bad index specificity is impossible, because each indexed value matches at most one document.
There are several ways to declare a unique index. First, you can set unique: true on a property in your schema definition:
let schema = Schema({
  email: {
    type: String,
    unique: true
  }
});
const User = mongoose.model('User', schema);
await User.init();

// Unique index means MongoDB throws an 'E11000 duplicate key
// error' if there are two documents with the same `email`.
const err = await User.create([
  { email: 'agent.smith@source.com' },
  { email: 'agent.smith@source.com' }
]).catch(err => err);

err.message; // 'E11000 duplicate key error...'
You can also set the unique option to true when calling Schema#index(). This lets you define a compound unique index:
let schema = Schema({ firstName: String, lastName: String });
// A compound unique index on { firstName, lastName }
schema.index({ firstName: 1, lastName: 1 }, { unique: true });

const indexes = schema.indexes();
indexes.length; // 1
indexes[0][0]; // { firstName: 1, lastName: 1 }
indexes[0][1].unique; // true
An important note about compound unique indexes: in the above example, individual firstName and lastName values may repeat, but the combination of firstName and lastName must be unique.
For example, there can be at most one document that matches { firstName: 'Agent', lastName: 'Smith' }. But there can be many documents with firstName 'Agent' or lastName 'Smith'.
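The rule can be sketched without a database: a compound unique index rejects a document only when the whole tuple of indexed values has been seen before. `UniqueIndex` below is a hypothetical class for illustration only, and its error message is not MongoDB's.

```javascript
// Plain-JavaScript model of a compound unique index.
// `UniqueIndex` is a hypothetical class; it is not part of Mongoose or MongoDB.
class UniqueIndex {
  constructor(fields) {
    this.fields = fields;
    this.seen = new Set();
  }

  // Record the document's indexed values, or throw if that exact
  // combination of values has already been inserted.
  insert(doc) {
    const key = JSON.stringify(this.fields.map(field => doc[field]));
    if (this.seen.has(key)) {
      throw new Error(`duplicate key: ${key}`);
    }
    this.seen.add(key);
  }
}

const index = new UniqueIndex(['firstName', 'lastName']);
index.insert({ firstName: 'Agent', lastName: 'Smith' });
index.insert({ firstName: 'Agent', lastName: 'Brown' }); // OK: same firstName only
index.insert({ firstName: 'John', lastName: 'Smith' });  // OK: same lastName only

let rejected = false;
try {
  index.insert({ firstName: 'Agent', lastName: 'Smith' }); // same combination
} catch (err) {
  rejected = true;
}
console.log(rejected); // true
```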
The _id Index
MongoDB automatically creates an index on _id whenever it creates a new collection. This index is unique under the hood. Unfortunately, the listIndexes() function doesn't report the _id index as unique: this is a known quirk with MongoDB.
const User = mongoose.model('User', Schema({ name: String }));
await User.createCollection();

// MongoDB always creates an index on `_id`. Even though
// `listIndexes()` doesn't say that the `_id` index is unique,
// the `_id` index **is** a unique index.
const indexes = await User.listIndexes();
indexes.length; // 1
indexes[0].key; // { _id: 1 }
indexes[0].unique; // undefined

// Try to create 2 users with the exact same `_id`
const _id = new mongoose.Types.ObjectId();
const users = [{ _id }, { _id }];
const err = await User.create(users).catch(err => err);

err.message; // 'E11000 duplicate key error...'
The _id index ensures that Model.findOne({ _id }) and Model.findById(id) are almost always fast queries.