MongoDB Aggregation $lookup
The MongoDB $lookup stage is a powerful feature of the Aggregation Framework that performs a left outer join between collections. It lets developers combine related data from multiple collections within the same database, which is highly useful for scenarios requiring relational-like queries in a NoSQL environment.
Understanding the $lookup syntax and usage is important for efficiently performing complex queries and retrieving related data in a MongoDB database. In this article, we will explain the structure of MongoDB's $lookup stage, demonstrate its implementation with practical examples, and explain how it performs a left outer join to merge data from multiple collections.
MongoDB Aggregation $lookup
The $lookup stage performs a left outer join between documents from two collections, similar to a SQL LEFT OUTER JOIN. Each document from the primary (input) collection is matched with documents from the foreign collection based on a given condition, and the matches are added to it as an array field. This is useful for combining data from different collections for analysis or reporting. If no match is found, the result still includes the input document, with an empty array in that field.
Why Use $lookup?
- Combine data across multiple collections without creating redundant data.
- Eliminate the need for multiple queries by fetching related data in a single operation.
- Perform SQL-like joins while leveraging MongoDB's flexible document structure.
- Support reporting and analytics by aggregating data from different collections.
Syntax:
{
  $lookup: {
    from: <foreignCollection>,
    localField: <fieldInInputDocument>,
    foreignField: <fieldInForeignDocument>,
    as: <outputArrayField>
  }
}
Key Terms
- from: The name of the foreign collection to join with.
- localField: The field from the input documents that will be used for matching.
- foreignField: The field from the foreign collection that will be used for matching.
- as: The name of the output array field that will contain the joined documents.
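The four fields above can be read as a simple loop over the input documents. The following is a minimal plain-JavaScript sketch of the join semantics, not MongoDB's implementation. The helper `lookupEquality`, the second order `ORD002`, and the short string ids (standing in for ObjectId values) are assumptions made for illustration, and the sketch ignores MongoDB's special handling of arrays and missing fields.

```javascript
// Plain-JavaScript sketch of an equality $lookup (not MongoDB's implementation).
// lookupEquality and the sample data are hypothetical; string ids stand in for ObjectId.
function lookupEquality(inputDocs, foreignDocs, { localField, foreignField, as }) {
  return inputDocs.map((doc) => ({
    ...doc,
    // Every input document is kept; unmatched ones receive an empty array.
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const customers = [{ _id: "c1", name: "John Doe", email: "john@example.com" }];
const orders = [
  { _id: "o1", order_number: "ORD001", customer_id: "c1" },
  { _id: "o2", order_number: "ORD002", customer_id: "c9" }, // no such customer
];

const joined = lookupEquality(orders, customers, {
  localField: "customer_id",
  foreignField: "_id",
  as: "customer_details",
});

console.log(JSON.stringify(joined, null, 2));
```

Both orders appear in `joined`: the first carries the matching customer in `customer_details`, while the second carries an empty array, which is the left outer join behavior described above.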
Examples of MongoDB Aggregation $lookup
To understand MongoDB Aggregation $lookup, we need some collections and documents on which to run our queries. Here we will consider two collections, orders and customers, which contain the information shown below:
Collection: orders
[
  {
    "_id": ObjectId("60f9d7ac345b7c9df348a86e"),
    "order_number": "ORD001",
    "customer_id": ObjectId("60f9d7ac345b7c9df348a86d")
  },
  // Other order documents...
]
Collection: customers
[
  {
    "_id": ObjectId("60f9d7ac345b7c9df348a86d"),
    "name": "John Doe",
    "email": "john@example.com"
  },
  // Other customer documents...
]
Example 1: Perform a Single Equality Join with $lookup
Let's retrieve orders from the orders collection along with the corresponding customer details from the customers collection, based on matching customer_id and _id.
Query:
db.orders.aggregate([
  {
    $lookup: {
      from: "customers",
      localField: "customer_id",
      foreignField: "_id",
      as: "customer_details"
    }
  }
])
Output:
[
  {
    "_id": ObjectId("60f9d7ac345b7c9df348a86e"),
    "order_number": "ORD001",
    "customer_id": ObjectId("60f9d7ac345b7c9df348a86d"),
    "customer_details": [
      {
        "_id": ObjectId("60f9d7ac345b7c9df348a86d"),
        "name": "John Doe",
        "email": "john@example.com"
      }
    ]
  },
  // Other order documents with appended customer details...
]
Explanation:
This query uses the $lookup stage to perform a left outer join between the orders collection and the customers collection. It matches customer_id from the orders collection with _id in the customers collection and appends the matched customer details to an array field named customer_details. If no match is found, the customer_details array will be empty.
Example 2: Use $lookup with an Array
By default, the $lookup operator includes an array field (as) in the output documents, even if no matches are found in the foreign collection. This array field will be empty ([]) for unmatched documents.
Continuing from the previous example, suppose there are orders with customer_id values that do not exist in the customers collection. The $lookup operator will still include these orders in the output, with an empty customer_details array for unmatched documents.
Query:
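db.orders.aggregate([
  {
    $lookup: {
      from: "customers",
      let: { customerId: "$customer_id" },
      pipeline: [
        {
          $match: {
            // $eq because customer_id holds a single ObjectId;
            // $in would require its second argument to be an array
            $expr: { $eq: ["$_id", "$$customerId"] }
          }
        }
      ],
      as: "customer_details"
    }
  }
])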
Output:
[
  {
    "_id": ObjectId("60f9d7ac345b7c9df348a86e"),
    "order_number": "ORD001",
    "customer_id": ObjectId("60f9d7ac345b7c9df348a86d"),
    "customer_details": [
      {
        "_id": ObjectId("60f9d7ac345b7c9df348a86d"),
        "name": "John Doe",
        "email": "john@example.com"
      }
    ]
  },
  // Other order documents...
]
Explanation:
- In this example, $lookup is used with a pipeline ($match and $expr) to join orders with customers, returning its matches as an array.
- It matches customer_id from orders with _id in customers, appending customer details into customer_details. Orders without a matching customer get an empty customer_details array.
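The let and pipeline fields can be pictured as binding variables once per order and then filtering the foreign collection with a predicate that refers to them. The following plain-JavaScript sketch illustrates that idea under stated assumptions: `lookupWithPipeline` is a hypothetical helper rather than a MongoDB API, the predicate is a simple equality check, the order `ORD002` is invented, and short string ids stand in for ObjectId values.

```javascript
// Sketch only: emulates the let + pipeline form of $lookup in plain JavaScript.
// lookupWithPipeline is a hypothetical helper; string ids stand in for ObjectId.
function lookupWithPipeline(inputDocs, foreignDocs, { letVars, match, as }) {
  return inputDocs.map((doc) => {
    // Bind the "let" variables once per input document.
    const vars = letVars(doc);
    // Apply the correlated predicate to every foreign document.
    return { ...doc, [as]: foreignDocs.filter((f) => match(f, vars)) };
  });
}

const customers = [{ _id: "c1", name: "John Doe", email: "john@example.com" }];
const orders = [
  { _id: "o1", order_number: "ORD001", customer_id: "c1" },
  { _id: "o2", order_number: "ORD002", customer_id: "c9" }, // no such customer
];

const result = lookupWithPipeline(orders, customers, {
  letVars: (order) => ({ customerId: order.customer_id }),
  match: (customer, vars) => customer._id === vars.customerId,
  as: "customer_details",
});

console.log(JSON.stringify(result, null, 2));
```

Because the predicate is an ordinary function of the foreign document and the bound variables, the same shape can express conditions beyond simple equality, which is the main reason to prefer the pipeline form.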
Example 3: Use $lookup with $mergeObjects
The $lookup stage always returns joined documents as an array. When each order matches a single customer, we can follow $lookup with an $addFields stage that uses $mergeObjects to combine the matched documents into one embedded object. Here the matches are first stored in customer_info and then merged into customer_details:
Query:
db.orders.aggregate([
  {
    $lookup: {
      from: "customers",
      localField: "customer_id",
      foreignField: "_id",
      as: "customer_info"
    }
  },
  {
    $addFields: {
      customer_details: {
        $mergeObjects: "$customer_info"
      }
    }
  }
])
Output:
[
  {
    "_id": ObjectId("60f9d7ac345b7c9df348a86e"),
    "order_number": "ORD001",
    "customer_id": ObjectId("60f9d7ac345b7c9df348a86d"),
    "customer_info": [
      {
        "_id": ObjectId("60f9d7ac345b7c9df348a86d"),
        "name": "John Doe",
        "email": "john@example.com"
      }
    ],
    "customer_details": {
      "_id": ObjectId("60f9d7ac345b7c9df348a86d"),
      "name": "John Doe",
      "email": "john@example.com"
    }
  },
  // Other order documents...
]
Explanation:
- Here, $lookup stores the matched customers documents in the customer_info array, and $addFields with $mergeObjects then combines their fields.
- The result is a single customer_details object within each order document; the original customer_info array is also kept.
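The step from the customer_info array to the customer_details object in the output can be pictured as folding a list of documents into one. The following is a plain-JavaScript analogy of that transformation only, not a description of how MongoDB evaluates $mergeObjects. `foldDocuments` is a hypothetical helper, and short string ids stand in for ObjectId values.

```javascript
// Sketch only: folds an array of matched documents into one object,
// mirroring how customer_info becomes customer_details in the output shown above.
function foldDocuments(docs) {
  // Later documents overwrite earlier ones when keys collide.
  return docs.reduce((merged, doc) => ({ ...merged, ...doc }), {});
}

const order = {
  _id: "o1",
  order_number: "ORD001",
  customer_id: "c1",
  customer_info: [{ _id: "c1", name: "John Doe", email: "john@example.com" }],
};

// Add customer_details next to customer_info, as $addFields does in the query.
const enriched = { ...order, customer_details: foldDocuments(order.customer_info) };

console.log(JSON.stringify(enriched, null, 2));
```

In this sketch an empty input array folds to an empty object, so an order with no matched customer would carry an empty customer_details object rather than an array.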
Example 4: Perform Multiple Joins and a Correlated Subquery with $lookup
When working with multiple related collections in MongoDB, we may need to join data from more than one source. The pipeline of a $lookup stage can itself contain a further $lookup, so data from several collections can be fetched in a single aggregation rather than in separate queries.
Query:
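db.orders.aggregate([
  {
    $lookup: {
      from: "customers",
      let: { customerId: "$customer_id" },
      pipeline: [
        {
          $match: {
            // $eq because customer_id holds a single ObjectId;
            // $in would require its second argument to be an array
            $expr: { $eq: ["$_id", "$$customerId"] }
          }
        },
        {
          // Nested join: attach each matched customer's products
          $lookup: {
            from: "products",
            localField: "_id",
            foreignField: "customer_id",
            as: "products_ordered"
          }
        }
      ],
      as: "customer_details"
    }
  }
])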
Output:
[
  {
    "_id": ObjectId("60f9d7ac345b7c9df348a86e"),
    "order_number": "ORD001",
    "customer_id": ObjectId("60f9d7ac345b7c9df348a86d"),
    "customer_details": [
      {
        "_id": ObjectId("60f9d7ac345b7c9df348a86d"),
        "name": "John Doe",
        "email": "john@example.com",
        "products_ordered": [
          {
            "_id": ObjectId("60f9d7ac345b7c9df348a870"),
            "customer_id": ObjectId("60f9d7ac345b7c9df348a86d"),
            "product_name": "Product A"
          }
        ]
      }
    ]
  },
  // Other order documents...
]
Explanation:
- This query nests one $lookup stage inside the pipeline of another. It first matches customer_id from orders with _id in customers, then runs a second $lookup that matches each customer's _id with customer_id in products.
- It enhances order documents with nested arrays of products ordered by each customer.
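The nesting can be easier to follow when written as two ordinary joins, one feeding the other. The following is a plain-JavaScript sketch under stated assumptions: `leftJoin` is a hypothetical helper, short string ids stand in for ObjectId values, and the inner join is computed once up front, whereas MongoDB runs the inner pipeline for each order. For this sample data the resulting shape is the same.

```javascript
// Sketch only: a nested left outer join in plain JavaScript.
// leftJoin is a hypothetical helper; string ids stand in for ObjectId.
function leftJoin(inputDocs, foreignDocs, localField, foreignField, as) {
  return inputDocs.map((doc) => ({
    ...doc,
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const products = [{ _id: "p1", customer_id: "c1", product_name: "Product A" }];
const customers = [{ _id: "c1", name: "John Doe", email: "john@example.com" }];
const orders = [{ _id: "o1", order_number: "ORD001", customer_id: "c1" }];

// Inner join: attach products_ordered to each customer.
const customersWithProducts = leftJoin(
  customers, products, "_id", "customer_id", "products_ordered"
);

// Outer join: attach the enriched customers to each order.
const nested = leftJoin(
  orders, customersWithProducts, "customer_id", "_id", "customer_details"
);

console.log(JSON.stringify(nested, null, 2));
```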
Conclusion
MongoDB’s $lookup stage offers an effective way to express relational joins within the NoSQL ecosystem, bridging the gap between MongoDB’s document-based structure and the need for combined data from multiple collections. By learning the $lookup syntax and usage, developers can execute complex queries, retrieve related data more efficiently, and enhance the functionality of their applications. Whether performing a simple join or handling more intricate queries, understanding $lookup is key to leveraging MongoDB’s full potential.