MongoDB Aggregation $group Command
The $group
command in MongoDB's aggregation framework is a powerful tool for performing complex data analysis and summarization. It allows users to group documents based on specified keys and apply aggregate functions such as sum, count, average, min, max, and more.
In this article, we will explore MongoDB Aggregation $group
command in detail, covering its syntax, key functions, use cases, and examples to help beginners and professionals efficiently analyze their data
MongoDB Aggregation $group
The $group
command is an important aggregation pipeline stage that enables grouping of documents and applying aggregate functions on the grouped data. It is commonly used for data analysis, reporting, and summarization. Along with basic aggregate functions like sum, count, and average the $group
supports a variety of other operations such as finding the maximum or minimum value in a group, concatenating strings and calculating standard deviations.
Key Features of $group
Command
- Groups documents based on a specified field or expression
- Supports multiple aggregation operations such as
$sum
,$count
,$avg
,$max
, and$min
- Allows grouping by multiple fields for more detailed analysis
- Can be combined with other aggregation stages like
$match
,$sort
, and$project
- Helps in summarizing large datasets efficiently
Syntax:
The basic syntax of the $group command is as follows:
{
$group: {
_id: <expression>,
<field1>: { <accumulator1>: <expression1> },
<field2>: { <accumulator2>: <expression2> }
}
}
Key Terms
- $_id -> The field used to group documents. It can be an existing field or a computed expression.
- <field1>, <field2> -> Fields to include in the output.
- <accumulator1>, <accumulator> -> Aggregate functions to apply to grouped data.
- <expression>, <expression> -> Expressions to compute values for grouping or aggregation.
Examples of $group Command in MongoDB
The $group
command is widely used for aggregating and analyzing data in MongoDB. It helps in summarizing sales, counting occurrences, and computing statistics efficiently. To illustrate its usage, let's consider a sales
collection that stores sales transactions, where each document includes details such as product
, category
, and amount
. Below is a sample dataset:
Sample Data:
[
{
"product": "Product A",
"category": "Category 1",
"amount": 100
},
{
"product": "Product B",
"category": "Category 2",
"amount": 150
},
{
"product": "Product C",
"category": "Category 1",
"amount": 120
},
{
"product": "Product D",
"category": "Category 2",
"amount": 200
}
]
Example 1: Count the Number of Documents in a Collection
This query calculates the total number of documents present in the sales
collection, providing a quick way to determine the dataset size.
Query:
db.sales.aggregate([
{
$group: {
_id: null,
count: { $sum: 1 }
}
}
])
Output:
[
{
"_id": null,
"count": 4
}
]
Explanation:
_id: null
→ Groups all documents together without a specific field.$sum: 1
→ Adds 1 for each document, effectively counting the total number of documents.- The result shows that there are 4 documents in the
sales
collection
Example 2. Retrieve Distinct Values
This query retrieves unique category values from the sales
collection, helping identify different product categories available in the dataset.
Query:
db.sales.aggregate([
{
$group: {
_id: "$category"
}
}
])
Output:
[
{ "_id": "Category 1" },
{ "_id": "Category 2" }
]
Explanation:
_id: "$category"
→ Groups documents by thecategory
field, effectively extracting distinct category values.- The result lists the unique categories present in the
sales
collection, which are"Category 1"
and"Category 2"
. - This approach is useful for filtering unique values in large datasets efficiently.
Example 3: Group by Item Having
This query groups documents by category and calculates the total sales amount for each category in the sales
collection
Query:
db.sales.aggregate([
{
$group: {
_id: "$category",
totalAmount: { $sum: "$amount" }
}
}
])
Output:
[
{ "_id": "Category 1", "totalAmount": 220 },
{ "_id": "Category 2", "totalAmount": 350 }
]
Explanation:
_id: "$category"
→ Groups documents by thecategory
field.$sum: "$amount"
→ Adds up theamount
values for each category.- The result shows that Category 1 has a total sales amount of 220, while Category 2 has 350.
- This query is useful for financial analysis, revenue tracking, and sales reporting
Example 4: Calculate Count, Sum, and Average
This query groups documents by category and calculates the total count of documents, sum of sales amount, and average sales amount per category in the sales
collection.
Query:
db.sales.aggregate([
{
$group: {
_id: "$category",
count: { $sum: 1 },
totalAmount: { $sum: "$amount" },
averageAmount: { $avg: "$amount" }
}
}
])
Output:
[
{
"_id": "Category 1",
"count": 2,
"totalAmount": 220,
"averageAmount": 110
},
{
"_id": "Category 2",
"count": 2,
"totalAmount": 350,
"averageAmount": 175
}
]
Explanation:
_id: "$category"
→ Groups documents by category.$sum: 1
→ Counts the number of documents in each category.$sum: "$amount"
→ Computes the total sales amount per category.$avg: "$amount"
→ Calculates the average sales amount per category.- The result shows that Category 1 has 2 transactions, with a total amount of 220 and an average amount of 110, while Category 2 has 2 transactions, with a total amount of 350 and an average amount of 175
Exampl 5: Group by null
This query calculates the total sum of the amount
field across all documents in the sales
collection, without grouping by any specific field.
Query:
db.sales.aggregate([
{
$group: {
_id: null,
totalAmount: { $sum: "$amount" }
}
}
])
Output:
[
{ "_id": null, "totalAmount": 570 }
]
Explanation:
_id: null
→ Groups all documents together as a single group, meaning the entire collection is aggregated.$sum: "$amount"
→ Computes the total sum of theamount
field across all documents.- The output shows that the total sales amount in the collection is 570.
Best Practices for Using $group
in MongoDB
1. Use Indexing for Better Performance – Index fields used in grouping to speed up queries.
2. Optimize Aggregation Pipelines – Apply $match
before $group
to filter unnecessary documents.
3. Avoid Grouping on Large Fields – Avoid using large string fields for _id
to prevent memory overload.
4. Combine $group
with $sort
and $project
– Use $sort
for ordering results and $project
for refining output.
Conclusion
Overall, The $group
command in MongoDB's aggregation framework allow users to perform complex data manipulations and analytics efficiently. By using its capabilities, developers and data analysts can derive actionable insights from diverse datasets, enhancing decision-making processes and operational efficiencies. By mastering the $group
command, we can enhance our MongoDB data processing skills and build efficient data-driven applications.
FAQs on MongoDB Aggregation $group
What key functions can be performed using the $group command in MongoDB?
The
$group
command supports various aggregate functions such as$sum
,$avg
,$max
,$min
, and$first
, allowing for comprehensive data aggregation based on specified grouping criteria.
How does $group
differ from other aggregation stages like $match
and $sort
?
While
$match
filters documents based on specified conditions and$sort
orders documents,$group
focuses on grouping documents and performing aggregate calculations within those groups, enabling deeper data analysis.
Can multiple fields be used for grouping with the $group
command?
Yes, MongoDB allows grouping by multiple fields simultaneously using the
$group
command, facilitating detailed and multi-dimensional data analysis as per specific business requirements.