ITP249 - Lecture - 13 - v2-2


MONGODB

Aggregation Framework

ITP 249
Topics
• MongoDB Aggregation Framework
– Match
– Project
– Group
– Match
– Sort
– Limit
– And many more
Recall: SELECT
SELECT … *
FROM …
WHERE …
GROUP BY…
HAVING …
ORDER BY …
LIMIT …
Conversion to MongoDB
SELECT clause    MongoDB stage
WHERE            Match
SELECT           Project
GROUP BY         Group
HAVING           Match
ORDER BY         Sort
TOP or LIMIT     Limit
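
As a sketch of this mapping, a SQL query that uses every clause can be written as one pipeline with one stage per clause. The collection name yelp_business comes from the example later in the lecture; the city field and the threshold of 100 are assumptions chosen only for illustration.

```javascript
// SQL (for comparison):
//   SELECT city, COUNT(*) AS numberofbiz
//   FROM yelp_business
//   WHERE state = 'NV'
//   GROUP BY city
//   HAVING COUNT(*) > 100
//   ORDER BY numberofbiz DESC
//   LIMIT 3;

// One stage per clause, listed in the order the stages run.
const pipeline = [
  { $match: { state: "NV" } },                            // WHERE
  { $group: { _id: "$city", numberofbiz: { $sum: 1 } } }, // GROUP BY + COUNT(*)
  { $match: { numberofbiz: { $gt: 100 } } },              // HAVING
  { $sort: { numberofbiz: -1 } },                         // ORDER BY ... DESC
  { $limit: 3 },                                          // LIMIT
  { $project: { _id: 0, city: "$_id", numberofbiz: 1 } }  // SELECT (rename _id to city)
];

// In the mongo shell this would be run as: db.yelp_business.aggregate(pipeline)
console.log(pipeline.map(stage => Object.keys(stage)[0]).join(" -> "));
```

Note that Match appears twice: before the Group stage it plays the role of WHERE, and after it the role of HAVING.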


Aggregation Framework
• In Mongo, the aggregation framework is
sequential
• That is, each function is executed in
order, one after another, on a
collection
• The results of the previous function are
used for the next function
• Each step in the sequence is called a stage
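
To make the stage-by-stage flow concrete, here is a minimal plain-JavaScript sketch that mimics a pipeline with ordinary functions. It is not how MongoDB works internally; the runPipeline helper, the stars field and the sample documents are all invented for illustration.

```javascript
// Each "stage" is a function from an array of documents to a new array.
const stages = [
  docs => docs.filter(d => d.state === "AZ"),           // like $match
  docs => [...docs].sort((a, b) => b.stars - a.stars),  // like $sort (descending)
  docs => docs.slice(0, 2)                              // like $limit
];

// Feed the output of each stage into the next one, in order.
function runPipeline(docs, stageList) {
  return stageList.reduce((current, stage) => stage(current), docs);
}

const sample = [
  { name: "A", state: "AZ", stars: 3.5 },
  { name: "B", state: "NV", stars: 5.0 },
  { name: "C", state: "AZ", stars: 4.5 },
  { name: "D", state: "AZ", stars: 2.0 }
];

const result = runPipeline(sample, stages);
console.log(result.map(d => d.name)); // [ 'C', 'A' ]
```

Business B has the highest rating but never reaches the sort, because the first stage already removed it.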
Code
db.collectionname.aggregate()
Followed by stage functions, e.g.:
• Match
• Project
• Group
• Match
• Sort
• Limit
• Count
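
The general shape of the call can be sketched as an array of stage documents passed to aggregate(). The example below ends in a $count stage; the collection name yelp_business is borrowed from the example later in the lecture, and the output field name total is an arbitrary choice.

```javascript
// A pipeline is an array; each element is one stage document
// whose single key is a $-prefixed stage operator.
const countPipeline = [
  { $match: { state: "AZ" } },  // keep only Arizona businesses
  { $count: "total" }           // emit one document: { total: <number of matches> }
];

// `db` exists only inside the mongo shell, so the call is guarded
// to keep this snippet loadable elsewhere.
if (typeof db !== "undefined") {
  printjson(db.yelp_business.aggregate(countPipeline).toArray());
}
```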
Studio3T Aggregate Pipeline
• A stage by stage methodology to create an
aggregate pipeline
• Sample
– Use the Yelp Business dataset
– Go to Aggregate
– Click Run to see the raw data
Adding a filter
• Add a new stage. Change the Option to
Match
• We will filter businesses in AZ.
• Add the filter { state: "AZ" }. Then execute.
• You can also click on Output of Selected
Stage to see the output of the current
stage.
• You can also click on Input of Selected
Stage
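
Outside Studio 3T, the same filter corresponds to a $match stage in the shell. The sketch below also shows a second, stricter filter; the stars field is an assumed field name used only to illustrate combining conditions.

```javascript
// The filter from the slide, written as a stage document.
const matchAZ = { $match: { state: "AZ" } };

// Conditions listed together in one filter are ANDed.
// `stars` is an assumed field name, not taken from the lecture.
const matchAZHighlyRated = {
  $match: { state: "AZ", stars: { $gte: 4 } }
};

if (typeof db !== "undefined") {
  // Runs only inside the mongo shell, where `db` is defined.
  print(db.yelp_business.aggregate([matchAZ]).toArray().length);
  print(db.yelp_business.aggregate([matchAZHighlyRated]).toArray().length);
}
```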
Adding a group
• Add a new stage. Change Option to group
• Let’s count the number of businesses per
postal_code
• Add this to the group stage
{
_id: "$postal_code",
numberofbiz: {$sum: 1}
}
Group syntax
{ $group:
  { _id: <expression>,
    <field1>: { <accumulator1> : <expression1> }, ... }
}
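
To see what _id and an accumulator do, the following plain-JavaScript sketch reproduces the effect of { _id: "$postal_code", numberofbiz: {$sum: 1} } on a handful of made-up documents. The groupCount helper is hypothetical and only imitates the stage; MongoDB does this work on the server.

```javascript
// Imitates { $group: { _id: "$<keyField>", numberofbiz: { $sum: 1 } } }.
function groupCount(docs, keyField) {
  const counts = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    // $sum: 1 adds 1 to the group's running total for every document.
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // One output document per distinct key, as $group produces.
  return [...counts].map(([key, n]) => ({ _id: key, numberofbiz: n }));
}

const businesses = [
  { name: "A", postal_code: "85001" },
  { name: "B", postal_code: "85002" },
  { name: "C", postal_code: "85001" }
];

const grouped = groupCount(businesses, "postal_code");
console.log(grouped);
```

The three input documents collapse into two output documents, one per distinct postal code, and the original fields such as name are gone unless an accumulator carries them forward.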
Sort the results
• Add a new stage, Option sort
• Sort by numberofbiz in descending order
• Add the following
{
numberofbiz: -1
}
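
A sort specification may list more than one field: 1 means ascending, -1 descending, and later fields only break ties. The plain-JavaScript sketch below imitates that behaviour on made-up grouped results; the sortDocs helper is hypothetical and handles only simple top-level fields.

```javascript
// Descending by count, then ascending by _id to break ties.
const sortSpec = { numberofbiz: -1, _id: 1 };

// Imitates $sort for top-level fields with comparable values.
function sortDocs(docs, spec) {
  const keys = Object.entries(spec);
  return [...docs].sort((a, b) => {
    for (const [field, direction] of keys) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0; // equal on every sort key
  });
}

const counts = [
  { _id: "85002", numberofbiz: 1 },
  { _id: "85003", numberofbiz: 2 },
  { _id: "85001", numberofbiz: 2 }
];

const sorted = sortDocs(counts, sortSpec);
console.log(sorted.map(d => d._id)); // [ '85001', '85003', '85002' ]
```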
Query
• Right click the aggregate pipeline and preview
db.yelp_business.aggregate(
  // Pipeline
  [
    // Stage 1: keep only businesses in Arizona
    {
      $match: {
        state: "AZ"
      }
    },

    // Stage 2: count businesses per postal code
    {
      $group: {
        _id: "$postal_code",
        numberofbiz: {$sum: 1}
      }
    },

    // Stage 3: largest counts first
    {
      $sort: {
        numberofbiz: -1
      }
    }
  ]
);
On your own - #1
• Limit the results to the top 5 postal codes
• Only show postal codes with more than
1000 businesses
On your own - #2
• Show the business with the most
categories
– $project: { field1: 1, field2: 1, … }
– $size
Summary
• We explored aggregation using Mongo
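db.restaurants.aggregate([
  // Keep the name and compute a total from the individual score fields
  {$project: {
    name: 1,
    totalscore: {$add: ["$field1", "$field2", "$field3", "$field4", "$field5"]}
  }},
  // Highest total first
  {$sort: {totalscore: -1}},
  // Keep only the top document
  {$limit: 1}
])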
