'Using `$unwind` in MongoDB doubts of different `$sum` outputs
I´ve a doubt using $unwind...
First of all, this is my collection "Provincias.json" that stores info about Spanish regions (short example):
{
_id: ObjectId("62095151245a59e4bda99e9a"),
Nombre: 'Soria',
CA: 'Castilla y León',
Superficie: 10306,
Datos: [
{ Anyo: 2015, Valor: 91780 },
{ Anyo: 2014, Valor: 92630 },
{ Anyo: 2013, Valor: 93575 },
{ Anyo: 2012, Valor: 94463 },
..... ]
Where CA is the state name, Nombre the county name, Superficie county surface and Datos the population each year.
Well I´m trying to make a "query" that returns the info grouped by CA with total surface and the difference between max and min population densities.
My expression is as follows:
db.ProvinciasPEC1.aggregate([
{$unwind: "$Datos"},
{$group: {
"_id": "$CA",
"totSuperficie": {$sum:"$Superficie"},
"PoblacionMaxima": {$max: "$Datos.Valor"},
"PoblacionMinima": {$min: "$Datos.Valor"}}},
{$project: {
"_id": 1,"totSuperficie": 1,
"totPoblacion": 1,
"diffDensidad": {
$divide: [
{$subtract: ["$PoblacionMaxima", "$PoblacionMinima"]} , "$totSuperficie"
]
}
}
},
{$sort: {"diffDensidad":-1}},
{$out:"CAM_Densidad"}
])
I returns me somethin like this:

But I don´t know why the total Surface is wrong.
Using the following expression I get States with proper total surface:
db.ProvinciasPEC1.aggregate([{$group: {"_id": "$CA", "totSuperficie": {$sum:"$Superficie"}}},{$out:"CAM_Densidad"}])
That returns:

Please, any hint to figure out what I´m doing wrong? I think is related to $unwind, but not sure,
Thanks a lot!
Solution 1:[1]
Maybe something like this:
db.collection.aggregate([
{
$unwind: "$Datos"
},
{
$group: {
"_id": "$CA",
"totSuperficie": {
$first: "$Superficie"
},
"PoblacionMaxima": {
$max: "$Datos.Valor"
},
"PoblacionMinima": {
$min: "$Datos.Valor"
}
}
},
{
$project: {
"_id": 1,
"totSuperficie": 1,
"totPoblacion": 1,
"diffDensidad": {
$divide: [
{
$subtract: [
"$PoblacionMaxima",
"$PoblacionMinima"
]
},
"$totSuperficie"
]
}
}
},
{
$sort: {
"diffDensidad": -1
}
},
{
$out: "CAM_Densidad"
}
])
explained:
- Unwind the Datos
- Group the per CA (but preserve the Superficie since it is the same value for all unwinded Datos so if you sum them you will get different sum depending on number of Datos per CA )
- Project
- Sort by diff
- Out to the output collection
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
