Grouping over multiple items in List<Dictionary<string, string>>
I have a list of dictionaries like below:
var result = new List<Dictionary<string, string>>
{
new Dictionary<string, string> { {"Continent", "Europe"}, {"Country", "Italy"}, {"Population", "10"} },
new Dictionary<string, string> { {"Continent", "Africa"}, {"Country", "RSA"}, {"Population", "20"} },
new Dictionary<string, string> { {"Continent", "Africa"}, {"Country", "RSA"}, {"Population", "30"} },
new Dictionary<string, string> { {"Continent", "Africa"}, {"Country", "Nigeria"}, {"Population", "40"} },
};
I would like to group by CONTINENT first, then by COUNTRY, and then calculate the sum of the POPULATION field within each resulting group. For the example above:
- GROUP_1: Europe, Italy => Sum: 10
- GROUP_2: Africa, RSA => Sum: 20 + 30 = 50
- GROUP_3: Africa, Nigeria => Sum: 40
I can do this with nested grouping, like below:
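var resultGroupedByContinent = result.GroupBy(o => o["Continent"]);
foreach (var continent in resultGroupedByContinent)
{
    // Group the rows of this continent (not the outer grouping) by country.
    var resultPerCountry = continent.GroupBy(o => o["Country"]);
    foreach (var country in resultPerCountry)
    {
        // Sum the population of all rows for this continent/country pair.
        var sum = country.Sum(o => int.Parse(o["Population"]));
    }
}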
Is there a better way of implementing this that makes better use of grouping and LINQ?
In the real scenario I will be dealing with:
- a list of a few million items (result.Count() is a few million)
- dictionaries with 70 key-value pairs each
- 4 levels of grouping (Continent -> Country -> 3rd level -> 4th level)
I just want to add that List<Dictionary<string, string>> can be replaced with a different data structure. As a source I am getting the following JSON, which I then pass through a JSON deserialiser:
{
"header": [ "Continent", "Country", "Population" ],
"errors": [],
"rows": [
[ "Europe", "Italy", "10" ],
[ "Africa", "RSA", "20" ],
[ "Africa", "RSA", "30" ],
[ "Africa", "Nigeria", "40" ]]
}
I will be getting such JSON with a few million rows every 8 seconds, so that is my cap for calculation time.
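To illustrate what a different data structure could look like, the payload above can be mapped onto a small type that keeps the header once and each row as a plain list, instead of one dictionary per row. This is only a minimal sketch: it assumes System.Text.Json is the deserialiser (the question does not say which one is used), and the names TablePayload and TablePayloadReader are hypothetical. The element type of "errors" is unknown, so it is kept as raw JsonElement values.

```csharp
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical type mirroring the three top-level properties of the payload.
public sealed class TablePayload
{
    public List<string> Header { get; set; } = new List<string>();

    // The shape of an error entry is not shown in the sample, so keep it untyped.
    public List<JsonElement> Errors { get; set; } = new List<JsonElement>();

    public List<List<string>> Rows { get; set; } = new List<List<string>>();
}

public static class TablePayloadReader
{
    // The JSON uses lower-case names ("header", "rows"), so match case-insensitively.
    private static readonly JsonSerializerOptions Options =
        new JsonSerializerOptions { PropertyNameCaseInsensitive = true };

    public static TablePayload Parse(string json)
    {
        // Deserialize returns null for the JSON literal "null"; fall back to an empty payload.
        return JsonSerializer.Deserialize<TablePayload>(json, Options) ?? new TablePayload();
    }
}
```

With this shape, a column such as "Population" is located once via its position in Header and then read by index from every row.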
Solution 1:[1]
If you only have this Dictionary<string, string>, I would convert each one to a defined object first and then group, like this:
public class MyObject
{
public MyObject(Dictionary<string, string> dict)
{
Country = dict["Country"];
Continent = dict["Continent"];
Population = int.Parse(dict["Population"]);
}
public string Country { get; set; }
public string Continent { get; set; }
public int Population { get; set; }
}
--------------------------------------------------
var result = new List<Dictionary<string, string>>
{
new Dictionary<string, string> { {"Continent", "Europe"}, {"Country", "Italy"}, {"Population", "10"} },
new Dictionary<string, string> { {"Continent", "Africa"}, {"Country", "RSA"}, {"Population", "20"} },
new Dictionary<string, string> { {"Continent", "Africa"}, {"Country", "RSA"}, {"Population", "30"} },
new Dictionary<string, string> { {"Continent", "Africa"}, {"Country", "Nigeria"}, {"Population", "40"} },
};
var mapped = result.Select(x => new MyObject(x)).ToList();
// Group on a composite (Continent, Country) key and sum the population per group.
var sum = mapped
    .GroupBy(x => new { x.Continent, x.Country })
    .ToDictionary(x => x.Key, x => x.Sum(y => y.Population));
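Given the few-million-row, 8-second budget mentioned in the question, it may also be worth trying a single pass that accumulates totals into a dictionary keyed by a value tuple. That avoids creating an object and a grouping per row. The sketch below is an assumption-laden illustration rather than a measured solution: PopulationTotals and SumByContinentAndCountry are hypothetical names, rows are assumed to arrive as string arrays in header order (as in the JSON from the question), and no benchmark has been run.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class PopulationTotals
{
    // Hypothetical helper: sums "Population" per distinct (Continent, Country) pair
    // in one pass over the rows.
    public static Dictionary<(string Continent, string Country), long> SumByContinentAndCountry(
        string[] header,
        IEnumerable<string[]> rows)
    {
        // Resolve the column positions once instead of looking keys up per row.
        int continentIdx = RequireColumn(header, "Continent");
        int countryIdx = RequireColumn(header, "Country");
        int populationIdx = RequireColumn(header, "Population");

        var totals = new Dictionary<(string Continent, string Country), long>();
        foreach (var row in rows)
        {
            var key = (row[continentIdx], row[countryIdx]);

            // TryGetValue leaves 'current' at 0 when the key has not been seen yet.
            totals.TryGetValue(key, out long current);
            totals[key] = current + long.Parse(row[populationIdx], CultureInfo.InvariantCulture);
        }
        return totals;
    }

    private static int RequireColumn(string[] header, string name)
    {
        int index = Array.IndexOf(header, name);
        if (index < 0)
        {
            throw new ArgumentException("Missing column: " + name, nameof(header));
        }
        return index;
    }
}

public static class Program
{
    public static void Main()
    {
        var header = new[] { "Continent", "Country", "Population" };
        var rows = new[]
        {
            new[] { "Europe", "Italy", "10" },
            new[] { "Africa", "RSA", "20" },
            new[] { "Africa", "RSA", "30" },
            new[] { "Africa", "Nigeria", "40" },
        };

        var totals = PopulationTotals.SumByContinentAndCountry(header, rows);

        // For the sample data, totals[("Africa", "RSA")] is 50.
        foreach (var pair in totals)
        {
            Console.WriteLine($"{pair.Key.Continent}, {pair.Key.Country} => Sum: {pair.Value}");
        }
    }
}
```

For the 4-level case, the same idea extends by widening the tuple key to four elements. Whether this actually fits the time budget depends on the data and would need to be measured.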
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Demetrius Axenowski |
