Calculate and classify the data in a collection list in Java
I have the following entity class:
@NoArgsConstructor
@AllArgsConstructor
@Data
class Log {
    private String platform;
    private LocalDateTime gmtCreate;
    private Integer enlistCount;
    private Integer dispatcherCount;
    private Integer callbackCount;
}
Now I have a list of 10,000 Log objects and want to achieve the following: group the data by the hour of the gmtCreate field and, within each hour, group by the platform field. Finally, sum each of the individual values (enlistCount, dispatcherCount, callbackCount) within those groups. The result should look like this:
Map<Integer, Map<String, Map<String, Integer>>> result = new HashMap<>();
/*
{
    "23": {
        "platform1": {
            "callbackTotal": 66,
            "dispatcherTotal": 77,
            "enlistTotal": 33
        },
        "platform2": {
            "callbackTotal": 13,
            "dispatcherTotal": 5,
            "enlistTotal": 64
        }
    },
    "24": {
        "platform2": {
            "callbackTotal": 64,
            "dispatcherTotal": 47,
            "enlistTotal": 98
        },
        "platform7": {
            "callbackTotal": 0,
            "dispatcherTotal": 3,
            "enlistTotal": 21
        }
    }
}
*/
The only approach I can think of is to stream over the list and group several times, but I am worried that this is very inefficient. Is there a more efficient way to do it?
Solution 1:[1]
You can do it all in one stream by using groupingBy with a downstream collector and a mutable reduction with collect. You need a helper class to sum the values:
import java.util.HashMap;
import java.util.Map;

public class Total {
    private int enlistCount = 0;
    private int dispatcherCount = 0;
    private int callbackCount = 0;

    // Accumulator: adds one Log's counts to this running total.
    // Log uses Lombok's @Data, so its accessors are getXxx().
    public Total addLog(Log log) {
        this.enlistCount += log.getEnlistCount();
        this.dispatcherCount += log.getDispatcherCount();
        this.callbackCount += log.getCallbackCount();
        return this;
    }

    // Combiner: folds another partial total into this one.
    public Total add(Total that) {
        this.enlistCount += that.enlistCount;
        this.dispatcherCount += that.dispatcherCount;
        this.callbackCount += that.callbackCount;
        return this;
    }

    // Finisher: converts the totals to the map shape the question asks for.
    public Map<String, Integer> toMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("enlistTotal", enlistCount);
        map.put("dispatcherTotal", dispatcherCount);
        map.put("callbackTotal", callbackCount);
        return map;
    }

    @Override
    public String toString() {
        return String.format("%d %d %d", enlistCount, dispatcherCount, callbackCount);
    }
}
Then the stream, grouping, and collection look like this:
Map<Integer, Map<String, Map<String, Integer>>> result =
    logs.stream().collect(Collectors.groupingBy(log -> log.getGmtCreate().getHour(),
        Collectors.groupingBy(Log::getPlatform,
            Collector.of(Total::new, Total::addLog, Total::add, Total::toMap))));
To break that down, there's a groupingBy on the hour of the day, and inside that a groupingBy on the platform, which matches the hour-then-platform shape requested in the question. Then all the log entries in each group are summed by a collector which does a mutable reduction:
Collector.of(Total::new, Total::addLog, Total::add, Total::toMap)
This collector uses a supplier that provides a new Total with zeros for the counts, an accumulator that adds each Log's counts to the Total, a combiner that knows how to sum two Totals (this would only be used in a parallel scenario), and finally a finishing function that transforms the Total object to a Map<String, Integer>.
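As a self-contained illustration of the collector described above, here is a minimal sketch. It makes a few assumptions that are not in the original answer: a hand-written Log class stands in for the Lombok-generated one, the collector is wrapped in a hypothetical totals() helper so its type parameters are spelled out, and the names HourlyTotals, summarize, and sampleLogs are made up for the example. It groups by hour first and then by platform, to match the result type in the question, and it can run the same collector on a parallel stream, which is the only case where the combiner is invoked.

```java
import java.time.LocalDateTime;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Plain-Java stand-in for the Lombok-annotated Log class (assumption: same getters).
class Log {
    private final String platform;
    private final LocalDateTime gmtCreate;
    private final Integer enlistCount, dispatcherCount, callbackCount;

    Log(String platform, LocalDateTime gmtCreate,
        Integer enlistCount, Integer dispatcherCount, Integer callbackCount) {
        this.platform = platform;
        this.gmtCreate = gmtCreate;
        this.enlistCount = enlistCount;
        this.dispatcherCount = dispatcherCount;
        this.callbackCount = callbackCount;
    }

    String getPlatform() { return platform; }
    LocalDateTime getGmtCreate() { return gmtCreate; }
    Integer getEnlistCount() { return enlistCount; }
    Integer getDispatcherCount() { return dispatcherCount; }
    Integer getCallbackCount() { return callbackCount; }
}

// Mutable accumulator holding the three running sums.
class Total {
    private int enlistCount, dispatcherCount, callbackCount;

    void addLog(Log log) {
        enlistCount += log.getEnlistCount();
        dispatcherCount += log.getDispatcherCount();
        callbackCount += log.getCallbackCount();
    }

    Total add(Total that) {
        enlistCount += that.enlistCount;
        dispatcherCount += that.dispatcherCount;
        callbackCount += that.callbackCount;
        return this;
    }

    Map<String, Integer> toMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("enlistTotal", enlistCount);
        map.put("dispatcherTotal", dispatcherCount);
        map.put("callbackTotal", callbackCount);
        return map;
    }
}

class HourlyTotals {
    // Hypothetical helper: supplier, accumulator, combiner, finisher.
    static Collector<Log, Total, Map<String, Integer>> totals() {
        return Collector.of(Total::new, Total::addLog, Total::add, Total::toMap);
    }

    // Groups by hour, then by platform, then sums the counts in each group.
    static Map<Integer, Map<String, Map<String, Integer>>> summarize(List<Log> logs, boolean parallel) {
        Stream<Log> stream = parallel ? logs.parallelStream() : logs.stream();
        return stream.collect(Collectors.groupingBy(
                (Log log) -> log.getGmtCreate().getHour(),
                Collectors.groupingBy(Log::getPlatform, totals())));
    }

    // Fixed timestamp (an assumption) so that the hour keys are always 12 and 13.
    static List<Log> sampleLogs() {
        LocalDateTime base = LocalDateTime.of(2024, 1, 1, 11, 30);
        return List.of(
                new Log("platform1", base.plusHours(1), 66, 77, 33),
                new Log("platform1", base.plusHours(1), 66, 77, 33),
                new Log("platform2", base.plusHours(1), 13, 5, 64),
                new Log("platform2", base.plusHours(2), 64, 47, 98),
                new Log("platform7", base.plusHours(2), 0, 3, 21),
                new Log("platform7", base.plusHours(2), 10, 15, 44));
    }

    public static void main(String[] args) {
        System.out.println(summarize(sampleLogs(), false));
        // The parallel run goes through Total::add but yields the same totals.
        System.out.println(summarize(sampleLogs(), true).equals(summarize(sampleLogs(), false)));
    }
}
```

The list is traversed only once in both modes. For a list of 10,000 entries the sequential stream is likely sufficient; the parallel switch is there mainly to show where the combiner comes into play.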
Solution 2:[2]
I would use a loop and make use of Map.computeIfAbsent and Map.merge. computeIfAbsent checks whether an entry exists for the given key. If so, it returns the value for that entry; otherwise it calls the supplied function to create a value, stores it under the key, and returns it. In this case the value is another map, so the first time an hour is seen, a new map is created for it and returned. Because the map is returned either way, another computeIfAbsent can be chained to get the platform map in the same manner.
Map.merge is similar except that it will take a key, a value, and a merge function. If the value is not present for the key, the supplied value will be used. Otherwise, the merge function will be used to apply the new value to the map, merging with the existing one. In this case, the desired function is to add them so Integer::sum is used for this process.
I ran this approach on 100_000 copies of the test data below and it took about a second to run.
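To make the behaviour of computeIfAbsent and merge concrete before the full example, here is a small sketch on a two-level map. The class name MapHelpersDemo and the add helper are hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class MapHelpersDemo {
    // Adds one count to a nested hour -> platform -> total map.
    static void add(Map<Integer, Map<String, Integer>> totals,
                    int hour, String platform, int count) {
        totals.computeIfAbsent(hour, h -> new HashMap<>()) // creates the inner map on first use
              .merge(platform, count, Integer::sum);       // stores count, or adds it to the existing value
    }

    public static void main(String[] args) {
        Map<Integer, Map<String, Integer>> totals = new HashMap<>();
        add(totals, 12, "platform1", 66); // no entry yet: inner map is created, 66 is stored
        add(totals, 12, "platform1", 66); // entry exists: Integer::sum gives 66 + 66
        add(totals, 12, "platform2", 13);
        System.out.println(totals.get(12).get("platform1")); // prints 132
        System.out.println(totals.get(12).get("platform2")); // prints 13
    }
}
```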
Here is some test data:
LocalDateTime ldt = LocalDateTime.now();
List<Log> list = new ArrayList<>(List.of(
new Log("platform1", ldt.plusHours(1), 66, 77, 33),
new Log("platform1", ldt.plusHours(1), 66, 77, 33),
new Log("platform2", ldt.plusHours(1), 13, 5, 64),
new Log("platform2", ldt.plusHours(2), 64, 47, 98),
new Log("platform7", ldt.plusHours(2), 0, 3, 21),
new Log("platform7", ldt.plusHours(2), 10, 15, 44)));
And here is the process.
First, create a Map to hold the results.
Map<Integer, Map<String, Map<String, Integer>>> result =
new HashMap<>();
Now loop through the list of logs and create the maps as you need them, summing up the values as you go.
for (Log log : list) {
    // Lombok's @Data generates getGmtCreate() for the gmtCreate field.
    Map<String, Integer> platformMap = result
            .computeIfAbsent(log.getGmtCreate().getHour(),
                    v -> new HashMap<>())
            .computeIfAbsent(log.getPlatform(),
                    v -> new HashMap<>());
    platformMap.merge("callbackTotal", log.getCallbackCount(),
            Integer::sum);
    platformMap.merge("dispatcherTotal",
            log.getDispatcherCount(), Integer::sum);
    platformMap.merge("enlistTotal", log.getEnlistCount(),
            Integer::sum);
}
Print the results.
result.forEach((k, v) -> {
    System.out.println(k);
    v.forEach((kk, vv) -> {
        System.out.println(" " + kk);
        vv.entrySet().forEach(
                e -> System.out.println(" " + e));
    });
});
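Prints (the hour keys depend on when the code runs, because the test data is built from LocalDateTime.now()):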
12
platform2
callbackTotal=64
enlistTotal=13
dispatcherTotal=5
platform1
callbackTotal=66
enlistTotal=132
dispatcherTotal=154
13
platform2
callbackTotal=98
enlistTotal=64
dispatcherTotal=47
platform7
callbackTotal=65
enlistTotal=10
dispatcherTotal=18
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | David Conrad |
| Solution 2 | |
