How to group by and filter on the max-date record in a Java List
I have a Java POJO for collecting metrics, shown below:
public class Metric {
    Long metricId;
    Long resultKeyId;
    @NonNull DatasetType datasetType;
    @NonNull String datasetName;
    @NonNull String analyzerName;
    @NonNull String constraintAlias;
    @NonNull LocalDateTime entityDate;
    @NonNull long entityDurationSec;
    @NonNull Double metricValue;
    @NonNull String changedBy;
    Long jobId = 0L;
    Long codeArtifactId = 0L;
    LocalDateTime createdAt;
    LocalDateTime lastChanged;
}
I have a list of metrics built from the above POJO: List<Metric> metrics.
This list can contain multiple items, and I want to select only one record for each combination of resultKeyId, datasetType, datasetName, analyzerName and constraintAlias: the one with the max createdAt.
The SQL representation of this would be something like:
select a.* from
dataval_metric a
join dataval_metric b
on a.result_key_id=b.result_key_id
and a.dataset_type=b.dataset_type
and a.dataset_name=b.dataset_name
and a.analyzer_name=b.analyzer_name
and a.constraint_alias=b.constraint_alias
where a.result_key_id = 434
and a.mysql_row_created_at >= b.mysql_row_created_at;
I'm looking for pointers on how this can be done in a performant way in Java.
Solution 1:[1]
You can use the Collectors.groupingBy method with the grouping fields as the key, and Collectors.maxBy as the downstream collector.
The key can be:
- a List:
Map<List<Object>, Optional<Metric>> map = metrics.stream()
    .collect(Collectors.groupingBy(m ->
        List.of(m.getResultKeyId(),
                m.getDatasetType(),
                m.getDatasetName(),
                m.getAnalyzerName(),
                m.getConstraintAlias()),
        Collectors.maxBy(Comparator.comparing(Metric::getCreatedAt))));
- an object of type Metric, if you override the equals and hashCode methods so that they are based only on the fields you want to group by:
Map<Metric, Optional<Metric>> map = metrics.stream()
    .collect(Collectors.groupingBy(m -> m,
        Collectors.maxBy(Comparator.comparing(Metric::getCreatedAt))));
- another object with equals and hashCode overridden, such as Quintet from the javatuples library:
Map<Quintet, Optional<Metric>> map = metrics.stream()
    .collect(Collectors.groupingBy(m ->
        new Quintet(m.getResultKeyId(),
                    m.getDatasetType(),
                    m.getDatasetName(),
                    m.getAnalyzerName(),
                    m.getConstraintAlias()),
        Collectors.maxBy(Comparator.comparing(Metric::getCreatedAt))));
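All three variants above produce a Map whose values are Optional<Metric>, so one more step is needed to get back to a plain list. The following is a minimal sketch of the List-keyed variant with that step added. SimpleMetric and LatestByGroupingBy are hypothetical names, and the POJO is cut down to three key fields to keep the example short. It builds the key with Arrays.asList rather than List.of, on the assumption that resultKeyId may be null (it is not marked @NonNull in the question) and List.of rejects null elements.

```java
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class LatestByGroupingBy {

    // Groups by a composite List key and keeps the newest metric of each group.
    static List<SimpleMetric> latestPerKey(Collection<SimpleMetric> metrics) {
        Map<List<Object>, Optional<SimpleMetric>> newestByKey = metrics.stream()
            .collect(Collectors.groupingBy(
                // Arrays.asList tolerates null elements; List.of would throw on them.
                m -> Arrays.<Object>asList(m.resultKeyId(), m.datasetName(), m.analyzerName()),
                Collectors.maxBy(Comparator.comparing(SimpleMetric::createdAt))));

        // Each group holds at least one element, so every Optional is present here.
        return newestByKey.values().stream()
            .flatMap(Optional::stream)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        LocalDateTime t0 = LocalDateTime.of(2024, 1, 1, 0, 0);
        List<SimpleMetric> metrics = List.of(
            new SimpleMetric(434L, "orders", "Size", t0),
            new SimpleMetric(434L, "orders", "Size", t0.plusHours(1)),
            new SimpleMetric(434L, "orders", "Mean", t0));
        latestPerKey(metrics).forEach(System.out::println);
    }
}

// Hypothetical, trimmed-down stand-in for the Metric POJO from the question (Java 16+ record).
record SimpleMetric(Long resultKeyId, String datasetName, String analyzerName,
                    LocalDateTime createdAt) {}
```

The result order follows the HashMap that groupingBy creates by default, so it should not be relied on. Note also that Comparator.comparing will throw if createdAt is null for any element.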
Solution 2:[2]
We can use Collectors.toMap, which maps a key, represented here as a record MetricKey (basically just a tuple of the fields you need to group by), to a Metric. Since toMap throws on duplicate keys by default, we also provide a merge function that always keeps the metric with the maximum createdAt in the map.
So I would propose to add the getKey method to the Metric class so that it returns the key as a record or as a custom class which overrides equals and hashCode.
class Metric {
    // ... all your fields
    // The component order must match the argument order used in getKey()
    record MetricKey(Long resultKeyId, DatasetType datasetType, String datasetName,
                     String analyzerName, String constraintAlias) { }
    public MetricKey getKey() {
        return new MetricKey(resultKeyId, datasetType, datasetName,
                             analyzerName, constraintAlias);
    }
    public LocalDateTime getCreatedAt() {
        return createdAt;
    }
}
And the data processing pipeline:
List<Metric> maximums = new ArrayList<>(metrics.stream().collect(
        Collectors.toMap(
            Metric::getKey,
            Function.identity(),
            // on a key collision, keep whichever metric was created later
            (m1, m2) -> m1.getCreatedAt().isAfter(m2.getCreatedAt()) ? m1 : m2))
    .values());
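As a variation, the merge function passed to toMap can be built from a comparator with BinaryOperator.maxBy instead of being written by hand. The sketch below assumes Java 16+ for records. Reading, ReadingKey and LatestByToMap are hypothetical stand-ins for Metric and MetricKey, reduced to three key fields. The LinkedHashMap supplier is an optional extra that keeps keys in first-seen order.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class LatestByToMap {

    // Keeps one Reading per key: the one with the latest createdAt.
    static List<Reading> latestPerKey(Collection<Reading> readings) {
        return new ArrayList<>(readings.stream().collect(Collectors.toMap(
                Reading::key,
                Function.identity(),
                // merge function: of two readings with the same key, keep the later one
                BinaryOperator.maxBy(Comparator.comparing(Reading::createdAt)),
                // optional: preserve the order in which keys were first seen
                LinkedHashMap::new))
            .values());
    }

    public static void main(String[] args) {
        LocalDateTime t0 = LocalDateTime.of(2024, 1, 1, 0, 0);
        List<Reading> readings = List.of(
            new Reading(434L, "orders", "Size", t0),
            new Reading(434L, "orders", "Size", t0.plusDays(1)),
            new Reading(435L, "orders", "Size", t0));
        latestPerKey(readings).forEach(System.out::println);
    }
}

// Hypothetical grouping key; a record gets equals and hashCode generated from its components.
record ReadingKey(Long resultKeyId, String datasetName, String analyzerName) {}

// Hypothetical, trimmed-down stand-in for the Metric class.
record Reading(Long resultKeyId, String datasetName, String analyzerName,
               LocalDateTime createdAt) {
    ReadingKey key() {
        return new ReadingKey(resultKeyId, datasetName, analyzerName);
    }
}
```

Both this and the groupingBy approach make a single pass over the list with hash lookups per element, so either should be adequate for the performance concern raised in the question.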
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | frascu |
| Solution 2 | |
