Collect the data from both Nested List and Outer List with Java Streams

class Package {
  Long packageId;
  String packageName;
  List<Service> serviceList;
}

class Service {
  Long serviceId;
  String location;
}

List<Package> packageList;

Above are my classes. The requirement is to collect packageId from the outer list packageList and serviceId from the inner list (the services of each Package) where packageName equals "Full" and location equals "Japan". I also need to know whether such a record exists. This is what I've written so far, collecting the data into a HashMap.

HashMap<String,Object> stringObjectHashMap = new HashMap<>();
boolean isFound = packageList.stream()
                .filter(pack -> "Full".equalsIgnoreCase(pack.getPackageName()))
                .peek(pack -> stringObjectHashMap.put("packageId", pack.getPackageId()))
                .flatMap(pack -> pack.getServiceList().stream())
                .filter(service -> service.getLocation().equalsIgnoreCase("Japan"))
                .peek(service -> stringObjectHashMap.put("serviceId",service.getServiceId()))
                .anyMatch(service -> service.getLocation().equalsIgnoreCase("Japan"));

The problem is that Sonar complains about the use of peek() inside the stream. It says "According to its JavaDocs, the intermediate Stream operation java.util.Stream.peek() “exists mainly to support debugging” purposes."

Can anyone suggest a better solution?

sample input:

[
    {
        "packageId": 13,
        "packageName": "Normal",
        "serviceList": [
            {
                "serviceId": "100",
                "location": "China"
            }
        ]
    },
    {
        "packageId": 10,
        "packageName": "Full",
        "serviceList": [
            {
                "serviceId": "100",
                "location": "Spain"
            }
        ]
    },
     {
        "packageId": 5,
        "packageName": "Full",
        "serviceList": [
            {
                "serviceId": "100",
                "location": "Japan"
            }
        ]
    }
]

expected output:

"packageId": 5    //this is from outer list
"serviceId": 100  //this is from inner list

Note: a "Full" package can appear multiple times, and the location "Japan" can also appear multiple times in services. However, the combination of packageName "Full" and location "Japan" can appear only once.



Solution 1:[1]

By using .flatMap(...) on each package's service list, you merge the services of all packages into one big stream and lose track of which package each service belongs to.

If I understand the task correctly, you want to select all packages named "Full" and then, for each such package, select any service in that package's list with location "Japan", falling back to a default if there is no such service. Note that Collectors.toMap throws a NullPointerException when the value mapper returns null, so the fallback cannot be null. You can collect to such a map like this:

Map<Long, Long> serviceIdByPackageId = packageList.stream()
            .filter(p -> "Full".equalsIgnoreCase(p.getPackageName()))
            .collect(Collectors.toMap(
                    Package::getPackageId,
                    p -> p.getServiceList().stream()
                            .filter(service -> "Japan".equalsIgnoreCase(service.getLocation()))
                            .map(Service::getServiceId)
                            .findFirst().orElse(-1L) // -1L is an arbitrary stand-in default; toMap rejects null values
            ));

Note: Collectors.toMap makes no guarantees about the type of the Map it returns, so don't rely on it being a HashMap; you might need to make adjustments if you need a specific implementation.
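If a specific Map implementation matters, one option is the four-argument overload of Collectors.toMap, which takes a map supplier. The sketch below is only an illustration: ToMapSupplierDemo, firstJapanServiceId, japanServiceIds, sampleData and the stand-in Package/Service classes are assumed names, and packages without a service in Japan are filtered out beforehand, so no fallback value is needed.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class ToMapSupplierDemo {
    // Minimal stand-ins for the classes from the question.
    static class Service {
        private final Long serviceId;
        private final String location;
        Service(Long serviceId, String location) { this.serviceId = serviceId; this.location = location; }
        Long getServiceId() { return serviceId; }
        String getLocation() { return location; }
    }

    static class Package {
        private final Long packageId;
        private final String packageName;
        private final List<Service> serviceList;
        Package(Long packageId, String packageName, List<Service> serviceList) {
            this.packageId = packageId;
            this.packageName = packageName;
            this.serviceList = serviceList;
        }
        Long getPackageId() { return packageId; }
        String getPackageName() { return packageName; }
        List<Service> getServiceList() { return serviceList; }
    }

    // Id of the first service located in Japan, if the package has one.
    static Optional<Long> firstJapanServiceId(Package p) {
        return p.getServiceList().stream()
                .filter(s -> "Japan".equalsIgnoreCase(s.getLocation()))
                .map(Service::getServiceId)
                .findFirst();
    }

    // packageId -> serviceId for every "Full" package that has a service in Japan.
    static HashMap<Long, Long> japanServiceIds(List<Package> packageList) {
        return packageList.stream()
                .filter(p -> "Full".equalsIgnoreCase(p.getPackageName()))
                .filter(p -> firstJapanServiceId(p).isPresent())
                .collect(Collectors.toMap(
                        Package::getPackageId,
                        p -> firstJapanServiceId(p).get(),
                        (first, second) -> first, // keep the first value on duplicate keys
                        HashMap::new));           // explicit map implementation
    }

    // Mirrors the sample input from the question (serviceId as Long, per the class definition).
    static List<Package> sampleData() {
        return List.of(
                new Package(13L, "Normal", List.of(new Service(100L, "China"))),
                new Package(10L, "Full", List.of(new Service(100L, "Spain"))),
                new Package(5L, "Full", List.of(new Service(100L, "Japan"))));
    }
}
```

The supplier can be swapped for LinkedHashMap::new or TreeMap::new if ordering matters.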

Personally, I find such cases more easily solvable by using regular for-loops.

Solution 2:[2]

What you could do is filter the packages to find all "Full" packages, then flatten their service lists and keep only the services located in Japan. (Stream.toList() requires Java 16 or later; on older versions use collect(Collectors.toList()).)

        List<Package> fullPackages = packageList
                .stream()
                .filter(p -> p.getPackageName().equals("Full"))
                .toList();

        List<Service> fromJapan = fullPackages
                .stream()
                .flatMap(p -> p.getServiceList().stream())
                .filter(service -> service.getLocation().equalsIgnoreCase("Japan"))
                .toList();

Also, from what it looks like, the keys in your map are not unique (they are always "packageId" and "serviceId"), so each put overwrites the previous value.
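One way to avoid the overwriting is to pair each service with its package's id before flattening, so the two ids travel through the stream together. A hedged sketch follows; PairingDemo, findIds, sampleData and the stand-in classes are illustrative names rather than part of the question's code, and Map.entry requires Java 9 or later.

```java
import java.util.List;
import java.util.Map;

class PairingDemo {
    // Minimal stand-ins for the classes from the question.
    static class Service {
        private final Long serviceId;
        private final String location;
        Service(Long serviceId, String location) { this.serviceId = serviceId; this.location = location; }
        Long getServiceId() { return serviceId; }
        String getLocation() { return location; }
    }

    static class Package {
        private final Long packageId;
        private final String packageName;
        private final List<Service> serviceList;
        Package(Long packageId, String packageName, List<Service> serviceList) {
            this.packageId = packageId;
            this.packageName = packageName;
            this.serviceList = serviceList;
        }
        Long getPackageId() { return packageId; }
        String getPackageName() { return packageName; }
        List<Service> getServiceList() { return serviceList; }
    }

    // Returns {"packageId": ..., "serviceId": ...} for the first match, or an empty map.
    static Map<String, Long> findIds(List<Package> packageList) {
        return packageList.stream()
                .filter(p -> "Full".equalsIgnoreCase(p.getPackageName()))
                // Pair every Japan service with the id of the package it came from.
                .flatMap(p -> p.getServiceList().stream()
                        .filter(s -> "Japan".equalsIgnoreCase(s.getLocation()))
                        .map(s -> Map.entry(p.getPackageId(), s.getServiceId())))
                .findFirst()
                .map(e -> Map.of("packageId", e.getKey(), "serviceId", e.getValue()))
                .orElse(Map.of());
    }

    // Mirrors the sample input from the question (serviceId as Long, per the class definition).
    static List<Package> sampleData() {
        return List.of(
                new Package(13L, "Normal", List.of(new Service(100L, "China"))),
                new Package(10L, "Full", List.of(new Service(100L, "Spain"))),
                new Package(5L, "Full", List.of(new Service(100L, "Japan"))));
    }
}
```

Because the map is built once from a single matching pair, nothing is written for packages that turn out to have no service in Japan.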

Solution 3:[3]

Stream-based solution

Assuming that the combination of a packageName of "Full" and a service location of "Japan" is unique, and that every packageId is unique, we can create an intermediate map that associates each packageId with an Optional of a service located in Japan, and then generate the final result from it.

Map<String, Long> ids = packageList.stream()
    .filter(pack -> "Full".equalsIgnoreCase(pack.getPackageName()))
    .collect(Collectors.toMap( // Map<Long, Optional<Service>> - intermediate map
        Package::getPackageId,
        pack -> pack.getServiceList().stream()
            .filter(service -> service.getLocation().equalsIgnoreCase("Japan"))
            .findFirst()))
    .entrySet().stream()
    .filter(entry -> entry.getValue().isPresent())
    .findFirst()
    .map(entry -> Map.of("packageId", entry.getKey(), 
                "serviceId", entry.getValue().get().getServiceId()))
    .orElse(Collections.emptyMap());

Can we do better? We can: the current solution processes all the data, which is unnecessary once the target package and service have been found.

We can make the solution short-circuit by using the findFirst() operation, which also avoids the second pass over the collected data.

It would actually be much handier to use a simple DTO, for instance a Java 16 record with two attributes, rather than a map. But because your existing code depends on a map as the result, I'll continue with a map, and internally I'll use Map.Entry as an intermediate means of storing data.

Map<String, Long> ids = packageList.stream()
    .filter(pack -> "Full".equalsIgnoreCase(pack.getPackageName()))
    .map(pack -> Map.entry(pack.getPackageId(), pack.getServiceList().stream()
        .filter(service -> service.getLocation().equalsIgnoreCase("Japan"))
        .findFirst()))
    .filter(entry -> entry.getValue().isPresent())
    .findFirst()   // <- will produce an intermediate result as Optional<Map.Entry<Long, Optional<Service>>>
    .map(entry -> Map.of("packageId", entry.getKey(), 
                "serviceId", entry.getValue().get().getServiceId()))
    .orElse(Collections.emptyMap());

This version is more performant. It terminates the stream execution as soon as the combination of a "Full" package with a service located in "Japan" is encountered, instead of processing the whole data set.

It's much better now, but there is one issue we can't resolve: it is not possible to eliminate the nested stream (because we need to iterate over the list of services), which hurts readability.

Reminder: streams were introduced in Java as a means of structuring code in a simple, readable way, not the opposite. Plain loops also tend to have less overhead than sequential streams.
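For reference, the record-based DTO mentioned earlier could be sketched roughly as follows. This is only an illustration under assumptions: the names RecordDemo, Match, findMatch and sampleData are hypothetical, records (Java 16+) stand in for the question's classes, so the accessors are packageId() rather than getPackageId(), and Optional::stream requires Java 9 or later.

```java
import java.util.List;
import java.util.Optional;

class RecordDemo {
    // Records standing in for the classes from the question.
    record Service(Long serviceId, String location) {}
    record Package(Long packageId, String packageName, List<Service> serviceList) {}

    // Simple DTO holding the two ids that belong together.
    record Match(Long packageId, Long serviceId) {}

    static Optional<Match> findMatch(List<Package> packageList) {
        return packageList.stream()
            .filter(pack -> "Full".equalsIgnoreCase(pack.packageName()))
            // For each "Full" package: its first Japan service, wrapped in a Match.
            .map(pack -> pack.serviceList().stream()
                .filter(service -> "Japan".equalsIgnoreCase(service.location()))
                .findFirst()
                .map(service -> new Match(pack.packageId(), service.serviceId())))
            .flatMap(Optional::stream) // drop packages without a Japan service
            .findFirst();              // short-circuits on the first match
    }

    // Mirrors the sample input from the question (serviceId as Long, per the class definition).
    static List<Package> sampleData() {
        return List.of(
            new Package(13L, "Normal", List.of(new Service(100L, "China"))),
            new Package(10L, "Full", List.of(new Service(100L, "Spain"))),
            new Package(5L, "Full", List.of(new Service(100L, "Japan"))));
    }
}
```

With an Optional<Match> as the result, the "is found" flag from the question is simply findMatch(packageList).isPresent().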

Imperative solution

Now let's implement the same logic using an imperative approach.

Map<String, Long> result = new HashMap<>();

for (Package pack: packageList) {
    if (!pack.getPackageName().equalsIgnoreCase("Full")) continue;
            
    for (Service service: pack.getServiceList()) {
        if (service.getLocation().equalsIgnoreCase("Japan")) {
            result.put("packageId", pack.getPackageId());
            result.put("serviceId", service.getServiceId());
            break;
        }
    }
}

Conditional logic is much easier to implement using plain loops. Very concise, easy to read, easy to maintain.

As a result, the imperative solution is not only much cleaner but also more performant, because it performs no unnecessary actions: the map is updated only once.

You can also compare it with your initial approach that uses side effects via peek() which continuously updates the map.
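If the loop is wrapped in a method, an early return can leave both loops at once (the break above only exits the inner loop), and the found flag from the question falls out of whether the map is empty. A minimal sketch, assuming the hypothetical names LoopDemo, findIds and sampleData and stand-in classes for Package and Service:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LoopDemo {
    // Minimal stand-ins for the classes from the question.
    static class Service {
        private final Long serviceId;
        private final String location;
        Service(Long serviceId, String location) { this.serviceId = serviceId; this.location = location; }
        Long getServiceId() { return serviceId; }
        String getLocation() { return location; }
    }

    static class Package {
        private final Long packageId;
        private final String packageName;
        private final List<Service> serviceList;
        Package(Long packageId, String packageName, List<Service> serviceList) {
            this.packageId = packageId;
            this.packageName = packageName;
            this.serviceList = serviceList;
        }
        Long getPackageId() { return packageId; }
        String getPackageName() { return packageName; }
        List<Service> getServiceList() { return serviceList; }
    }

    // Returns both ids for the first match, or an empty map if there is none.
    static Map<String, Long> findIds(List<Package> packageList) {
        Map<String, Long> result = new HashMap<>();
        for (Package pack : packageList) {
            if (!"Full".equalsIgnoreCase(pack.getPackageName())) continue;

            for (Service service : pack.getServiceList()) {
                if ("Japan".equalsIgnoreCase(service.getLocation())) {
                    result.put("packageId", pack.getPackageId());
                    result.put("serviceId", service.getServiceId());
                    return result; // leaves both loops as soon as a match is found
                }
            }
        }
        return result; // empty: no "Full" package with a service in Japan
    }

    // Mirrors the sample input from the question (serviceId as Long, per the class definition).
    static List<Package> sampleData() {
        return List.of(
                new Package(13L, "Normal", List.of(new Service(100L, "China"))),
                new Package(10L, "Full", List.of(new Service(100L, "Spain"))),
                new Package(5L, "Full", List.of(new Service(100L, "Japan"))));
    }
}
```

With this, boolean isFound = !findIds(packageList).isEmpty(); replaces the anyMatch call from the question.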

Solution 4:[4]

If you're using Java 12 or above, you could achieve your goal by using the teeing operation to create a first downstream of Packages and a second downstream of Services. Both of them would return a Map<String, Long> containing the fixed Strings with the expected ids. Also, if you need to know if the record has been found you could check the number of entries of your returned Map.

Map<String, Long> mapRes = packageList.stream()
        .filter(pack -> pack.getPackageName().equalsIgnoreCase("Full") && pack.getServiceList().stream().anyMatch(s -> s.getLocation().equalsIgnoreCase("Japan")))
        .collect(Collectors.teeing(
                Collectors.toMap(p -> "packageId", Package::getPackageId, (id1, id2) -> id1),
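                // keep only services located in Japan, otherwise the first service of the package would be picked regardless of location
                Collectors.flatMapping(pack -> pack.getServiceList().stream().filter(s -> s.getLocation().equalsIgnoreCase("Japan")), Collectors.toMap(s -> "serviceId", Service::getServiceId, (id1, id2) -> id1)),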
                (map1, map2) -> {
                    map1.putAll(map2);
                    return map1;
                }));

boolean isFound = mapRes.size() == 2;

Alternatively, if Sonar flags the misuse of the peek operation and you still want to use side-effecting lambdas (although this is not advisable), you could replace each peek with a map that performs the same operation and then returns the given element. This is your original code with each peek replaced by map.

boolean isFound = packageList.stream()
        .filter(pack -> "Full".equalsIgnoreCase(pack.getPackageName()))
        .map(pack -> {
            stringObjectHashMap.put("packageId", pack.getPackageId());
            return pack;
        })
        .flatMap(pack -> pack.getServiceList().stream())
        .filter(service -> service.getLocation().equalsIgnoreCase("Japan"))
        .map(service -> {
            stringObjectHashMap.put("serviceId", service.getServiceId());
            return service;
        })
        .anyMatch(service -> service.getLocation().equalsIgnoreCase("Japan"));

Here is also a link to test the code:

https://ideone.com/uV5PIQ

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3
Solution 4