'Finding set entries of map which has common values

I have a collection as follows

Map<String, Set<Long>> myMap = new HashMap<>();

I want to find out if any entry in this map has set which is contained in another entry of same map.

For example, lets say map has the following 5 entries

a - {1, 2, 3}
b - {4, 5}
c - {1}
d - {2, 3}
e - {5}
f - {6}

So, it has the following overlapping entries as set maybe

a - {1, 2, 3}  and c - {1} 
b - {4, 5}     and e - {5}
a - {1, 2, 3}  and d - {2, 3}

Or just list of Set for keys like

a and c
b and e
a and d

I could iterate each of the keyset and then use disjoint or anyMatch for each set, but I was wondering if there is an optimized way (Java 8, 9, 10, 11).



Solution 1:[1]

Sort the Map by number of elements in the Set in descending order, i.e. first entry in sorted Map contains the Set with the most elements. In your example, this would be the following entry:

a - {1, 2, 3}

After sorting, take the second entry and see if its Set is a subset of the first entry's Set. The second entry would be either

b - {4, 5}

or

d - {2, 3}

Then take the third entry and see if its Set is a subset of the second entry or the first entry. Continue until you have reached the last entry.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class Subsets {

    public static void main(String args[]) {
        Map<String, Set<Long>> myMap = Map.of("a", Set.of(1L, 2L, 3L),
                                              "b", Set.of(4L, 5L),
                                              "c", Set.of(1L),
                                              "d", Set.of(2L, 3L),
                                              "e", Set.of(5L),
                                              "f", Set.of(6L));
        List<String> sortedKeys = myMap.entrySet()
                                       .stream()
                                       .sorted((e1, e2) -> (e2.getValue() == null ? 0 : e2.getValue().size()) - (e1.getValue() == null ? 0 : e1.getValue().size()))
                                       .map(e -> e.getKey())
                                       .collect(Collectors.toList());
        String[] keyArray = new String[sortedKeys.size()];
        sortedKeys.toArray(keyArray);
        List<String[]> subsets = new ArrayList<>();
        for (int i = 1; i < keyArray.length; i++) {
            Set<Long> source = myMap.get(keyArray[i]);
            for (int j = i - 1; j >= 0; j--) {
                Set<Long> target = myMap.get(keyArray[j]);
                if (target.containsAll(source)) {
                    subsets.add(new String[]{keyArray[i], keyArray[j]});
                }
            }
        }
        subsets.forEach(arr -> System.out.println(Arrays.toString(arr)));
    }
}

Output produced when above code is executed as follows:

[d, a]
[e, b]
[c, a]

In other words:

  • d is a subset of a
  • e is a subset of b
  • c is a subset of a

Total number of iterations in the nested for loops is 15, i.e. O(nlog2n) as opposed to 36 iterations, i.e. O(n2) in the other answer.

Note that if you can create the initial Map in sorted order then there is no need to perform the sort. One way to do this would be to use LinkedHashMap as your Map implementation.

myMap = new LinkedHashMap<>();
myMap.put("a", Set.of(1L, 2L, 3L));
myMap.put("b", Set.of(4L, 5L));
myMap.put("d", Set.of(2L, 3L));
myMap.put("c", Set.of(1L));
myMap.put("e", Set.of(5L));
myMap.put("f", Set.of(6L));

Solution 2:[2]

import java.util.*;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class Main {
    public static void main(String[] args) {
        Map<String, Set<Long>> map = new HashMap<>();
        map.put("a", Set.of(1l, 2l, 3l));
        map.put("b", Set.of(4l, 5l));
        map.put("c", Set.of(1l));
        map.put("d", Set.of(2l, 3l));
        map.put("e", Set.of(5l));
        map.put("f", Set.of(6l));

        Set<Map.Entry<String, String>> result = map.entrySet()
                .stream().map(
                        (source) -> map.entrySet().stream()
                                .takeWhile((pair) -> pair.getValue() != source.getValue())
                                .filter((pair) -> pair.getValue().containsAll(source.getValue()))
                                .map((pair) -> Map.entry(pair.getKey(), source.getKey()))
                ).reduce(Stream.empty(), Stream::concat).collect(Collectors.toSet());
        System.out.println(result);
    }

First, the entire map is iterated over and converted into a Stream<Stream<Map.Entry<String, String>>. By reduce(Stream.empty(), Stream::concat).collect(Collectors.toSet()); this becomes the output type in which all streams are merged together. To get the individual Stream<Map.Entry<String, String>> for each entry in the source map, we again iterate over each entry in the source list, but this time takeWhile((pair) -> pair.getValue() != source.getValue()) ignores all values that have not already appeared in the first loop, thus preventing duplicate results like (a, b) and (b, a) and also those that contain the same value twice like (a, a). Now pair.getValue().containsAll(source.getValue()) can be used to check if the entry from the inner stream contains the entry from the outer stream and remove it from the stream if there is no match, this result then only has to be converted into a Stream<Map.Entry<String, String>>, which is achieved by map((pair) -> Map.entry(pair.getKey(), source.getKey())).

The output of this code is:

[a=c, a=d, b=e]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Goldenprime