Removing duplicates from a list of tuples

I have a list of Tuple<string,string> objects and I want to remove duplicates where, for example, the tuples (a,b) and (b,a) are considered the same (these are the edges of a graph). What is a nice way to do this?



Solution 1:[1]

You might need to create a class that implements IEqualityComparer<Tuple<string, string>>:

public class TupleComparer : IEqualityComparer<Tuple<string, string>>
{
    public bool Equals(Tuple<string, string> x, Tuple<string, string> y)
    {

        if (ReferenceEquals(x, y))
        {
            return true;
        }

        if (ReferenceEquals(x, null) || ReferenceEquals(y, null))
        {
            return false;
        }

        if (x.Item1.Equals(y.Item2) && x.Item2.Equals(y.Item1))
        {
            return true;
        }

        return x.Item1.Equals(y.Item1) && x.Item2.Equals(y.Item2);
    }   

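    public int GetHashCode(Tuple<string, string> tuple)
    {
        // Must be order-independent so that (a, b) and (b, a) hash alike.
        // XOR is symmetric, so swapping the items gives the same value.
        return tuple.Item1.GetHashCode() ^ tuple.Item2.GetHashCode();
    }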
}

You could then use the Distinct() LINQ method like this:

List<Tuple<string, string>> list = new List<Tuple<string, string>> { Tuple.Create("a", "b"), Tuple.Create("a", "c"), Tuple.Create("b", "a") };
var result = list.Distinct(new TupleComparer());
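
As a sketch of how the pieces fit together, a comparer like this can also be passed to a HashSet<T>, which rejects duplicates as edges are added. The class name UndirectedEdgeComparer is hypothetical, and the hash shown is only one possible order-independent choice; the code assumes neither the tuples' items nor the strings are null.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: treats (a, b) and (b, a) as the same edge.
// Assumes the items inside each tuple are never null.
public class UndirectedEdgeComparer : IEqualityComparer<Tuple<string, string>>
{
    public bool Equals(Tuple<string, string> x, Tuple<string, string> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;

        return (x.Item1 == y.Item1 && x.Item2 == y.Item2)
            || (x.Item1 == y.Item2 && x.Item2 == y.Item1);
    }

    public int GetHashCode(Tuple<string, string> t)
    {
        // XOR is symmetric, so both orientations produce the same hash.
        return t.Item1.GetHashCode() ^ t.Item2.GetHashCode();
    }
}

public class Program
{
    public static void Main()
    {
        var edges = new HashSet<Tuple<string, string>>(new UndirectedEdgeComparer());

        // Add returns false when an equivalent edge is already present.
        Console.WriteLine(edges.Add(Tuple.Create("a", "b"))); // True
        Console.WriteLine(edges.Add(Tuple.Create("a", "c"))); // True
        Console.WriteLine(edges.Add(Tuple.Create("b", "a"))); // False
        Console.WriteLine(edges.Count);                       // 2
    }
}
```

Whatever hash is used, the key requirement is that two tuples the comparer considers equal return the same hash code; otherwise Distinct and HashSet will not detect the duplicate.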

Solution 2:[2]

Try using a dictionary with a made-up key that denotes each tuple. Do you have a character that won't appear in your strings and that you can use as a delimiter? I chose ":" in this example:

static void Main(string[] args)
{
    // original list of data
    var list = new List<Tuple<string, string>> { };
    list.Add(new Tuple<string, string>("a", "b"));
    list.Add(new Tuple<string, string>("b", "a"));

    // dictionary to hold unique tuples
    var dict = new Dictionary<string, Tuple<string, string>>();
    foreach (var item in list)
    {
        var key1 = string.Concat(item.Item1, ":", item.Item2);
        var key2 = string.Concat(item.Item2, ":", item.Item1);

        // if dict doesn't contain the tuple under either key order, add it.
        if (!dict.ContainsKey(key1) && !dict.ContainsKey(key2))
            dict.Add(key1, item);
    }

    // print unique tuples
    foreach (var item in dict)
    {
        var tuple = item.Value;
        Console.WriteLine(string.Concat(tuple.Item1, ":", tuple.Item2));
    }

    Console.ReadKey();
}
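
If no safe delimiter exists, the string key can collide: for instance, ("a:b", "c") and ("a", "b:c") both produce the key "a:b:c". One way to sidestep this, sketched below, is to skip the string key and store a normalized tuple (smaller item first) in a HashSet. The helper name Normalize is made up for illustration, and ordinal comparison is an arbitrary but culture-independent choice.

```csharp
using System;
using System.Collections.Generic;

public class Program
{
    // Hypothetical helper: returns the pair with the ordinally smaller item first.
    static Tuple<string, string> Normalize(Tuple<string, string> t)
    {
        return string.CompareOrdinal(t.Item1, t.Item2) <= 0
            ? t
            : Tuple.Create(t.Item2, t.Item1);
    }

    public static void Main()
    {
        var list = new List<Tuple<string, string>>
        {
            Tuple.Create("a", "b"),
            Tuple.Create("b", "a"),     // duplicate of (a, b)
            Tuple.Create("a:b", "c"),   // distinct from the next entry,
            Tuple.Create("a", "b:c"),   // even though both contain ":"
        };

        // Tuple<,> compares by value, so it can serve as a set element directly.
        var seen = new HashSet<Tuple<string, string>>();
        var unique = new List<Tuple<string, string>>();

        foreach (var item in list)
        {
            // Add returns false if the normalized pair was already seen.
            if (seen.Add(Normalize(item)))
                unique.Add(item);
        }

        // Prints three lines: "a / b", "a:b / c", "a / b:c"
        foreach (var t in unique)
            Console.WriteLine("{0} / {1}", t.Item1, t.Item2);
    }
}
```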

Solution 3:[3]

To preserve the original tuples, use group by instead of Distinct, so that the first element of each group is still accessible:

Live code: https://dotnetfiddle.net/LYZItb

using System;
using System.Collections.Generic;

using System.Linq;

public class Program
{
    static List<Tuple<string, string>> myList = new List<Tuple<string, string>>()
    {
        Tuple.Create<string, string>("B", "A"),
        Tuple.Create<string, string>("A", "B"), // duplicate
        Tuple.Create<string, string>("C", "B"),
        Tuple.Create<string, string>("C", "B"), // duplicate
        Tuple.Create<string, string>("A", "D"),
        Tuple.Create<string, string>("E", "F"),
        Tuple.Create<string, string>("F", "E"), // duplicate        
    };

    public static void Main()
    {
        var result =
            from y in
                from x in myList
                select new { Original = x, SortedPair = new[] { x.Item1, x.Item2 }.OrderBy(s => s).ToArray() }
            group y by new { NormalizedTuple = Tuple.Create<string,string>(y.SortedPair[0], y.SortedPair[1]) } into grp
            select new { Pair = grp.Key.NormalizedTuple, Original = grp.First().Original };

        foreach(var item in result)
        {
            Console.WriteLine("Pair: {0} {1}", item.Original.Item1, item.Original.Item2);
        }
    }
}

Output:

Pair: B A
Pair: C B
Pair: A D
Pair: E F

Solution 4:[4]

Live Code: https://dotnetfiddle.net/LUErFj

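Sort the two items of each tuple first, then call Distinct. Note that the result contains the sorted tuples, so (C, B) comes back as (B, C). Dump() is a helper extension available on .NET Fiddle, not a LINQ method: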

using System;
using System.Collections.Generic;

using System.Linq;

public class Program
{
    static List<Tuple<string, string>> myList = new List<Tuple<string, string>>()
    {
        Tuple.Create<string, string>("A", "B"),
        Tuple.Create<string, string>("B", "A"), // duplicate
        Tuple.Create<string, string>("C", "B"),
        Tuple.Create<string, string>("C", "B"), // duplicate
        Tuple.Create<string, string>("A", "D")              
    };

    public static void Main()
    {
        myList
            .Select(x => new[] { x.Item1, x.Item2 }.OrderBy(s => s).ToArray())
            .Select(x => Tuple.Create<string,string>(x[0], x[1]))
            .Distinct()
            .Dump();
    }
}

Output:

Dumping object(System.Linq.<DistinctIterator>d__81`1[Tuple`2[String,String]])
[
   {
   Item1  : A
   Item2  : B
   ToString(): (A, B)
   },
   {
   Item1  : B
   Item2  : C
   ToString(): (B, C)
   },
   {
   Item1  : A
   Item2  : D
   ToString(): (A, D)
   }
]
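
On .NET 6 or later, DistinctBy offers a shorter route that returns the original tuples rather than sorted copies. This sketch assumes that runtime is available; the helper name Key is hypothetical, and the key it builds is a value tuple with the ordinally smaller item first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Program
{
    // Hypothetical helper: an order-independent key for an edge.
    static (string, string) Key(Tuple<string, string> t)
    {
        return string.CompareOrdinal(t.Item1, t.Item2) <= 0
            ? (t.Item1, t.Item2)
            : (t.Item2, t.Item1);
    }

    public static void Main()
    {
        var myList = new List<Tuple<string, string>>
        {
            Tuple.Create("A", "B"),
            Tuple.Create("B", "A"), // duplicate
            Tuple.Create("C", "B"),
            Tuple.Create("C", "B"), // duplicate
            Tuple.Create("A", "D"),
        };

        // DistinctBy (System.Linq, .NET 6+) yields one element per distinct key.
        var result = myList.DistinctBy(t => Key(t));

        foreach (var t in result)
            Console.WriteLine("({0}, {1})", t.Item1, t.Item2);
    }
}
```

Value tuples compare by value, so no custom comparer is needed for the key.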

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 chomba
Solution 2 plukich
Solution 3
Solution 4