Removing duplicates from a list of tuples
I have a list of Tuple<string, string> objects and I want to remove duplicates where, for example, the tuples (a,b) and (b,a) are considered the same (these are the edges of a graph). What is a nice way to do this?
Solution 1:[1]
You might need to create a class that implements IEqualityComparer<Tuple<string, string>>:
public class TupleComparer : IEqualityComparer<Tuple<string, string>>
{
public bool Equals(Tuple<string, string> x, Tuple<string, string> y)
{
if (ReferenceEquals(x, y))
{
return true;
}
if (ReferenceEquals(x, null) || ReferenceEquals(y, null))
{
return false;
}
if (x.Item1.Equals(y.Item2) && x.Item2.Equals(y.Item1))
{
return true;
}
return x.Item1.Equals(y.Item1) && x.Item2.Equals(y.Item2);
}
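public int GetHashCode(Tuple<string, string> tuple)
{
// The hash must not depend on item order, otherwise (a,b) and (b,a)
// would get different hash codes and Equals would never be consulted.
// XOR is commutative, so swapping the items gives the same value.
if (ReferenceEquals(tuple, null))
{
return 0;
}
int h1 = tuple.Item1 == null ? 0 : tuple.Item1.GetHashCode();
int h2 = tuple.Item2 == null ? 0 : tuple.Item2.GetHashCode();
return h1 ^ h2;
}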
}
You could then use the Distinct() LINQ method like this:
List<Tuple<string, string>> list = new List<Tuple<string, string>> { Tuple.Create("a", "b"), Tuple.Create("a", "c"), Tuple.Create("b", "a") };
var result = list.Distinct(new TupleComparer());
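Distinct() is hash-based, so two tuples are only compared with Equals when their hash codes match. The comparer therefore needs a hash that is identical for (a,b) and (b,a). The following is a minimal self-contained sketch of that idea, not the answerer's code: the names UnorderedPairComparer and EdgeDemo are made up for illustration, and XOR is just one simple symmetric way to combine the two hashes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical name; a compact comparer that treats (a,b) and (b,a) as equal.
public sealed class UnorderedPairComparer : IEqualityComparer<Tuple<string, string>>
{
    public bool Equals(Tuple<string, string> x, Tuple<string, string> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        // Equal in the same order, or equal with the items swapped.
        return (x.Item1 == y.Item1 && x.Item2 == y.Item2)
            || (x.Item1 == y.Item2 && x.Item2 == y.Item1);
    }

    public int GetHashCode(Tuple<string, string> t)
    {
        if (t is null) return 0;
        // XOR is commutative, so swapping the items yields the same hash.
        int h1 = t.Item1?.GetHashCode() ?? 0;
        int h2 = t.Item2?.GetHashCode() ?? 0;
        return h1 ^ h2;
    }
}

public static class EdgeDemo
{
    // Removes duplicates regardless of item order.
    public static List<Tuple<string, string>> Dedupe(IEnumerable<Tuple<string, string>> edges)
    {
        return edges.Distinct(new UnorderedPairComparer()).ToList();
    }

    public static void Main()
    {
        var list = new List<Tuple<string, string>>
        {
            Tuple.Create("a", "b"), Tuple.Create("a", "c"), Tuple.Create("b", "a")
        };
        // (b, a) is dropped as a duplicate of (a, b), so two edges remain.
        foreach (var edge in Dedupe(list))
            Console.WriteLine(edge);
    }
}
```

One trade-off of XOR: every self-loop such as (a,a) hashes to 0. That is still correct, only slower if the data contains many such pairs.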
Solution 2:[2]
Try using a dictionary with a made-up key that identifies each tuple. Do you have a character that won't appear in your strings and that you can use as a delimiter? I chose ":" in this example:
static void Main(string[] args)
{
// original list of data
var list = new List<Tuple<string, string>> { };
list.Add(new Tuple<string, string>("a", "b"));
list.Add(new Tuple<string, string>("b", "a"));
// dictionary to hold unique tuples
var dict = new Dictionary<string, Tuple<string, string>>();
foreach (var item in list)
{
var key1 = string.Concat(item.Item1, ":", item.Item2);
var key2 = string.Concat(item.Item2, ":", item.Item1);
// if the dictionary contains the tuple in neither orientation, add it.
if (!dict.ContainsKey(key1) && !dict.ContainsKey(key2))
dict.Add(key1, item);
}
// print unique tuples
foreach (var item in dict)
{
var tuple = item.Value;
Console.WriteLine(string.Concat(tuple.Item1, ":", tuple.Item2));
}
Console.ReadKey();
}
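If no safe delimiter exists, the key does not have to be a string. Below is a sketch of a delimiter-free variant, assuming ordinal string ordering is acceptable for deciding which item goes first: each pair is normalized so that the smaller string comes first, and the normalized tuple itself is the key. Tuple<string, string> compares by value, so one lookup replaces the two ContainsKey calls. The names EdgeKeys, Normalize and DedupeKeepFirst are illustrative only.

```csharp
using System;
using System.Collections.Generic;

public static class EdgeKeys
{
    // Puts the ordinally smaller string first so (a,b) and (b,a) produce the same key.
    public static Tuple<string, string> Normalize(Tuple<string, string> t)
    {
        return string.CompareOrdinal(t.Item1, t.Item2) <= 0
            ? t
            : Tuple.Create(t.Item2, t.Item1);
    }

    // Keeps the first tuple seen for each unordered pair, in its original orientation.
    public static List<Tuple<string, string>> DedupeKeepFirst(IEnumerable<Tuple<string, string>> edges)
    {
        var seen = new HashSet<Tuple<string, string>>();
        var result = new List<Tuple<string, string>>();
        foreach (var edge in edges)
        {
            // HashSet.Add returns false when the key is already present.
            if (seen.Add(Normalize(edge)))
                result.Add(edge);
        }
        return result;
    }

    public static void Main()
    {
        var list = new List<Tuple<string, string>>
        {
            Tuple.Create("a", "b"),
            Tuple.Create("b", "a"),
            Tuple.Create("a:b", "c") // a ":" inside the data causes no ambiguity here
        };
        foreach (var edge in DedupeKeepFirst(list))
            Console.WriteLine(edge.Item1 + " / " + edge.Item2);
        // prints:
        // a / b
        // a:b / c
    }
}
```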
Solution 3:[3]
To preserve the original orientation of each tuple, group by a sorted copy of the pair instead of calling Distinct. Each group's first element is then the first tuple that appeared in the list:
Live code: https://dotnetfiddle.net/LYZItb
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
static List<Tuple<string, string>> myList = new List<Tuple<string, string>>()
{
Tuple.Create<string, string>("B", "A"),
Tuple.Create<string, string>("A", "B"), // duplicate
Tuple.Create<string, string>("C", "B"),
Tuple.Create<string, string>("C", "B"), // duplicate
Tuple.Create<string, string>("A", "D"),
Tuple.Create<string, string>("E", "F"),
Tuple.Create<string, string>("F", "E"), // duplicate
};
public static void Main()
{
var result =
from y in
from x in myList
select new { Original = x, SortedPair = new[] { x.Item1, x.Item2 }.OrderBy(s => s).ToArray() }
group y by new { NormalizedTuple = Tuple.Create<string,string>(y.SortedPair[0], y.SortedPair[1]) } into grp
select new { Pair = grp.Key.NormalizedTuple, Original = grp.First().Original };
foreach(var item in result)
{
Console.WriteLine("Pair: {0} {1}", item.Original.Item1, item.Original.Item2);
}
}
}
Output:
Pair: B A
Pair: C B
Pair: A D
Pair: E F
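On .NET 6 or later, the same "keep the first original of each group" idea can probably be written more briefly with Enumerable.DistinctBy. This is a sketch under that assumption, not part of the original answer: the names DistinctByDemo, Key and KeepFirstOfEachPair are hypothetical, and the key uses ordinal comparison rather than the default string ordering used by OrderBy above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctByDemo
{
    // Order-insensitive key: the ordinally smaller string always comes first.
    static (string, string) Key(Tuple<string, string> t)
    {
        return string.CompareOrdinal(t.Item1, t.Item2) <= 0
            ? (t.Item1, t.Item2)
            : (t.Item2, t.Item1);
    }

    // Returns one original tuple per unordered pair (requires .NET 6 or later).
    public static List<Tuple<string, string>> KeepFirstOfEachPair(IEnumerable<Tuple<string, string>> edges)
    {
        return edges.DistinctBy(t => Key(t)).ToList();
    }

    public static void Main()
    {
        var myList = new List<Tuple<string, string>>
        {
            Tuple.Create("B", "A"),
            Tuple.Create("A", "B"), // duplicate
            Tuple.Create("C", "B"),
            Tuple.Create("C", "B"), // duplicate
            Tuple.Create("A", "D"),
            Tuple.Create("E", "F"),
            Tuple.Create("F", "E"), // duplicate
        };
        // Four unordered pairs remain, each in the orientation it had in the list.
        foreach (var item in KeepFirstOfEachPair(myList))
            Console.WriteLine("Pair: {0} {1}", item.Item1, item.Item2);
    }
}
```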
Solution 4:[4]
Live Code: https://dotnetfiddle.net/LUErFj
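Sort the two items of each tuple first, then call Distinct. Note that this returns the normalized tuples rather than the originals, so (C, B) comes out as (B, C):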
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
static List<Tuple<string, string>> myList = new List<Tuple<string, string>>()
{
Tuple.Create<string, string>("A", "B"),
Tuple.Create<string, string>("B", "A"), // duplicate
Tuple.Create<string, string>("C", "B"),
Tuple.Create<string, string>("C", "B"), // duplicate
Tuple.Create<string, string>("A", "D")
};
public static void Main()
{
myList
.Select(x => new[] { x.Item1, x.Item2 }.OrderBy(s => s).ToArray())
.Select(x => Tuple.Create<string,string>(x[0], x[1]))
.Distinct()
.Dump(); // Dump() is an extension provided by .NET Fiddle, not part of the base class library
}
}
Output:
Dumping object(System.Linq.<DistinctIterator>d__81`1[Tuple`2[String,String]])
[
{
Item1 : A
Item2 : B
ToString(): (A, B)
},
{
Item1 : B
Item2 : C
ToString(): (B, C)
},
{
Item1 : A
Item2 : D
ToString(): (A, D)
}
]
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | chomba |
| Solution 2 | plukich |
| Solution 3 | |
| Solution 4 | |
