'How to remove duplicates strings or int from Slice in Go

Let's say I have a list of student cities and the size of it could be 100 or 1000, and I want to filter out all duplicates cities.

I want a generic solution that I can use to remove all duplicate strings from any slice.

I am new to Go Language, So I tried to do it by looping and checking if the element exists using another loop function.

Students' Cities List (Data):

studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}

Functions that I created, and it's doing the job:

func contains(s []string, e string) bool {
    for _, a := range s {
        if a == e {
            return true
        }
    }
    return false
}

func removeDuplicates(strList []string) []string {
    list := []string{}
    for _, item := range strList {
        fmt.Println(item)
        if contains(list, item) == false {
            list = append(list, item)
        }
    }
    return list
}

My solution test

func main() {
    studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}

    uniqueStudentsCities := removeDuplicates(studentsCities)
    
    fmt.Println(uniqueStudentsCities) // Expected output [Mumbai Delhi Ahmedabad Bangalore Kolkata Pune]
}

I believe that the above solution that I tried is not an optimum solution. Therefore, I need help from you guys to suggest the fastest way to remove duplicates from the slice?

I checked StackOverflow, this question is not being asked yet, so I didn't get any solution.



Solution 1:[1]

I found Burak's and Fazlan's solution helpful. Based on that, I implemented the simple functions that help to remove/filter duplicates data from slices of strings, integers, etc.

Here are my two functions, one for strings and another one for integers of slices. You have to pass your data and return all the unique values as a result.

To remove duplicate strings from slice:

func removeDuplicateStr(strSlice []string) []string {
    allKeys := make(map[string]bool)
    list := []string{}
    for _, item := range strSlice {
        if _, value := allKeys[item]; !value {
            allKeys[item] = true
            list = append(list, item)
        }
    }
    return list
}

To remove duplicate integers from slice:

func removeDuplicateInt(intSlice []int) []int {
    allKeys := make(map[int]bool)
    list := []int{}
    for _, item := range intSlice {
        if _, value := allKeys[item]; !value {
            allKeys[item] = true
            list = append(list, item)
        }
    }
    return list
}

You can update the slice type, and it will filter out all duplicates data for all types of slices.

Here is the GoPlayground link: https://play.golang.org/p/IyVWlWRQM89

Solution 2:[2]

You can do in-place replacement guided with a map:

processed := map[string]struct{}{}
w := 0
for _, s := range cities {
    if _, exists := processed[s]; !exists {
        // If this city has not been seen yet, add it to the list
        processed[s] = struct{}{}
        cities[w] = s
        w++
    }
}
cities = cities[:w]

Solution 3:[3]

Adding this answer which worked for me, does require/include sorting, however.

func removeDuplicateStrings(s []string) []string {
    if len(s) < 1 {
        return s
    }

    sort.Strings(s)
    prev := 1
    for curr := 1; curr < len(s); curr++ {
        if s[curr-1] != s[curr] {
            s[prev] = s[curr]
            prev++
        }
    }

    return s[:prev]
}

For fun, I tried using generics! (Go 1.18+ only)

type SliceType interface {
    ~string | ~int | ~float64 // add more *comparable* types as needed
}

func removeDuplicates[T SliceType](s []T) []T {
    if len(s) < 1 {
        return s
    }

    // sort
    sort.SliceStable(s, func(i, j int) bool {
        return s[i] < s[j]
    })

    prev := 1
    for curr := 1; curr < len(s); curr++ {
        if s[curr-1] != s[curr] {
            s[prev] = s[curr]
            prev++
        }
    }

    return s[:prev]
}

Go Playground Link with tests: https://go.dev/play/p/bw1PP1osJJQ

Solution 4:[4]

If you want to don't waste memory allocating another array for copy the values, you can remove in place the value, as following:

package main

import "fmt"

var studentsCities = []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}

func contains(s []string, e string) bool {
    for _, a := range s {
        if a == e {
            return true
        }
    }
    return false
}

func main() {
    fmt.Printf("Cities before remove: %+v\n", studentsCities)
    for i := 0; i < len(studentsCities); i++ {
        if contains(studentsCities[i+1:], studentsCities[i]) {
            studentsCities = remove(studentsCities, i)
            i--
        }
    }
    fmt.Printf("Cities after remove: %+v\n", studentsCities)
}
func remove(slice []string, s int) []string {
    return append(slice[:s], slice[s+1:]...)
}

Result:

Cities before remove: [Mumbai Delhi Ahmedabad Mumbai Bangalore Delhi Kolkata Pune]
Cities after remove: [Ahmedabad Mumbai Bangalore Delhi Kolkata Pune]

Solution 5:[5]

It can also be done with a set-like map:

ddpStrings := []string{}
m := map[string]struct{}{}

for _, s := range strings {
    if _, ok := m[scopeStr]; ok {
        continue
    }
    ddpStrings = append(ddpStrings, s)
    m[s] = struct{}{}
}

Solution 6:[6]

func UniqueNonEmptyElementsOf(s []string) []string {
    unique := make(map[string]bool, len(s))
    var us []string
    for _, elem := range s {
        if len(elem) != 0 {
            if !unique[elem] {
                us = append(us, elem)
                unique[elem] = true
            }
        }
    }

    return us
}

send the duplicated splice to the above function, this will return the splice with unique elements.

func main() {
    studentsCities := []string{"Mumbai", "Delhi", "Ahmedabad", "Mumbai", "Bangalore", "Delhi", "Kolkata", "Pune"}

    uniqueStudentsCities := UniqueNonEmptyElementsOf(studentsCities)
    
    fmt.Println(uniqueStudentsCities)
}

Solution 7:[7]

Here's a mapless index based slice's duplicate "remover"/trimmer. It use a sort method.

The n value is always 1 value lower than the total of non duplicate elements that's because this methods compare the current (consecutive/single) elements with the next (consecutive/single) elements and there is no matches after the lasts so you have to pad it to include the last.

Note that this snippet doesn't empty the duplicate elements into a nil value. However since the n+1 integer start at the duplicated item's indexes, you can loop from said integer and nil the rest of the elements.

sort.Strings(strs)
for n, i := 0, 0; ; {
    if strs[n] != strs[i] {
        if i-n > 1 {
            strs[n+1] = strs[i]
        }
        n++
    }
    i++
    if i == len(strs) {
        if n != i {
            strs = strs[:n+1]
        }
        break
    }
}
fmt.Println(strs)

Solution 8:[8]

Simple to understand.

func RemoveDuplicate(array []string) []string {
    m := make(map[string]string)
    for _, x := range array {
        m[x] = x
    }
    var ClearedArr []string
    for x, _ := range m {
        ClearedArr = append(ClearedArr, x)
    }
    return ClearedArr
}

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Riyaz Khan
Solution 2 gonutz
Solution 3 snassr
Solution 4 alessiosavi
Solution 5 leonheess
Solution 6 Musab Gultekin
Solution 7
Solution 8 Spoofed