How to remove duplicate lines from a text file while keeping the order of the file

I have a text file with 105,779 lines containing duplicates. I need to remove the duplicates without changing the order. I am able to remove the duplicates, but it is changing the order of the text file. Any help would be appreciated.



Solution 1:[1]

The Select-Object cmdlet has a -Unique switch that does just what you want:

Get-Content allMethods.txt | Select-Object -Unique > uniqueMethods.txt

Note (written as of PowerShell 7.2.3):

  • While Select-Object -Unique is surprisingly slow, it still outperforms your own manual solution.

  • Unlike PowerShell in general, -Unique is surprisingly and invariably case-sensitive:

    'a', 'a', 'b' | Select-Object -Unique # -> 'a', 'b'
    
    'a', 'A', 'b' | Select-Object -Unique # !! -> 'a', 'A', 'b'
    
    • See GitHub issue #12059 for a discussion of this unexpected behavior. (The expected behavior would be to be case-insensitive by default and to offer a case-sensitive opt-in.)
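
If the speed or the case-sensitivity of -Unique is a problem, one possible alternative (not part of the original answer) is to track the lines already seen in a .NET HashSet and pass through only the first occurrence of each. The sketch below assumes PowerShell 5 or later; the function name Get-UniqueLine and its -IgnoreCase switch are made up for illustration.

```powershell
# Hypothetical helper: emits each distinct input line once, in first-seen order.
function Get-UniqueLine {
    param(
        [Parameter(ValueFromPipeline)] [string] $Line,
        [switch] $IgnoreCase   # illustrative opt-in for case-insensitive matching
    )
    begin {
        # Pick the comparer once, then create the set that remembers seen lines.
        $comparer = if ($IgnoreCase) { [System.StringComparer]::OrdinalIgnoreCase }
                    else { [System.StringComparer]::Ordinal }
        $seen = [System.Collections.Generic.HashSet[string]]::new($comparer)
    }
    process {
        # HashSet.Add() returns $true only the first time a value is added.
        if ($seen.Add($Line)) { $Line }
    }
}

'a', 'A', 'b', 'a' | Get-UniqueLine              # -> 'a', 'A', 'b'
'a', 'A', 'b', 'a' | Get-UniqueLine -IgnoreCase  # -> 'a', 'b'
```

With the file names from above, usage would look like this:

    Get-Content allMethods.txt | Get-UniqueLine > uniqueMethods.txt

Because each line costs only a set lookup, this approach should scale better than Select-Object -Unique on large files, though that is worth measuring on your own data.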

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    mklement0