How can I compare the elements inside a txt file in Python?

I am writing a program that takes some strings as input and writes them to a file. Next, I need to find the equal elements in the file. Here is an example of the kind of file I am handling:

Surname-Name-class
Surname2-Name2-class
Surname3-Name3-class2

This is the kind of file in which I have to compare the elements to see whether any are equal. In this case it should report that the class in lines 1 and 2 is the same. Given that the other elements in the file could be of different lengths, or could themselves be the ones that are equal, how can I compare those lines to see if there are equal elements? If this involves changing the way the file is written, such as using spaces instead of "-" or anything else, please tell me. I use Python 3.10. If you need more explanation, please ask. Thank you!



Solution 1:[1]

You could simply use open and read the file line by line into a list.

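# Open the file in a with-block so it is closed automatically.
with open("/PathToYourFile/test.txt", "r") as my_file:
    # Strip the trailing newline, then split each line on "-".
    content = [line.rstrip("\n").split("-") for line in my_file]
print(content)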

This will print a nested list like this:

[['Surname', 'Name', 'class'], ['Surname2', 'Name2', 'class'], ['Surname3', 'Name3', 'class2']]

Now you can compare the elements, because each line has been split into its own nested list. Compare the values at the same index of each nested list; their length in characters does not matter.
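As a minimal sketch of that comparison, the snippet below groups values by their index in each nested list and reports those that occur on more than one line. The helper name find_duplicates and the hard-coded sample rows are assumptions made for illustration; they are not part of the original answer.

```python
from collections import defaultdict

# Sample rows in the same shape as the nested list printed above (assumed data).
content = [
    ["Surname", "Name", "class"],
    ["Surname2", "Name2", "class"],
    ["Surname3", "Name3", "class2"],
]


def find_duplicates(rows):
    """Map (column index, value) to the 1-based line numbers sharing that value."""
    seen = defaultdict(list)
    for line_no, row in enumerate(rows, 1):
        for col, value in enumerate(row):
            seen[(col, value)].append(line_no)
    # Keep only the values that occur on more than one line.
    return {key: lines for key, lines in seen.items() if len(lines) > 1}


duplicates = find_duplicates(content)
for (col, value), lines in duplicates.items():
    print(f"column {col}: '{value}' appears on lines {lines}")
```

Keying on the column index as well as the value means a surname that happens to match a class name is not reported as a duplicate.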

Solution 2:[2]

I suggest using the csv module to parse the file, and putting the data into dicts keyed on the various field values.

import collections
import csv

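fields = "surname", "name", "class"
# For each field, map every value to the list of line numbers where it occurs.
counts = {field: collections.defaultdict(list) for field in fields}

# newline="" is what the csv docs recommend when opening a file for the csv module.
with open("data.txt", newline="") as f:
    for n, line in enumerate(csv.DictReader(f, fields, delimiter='-'), 1):
        for field, value in line.items():
            counts[field][value].append(n)

# Report every value that appears on more than one line.
for field, data in counts.items():
    for value, lines in data.items():
        if len(lines) > 1:
            print(f"{field} has value '{value}' on lines {lines}")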

prints:

class has value 'class' on lines [1, 2]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution | Source
Solution 1 | Tin Stribor Sohn
Solution 2 | Samwise