Reading a file as a list of lists and pairing its rows in Python

I have a file, say file1.rule, with an even number of rows. The last column is the fitness and the second-to-last column is the class. I want to pair the rows class-wise (first pick the row with the highest fitness, then a random one from the remaining rows), with one condition: two identical rows cannot form a pair. In my file, no row occurs more than n/2 times within a class, where n is the number of rows in that class.

Below is my file:

*,*,*,1,0,1.0
*,*,1,*,0,0.22
*,*,2,2,1,0.71
*,*,2,2,1,0.71
*,2,2,*,1,0.64
*,2,2,*,1,0.64
1,*,*,3,2,0.95
*,*,3,2,2,0.66
*,*,3,4,2,0.67
3,*,*,*,2,0.33
3,*,*,*,2,0.33
3,*,*,*,2,0.33

And here is the code:

rule_file_name = "file1.rule"
from collections import defaultdict
list1 = []

with open(rule_file_name) as rule_fp:
    for line in rule_fp.readlines():
        list1.append(line.replace("\n","").split(","))

assert len(list1) & 1 == 0
classes = defaultdict(list)
for _list in list1:
    classes[_list[4]].append(_list)

    
from random import sample, seed
seed(1)
for key, _list in classes.items():
    assert len(_list) & 1 == 0
    _list.sort(key=lambda x: x[5])
    pairs = []
    #while(len(_list)>2):
    while _list:
        #print(len(_list))
        first = _list[-1]
        candidate = sample(_list, 1)[0]
        if first != candidate:
            #print(f'first{first}, candidate{candidate}')
            print(f'{first},{candidate}')
            pairs.append((first, candidate))
            _list.remove(first)
            _list.remove(candidate)
    classes[key] = pairs

The above code works fine for classes 0 and 1, and the pairing is done, but for class 2 the first 2 randomly chosen pairs are:

['1', '*', '*', '3', '2', '0.95'],['*', '*', '3', '2', '2', '0.66']
['*', '*', '3', '4', '2', '0.67'],['3', '*', '*', '*', '2', '0.33']

After these, the remaining 2 rows of class 2 are 3,*,*,*,2,0.33 and 3,*,*,*,2,0.33. They are identical, so they cannot form a pair, and the while loop runs forever.

As far as I can tell, this situation only arises when just the last 2 rows of a class are left, and in that case I simply want to discard those 2 rows. I tried changing the loop condition to while(len(_list)>2):, but then the last 2 rows are always ignored, even when they are completely different from each other. What should I do?
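One way to get the behaviour described above is to draw the partner only from rows that differ from the chosen one, and to stop when no such row exists. The following is a minimal sketch, not a definitive implementation: the function name pair_rows and the rng parameter are made up for illustration, the fitness is assumed to be the last column, and it is converted with float() before sorting. The class 2 rows from the file are inlined so the snippet runs on its own.

```python
import random


def pair_rows(rows, rng=random):
    """Pair the best remaining row with a random non-identical row.

    Rows that cannot be paired (only identical rows are left) are discarded.
    `pair_rows` and `rng` are illustrative names, not part of the original code.
    """
    # Sort ascending by numeric fitness so the best row sits at the end.
    remaining = sorted(rows, key=lambda r: float(r[-1]))
    pairs = []
    while len(remaining) >= 2:
        first = remaining.pop()  # row with the highest fitness
        # Indices of rows that are not identical to `first`.
        candidates = [i for i, r in enumerate(remaining) if r != first]
        if not candidates:
            # Every row left equals `first`, so no valid pair exists.
            break
        partner = remaining.pop(rng.choice(candidates))
        pairs.append((first, partner))
    return pairs


# The class 2 rows from the question.
class2 = [line.split(",") for line in [
    "1,*,*,3,2,0.95",
    "*,*,3,2,2,0.66",
    "*,*,3,4,2,0.67",
    "3,*,*,*,2,0.33",
    "3,*,*,*,2,0.33",
    "3,*,*,*,2,0.33",
]]
for first, partner in pair_rows(class2):
    print(first, partner)
```

Because the candidates are filtered before the draw, the loop never retries, so it cannot spin forever. The last 2 rows are dropped only when they really are identical; if they differ, they are paired like any other rows.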

Edit1: Can I use a timer inside the while loop?

if some_condition or time.time() > timeout:
    break

Can I do something like that? Please help me out.
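A timer along those lines can work. Here is a rough sketch of the retry loop with a deadline. The function name pair_with_timeout and the timeout_s parameter are hypothetical, and time.monotonic() is used instead of time.time() on the assumption that a clock which never jumps backwards is preferable for measuring a timeout.

```python
import random
import time


def pair_with_timeout(rows, timeout_s=0.5, rng=random):
    """Retry-based pairing that gives up once `timeout_s` seconds have passed.

    `pair_with_timeout` and `timeout_s` are illustrative names only.
    """
    remaining = sorted(rows, key=lambda r: float(r[-1]))
    pairs = []
    deadline = time.monotonic() + timeout_s
    while len(remaining) >= 2:
        if time.monotonic() > deadline:
            break  # give up; the leftover rows are discarded
        first = remaining[-1]  # row with the highest fitness
        candidate = rng.choice(remaining)
        if candidate != first:
            pairs.append((first, candidate))
            remaining.remove(first)
            remaining.remove(candidate)
    return pairs


identical = [["3", "*", "*", "*", "2", "0.33"], ["3", "*", "*", "*", "2", "0.33"]]
print(pair_with_timeout(identical, timeout_s=0.05))
```

The drawback is that the loop burns the whole timeout whenever only identical rows are left, and the timeout value itself is arbitrary. Checking directly whether any non-identical row remains avoids both problems.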

Edit2: I tried modifying the code like this:

while _list:
    first = _list[-1]
    _list.remove(first)
    candidate = sample(_list, 1)[0]
    if (len(_list)<=2) and first == candidate:
        break
    elif first != candidate:
        #print(f'first{first}, candidate{candidate}')
        print(f'{first},{candidate}')
        pairs.append((first, candidate))
        #_list.remove(first)
        _list.remove(candidate)
classes[key] = pairs

But with this I get an error on the line candidate = sample(_list, 1)[0], saying: ValueError: Sample larger than population or is negative
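A likely cause, judging only from the snippet above: first is removed from _list before the comparison, so whenever first == candidate while more than 2 rows are left, neither branch runs and first is silently lost. The list then has an odd number of rows, and eventually the final row is removed as first, leaving _list empty when sample is called. The sketch below reproduces the error on an empty list and shows a guard; safe_pick is a hypothetical helper, not part of the original code.

```python
from random import sample


def safe_pick(population):
    """Return one random element, or None when the population is empty."""
    if not population:
        return None
    return sample(population, 1)[0]


# sample() refuses to draw 1 item from an empty list.
try:
    sample([], 1)
except ValueError as exc:
    print("ValueError:", exc)

print(safe_pick([]))  # no exception; the caller can test for None
```

With a guard like this the loop can break out when nothing is left to draw from, though the dropped first row still has to be handled separately.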



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow