Update key name in a dictionary python
I have the following fasta file in a dictionary, in the following shape:
from Bio import SeqIO
alignment_file = '/Users/dissertation/Desktop/Alignment 4 sequences.fasta'
seq_dict = {rec.id : rec.seq for rec in SeqIO.parse(alignment_file, "fasta")}
Which gives me the following output:
{'NC_000962.3': Seq('ctgttaccgagatttcttcgtcgtttgttcttggaaagacagcgctggggatcg...NNN'),
'NC_008596.1': Seq('------------------------------------------------------...ccg'),
'NC_009525.1': Seq('ctgttaccgagatttcttcgtcgtttgttcttggaaagacagcgctggggatcg...NNN'),
'NC_002945.4': Seq('ctgttaccgagatttcttcgtcgtttgttcttggaaagacagcgctggggatcg...NNN')}
The only issue is that I would like to replace the key names with others that are easier to identify when comparing the sequences in other parts of my code. So I have tried the following:
name_list = ['Tuberculosis', 'Smegmatis', 'H37Ra', 'Bovis']
for key in seq_dict:
    for name in name_list:
        seq_dict[name[x]]= seq_dict[key]
seq_dict
However, I get the following error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
/var/folders/pq/ghtv3wj159j681vy0ny3tz9w0000gp/T/ipykernel_47822/1486954832.py in <module>
9
---> 10 for key in seq_dict:
11 for name in name_list:
12 seq_dict[name[x]]= seq_dict[key]
RuntimeError: dictionary changed size during iteration
I understand that there's no easy, straightforward way of updating key names in a dictionary, but I don't understand the error. Is there a way of doing something similar?
I have also tried this:
seq_dict.update({'NC_000962.3': 'Tuberculosis', 'NC_008596.1': 'Smegmatis', 'NC_009525.1': 'H37Ra', 'NC_002945.4': 'Bovis'})
But this gives me the following output:
{'NC_000962.3': 'Tuberculosis',
'NC_008596.1': 'Smegmatis',
'NC_009525.1': 'H37Ra',
'NC_002945.4': 'Bovis'}
My desired output would look like this:
{'Tuberculosis': Seq('ctgttaccgagatttcttcgtcgtttgttcttggaaagacagcgctggggatcg...NNN'),
'Smegmatis': Seq('------------------------------------------------------...ccg'),
'H37Ra': Seq('ctgttaccgagatttcttcgtcgtttgttcttggaaagacagcgctggggatcg...NNN'),
'Bovis': Seq('ctgttaccgagatttcttcgtcgtttgttcttggaaagacagcgctggggatcg...NNN')}
Does anybody have an idea on how to update these?
Solution 1:[1]
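The error occurs because the loop adds new keys to seq_dict while iterating over it, which changes its size mid-iteration. Construct a new dictionary and then assign it to seq_dict in a single operation, rather than mutating seq_dict while you're iterating over it. I think this is what you're aiming for: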
seq_dict = dict(zip(name_list, seq_dict.values()))
although I'd personally want to have an explicit mapping from sequence IDs to names rather than relying on the ordering being the same.
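The explicit-mapping idea could look something like the sketch below. It assumes the ID-to-name pairs from the question's update() attempt, and it uses plain strings as stand-ins for the Seq objects so it runs without Biopython. The names id_to_name and renamed are made up for illustration.

```python
# Stand-in data: short plain strings replace the Bio.Seq objects from the question.
seq_dict = {
    "NC_000962.3": "ctgtta",
    "NC_008596.1": "------",
    "NC_009525.1": "ctgtta",
    "NC_002945.4": "ctgtta",
}

# Explicit sequence ID -> readable name mapping (hypothetical variable name;
# the pairs themselves come from the question).
id_to_name = {
    "NC_000962.3": "Tuberculosis",
    "NC_008596.1": "Smegmatis",
    "NC_009525.1": "H37Ra",
    "NC_002945.4": "Bovis",
}

# Build a brand-new dictionary instead of mutating seq_dict during iteration.
# An ID that is missing from id_to_name keeps its original key.
renamed = {id_to_name.get(seq_id, seq_id): seq for seq_id, seq in seq_dict.items()}

print(list(renamed))
```

Because each name is looked up by ID, the result no longer depends on the records in the FASTA file appearing in the same order as name_list. One caveat: if two IDs were mapped to the same name, the later entry would silently overwrite the earlier one.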
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Samwise |
