Return a list of all amino acid sequences encoded by an RNA sequence
I have a specific DNA sequence, and I need to return a list of all amino acid sequences encoded by that DNA sequence.
I also have a dictionary of all codons and their amino acids (single-letter strings). The * represents stop codons.
table = {
'ATA':'I', 'ATC':'I', 'ATT':'I', 'ATG':'M', 'ACA':'T', 'ACC':'T',
'ACG':'T', 'ACT':'T', 'AAC':'N', 'AAT':'N', 'AAA':'K', 'AAG':'K',
'AGC':'S', 'AGT':'S', 'AGA':'R', 'AGG':'R', 'CTA':'L', 'CTC':'L',
'CTG':'L', 'CTT':'L', 'CCA':'P', 'CCC':'P', 'CCG':'P', 'CCT':'P',
'CAC':'H', 'CAT':'H', 'CAA':'Q', 'CAG':'Q', 'CGA':'R', 'CGC':'R',
'CGG':'R', 'CGT':'R', 'GTA':'V', 'GTC':'V', 'GTG':'V', 'GTT':'V',
'GCA':'A', 'GCC':'A', 'GCG':'A', 'GCT':'A', 'GAC':'D', 'GAT':'D',
'GAA':'E', 'GAG':'E', 'GGA':'G', 'GGC':'G', 'GGG':'G', 'GGT':'G',
'TCA':'S', 'TCC':'S', 'TCG':'S', 'TCT':'S', 'TTC':'F', 'TTT':'F',
'TTA':'L', 'TTG':'L', 'TAC':'Y', 'TAT':'Y', 'TAA':'*', 'TAG':'*',
'TGC':'C', 'TGT':'C', 'TGA':'*', 'TGG':'W',
}
So I need to return a list of strings, where each string represents a sequence of amino acids encoded by the DNA sequence.
Here is my code:
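To make the goal concrete, here is a minimal sketch of translating a single reading frame: read codons from a given start index and stop at the first `*`. The helper name `translate_from` and its parameters are made up for illustration; it is not the `translate_sequence` function called in the question's code, whose signature is not shown. The small `table` holds only the four codons this example needs, copied from the dictionary above.

```python
# Only the codons used in this example, copied from the table above.
table = {"ATG": "M", "ACC": "T", "GCT": "A", "TAA": "*"}

def translate_from(seq, start, table):
    """Hypothetical helper: translate seq codon by codon from index start
    until a stop codon ('*') or until fewer than three bases remain."""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        amino_acid = table[seq[i:i + 3]]
        if amino_acid == "*":  # a stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate_from("ATGACCGCTTAA", 0, table))  # MTA
```

Each string in the returned list would then be one such translation, started at a different start codon.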
rna_sequence = rna_sequence.upper()
rna_seq_new = rna_sequence.replace("\n", "")
rna_seq_new = rna_seq_new.strip()
AA_list = []
nb = (len(rna_seq_new)) - 3
for i in range (0, len(rna_seq_new), 3):
    codon = rna_seq_new[i:i+3]
    if len(rna_seq_new):
        return []
    if codon == "AUG":
        new = translate_sequence(rna_seq_new = rna_seq_new[i:], genetic_co>
        AA_list.append(new)
return(AA_list)
At the moment I get an empty list back and I do not know why. The test fails with this error: AssertionError: Lists differ: [] != ['MTAVRYV', 'MTYV']
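Going only by the code as posted, two things would explain the empty list. First, `if len(rna_seq_new):` is true for any non-empty string, so the function returns `[]` on the first loop iteration. Second, the loop compares against `"AUG"` while the table is keyed by DNA codons such as `'ATG'`. The sketch below is one possible way to structure the search, not a confirmed fix for the original test. It assumes that start codons should be looked for at every position (all three reading frames), that a translation without a stop codon simply ends where the sequence ends, and that the input contains only A, C, G, T or U. The name `all_translations` is hypothetical, and the table is rebuilt programmatically to match the dictionary in the question.

```python
from itertools import product

# Rebuild the question's codon table: bases in T, C, A, G order, '*' = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
table = {"".join(codon): aa
         for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)}

def all_translations(sequence, table):
    """Hypothetical helper: one protein string per start codon found."""
    # Normalise: drop whitespace, upper-case, and map RNA 'U' to DNA 'T'.
    seq = "".join(sequence.split()).upper().replace("U", "T")
    proteins = []
    for start in range(len(seq) - 2):  # every position, so all three frames
        if seq[start:start + 3] != "ATG":
            continue
        protein = []
        for i in range(start, len(seq) - 2, 3):
            amino_acid = table[seq[i:i + 3]]
            if amino_acid == "*":  # stop codon ends this protein
                break
            protein.append(amino_acid)
        proteins.append("".join(protein))
    return proteins

# Made-up input, not the sequence from the failing test.
print(all_translations("atgaccgcttaac\natgacctacgtttga", table))
# ['MTA', 'MTYV']
```

The second start codon in this made-up input sits at index 13, which a loop stepping by 3 from index 0 would never visit. If the exercise only wants a single reading frame, the outer `range` can step by 3 instead.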
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow