'mRNA to form protein

I am struggling with completing a code. I am fairly new to coding but I need to submit a test with this question:

In order for protein to be formed, a chain of amino acids must be formed. These amino acids are formed with 3 base pairs. An example would be 'CUU' makes a Leucine ('Leu') amino acid. Remember that there are stop codons UAG, UGA, UAA which essentially ends the protein synthesis formation. This leaves you with a chain of amino acids that will be folded into a protien which hopefully becomes part of your python brain tissue!

The function to be built, amino_acids, must return a list of a tuple and an integer when given a string of mRNA code. The first tuple must contain all the amino acids and the integer must be the number of distinct amino acids. You can use the dictionary below to help with your function. The function must also not include the stop codon codes.

NOTE : For this piece of code, we'll assume that there's only one stop codon in the sequence

{'CUU': 'Leu', 'UAG': '---', 'ACA': 'Thr', 'AAA': 'Lys', 'AUC': 'Ile', 'AAC': 'Asn','AUA': 'Ile', 'AGG': 'Arg', 'CCU': 'Pro', 'ACU': 'Thr', 'AGC': 'Ser','AAG': 'Lys', 'AGA': 'Arg', 'CAU': 'His', 'AAU': 'Asn', 'AUU': 'Ile','CUG': 'Leu', 'CUA': 'Leu', 'CUC': 'Leu', 'CAC': 'His', 'UGG': 'Trp','CAA': 'Gln', 'AGU': 'Ser', 'CCA': 'Pro', 'CCG': 'Pro', 'CCC': 'Pro', 'UAU': 'Tyr', 'GGU': 'Gly', 'UGU': 'Cys', 'CGA': 'Arg', 'CAG': 'Gln', 'UCU': 'Ser', 'GAU': 'Asp', 'CGG': 'Arg', 'UUU': 'Phe', 'UGC': 'Cys', 'GGG': 'Gly', 'UGA':'---', 'GGA': 'Gly', 'UAA': '---', 'ACG': 'Thr', 'UAC': 'Tyr', 'UUC': 'Phe', 'UCG': 'Ser', 'UUA': 'Leu', 'UUG': 'Leu', 'UCC': 'Ser', 'ACC': 'Thr', 'UCA': 'Ser', 'GCA': 'Ala', 'GUA': 'Val', 'GCC': 'Ala', 'GUC': 'Val', 'GGC':'Gly', 'GCG': 'Ala', 'GUG': 'Val', 'GAG': 'Glu', 'GUU': 'Val', 'GCU': 'Ala', 'GAC': 'Asp', 'CGU': 'Arg', 'GAA': 'Glu', 'AUG': 'Met', 'CGC': 'Arg'}

I started working on a function, could anyone help me correct it to solve this code?

def amino_acids(mrna): my_string = " " my_dict = {'CUU': 'Leu', 'UAG': '---', 'ACA': 'Thr', 'AAA': 'Lys', 'AUC': 'Ile', 'AAC': 'Asn','AUA': 'Ile', 'AGG': 'Arg', 'CCU': 'Pro', 'ACU': 'Thr', 'AGC': 'Ser','AAG': 'Lys', 'AGA': 'Arg', 'CAU': 'His', 'AAU': 'Asn', 'AUU': 'Ile','CUG': 'Leu', 'CUA': 'Leu', 'CUC': 'Leu', 'CAC': 'His', 'UGG': 'Trp','CAA': 'Gln', 'AGU': 'Ser', 'CCA': 'Pro', 'CCG': 'Pro', 'CCC': 'Pro', 'UAU': 'Tyr', 'GGU': 'Gly', 'UGU': 'Cys', 'CGA': 'Arg', 'CAG': 'Gln', 'UCU': 'Ser', 'GAU': 'Asp', 'CGG': 'Arg', 'UUU': 'Phe', 'UGC': 'Cys', 'GGG': 'Gly', 'UGA':'---', 'GGA': 'Gly', 'UAA': '---', 'ACG': 'Thr', 'UAC': 'Tyr', 'UUC': 'Phe', 'UCG': 'Ser', 'UUA': 'Leu', 'UUG': 'Leu', 'UCC': 'Ser', 'ACC': 'Thr', 'UCA': 'Ser', 'GCA': 'Ala', 'GUA': 'Val', 'GCC': 'Ala', 'GUC': 'Val', 'GGC':'Gly', 'GCG': 'Ala', 'GUG': 'Val', 'GAG': 'Glu', 'GUU': 'Val', 'GCU': 'Ala', 'GAC': 'Asp', 'CGU': 'Arg', 'GAA': 'Glu', 'AUG': 'Met', 'CGC': 'Arg'}

    for i in range(len(my_dict)):


Solution 1:[1]

You can iterate through the dictionary like this:

Python 3

my_dict.items()

Python 2

my_dict.iteritems()

I coded this simple function that prints "noup" and the amino acid chain if it stops the synthesis. Let me know if it helps.

def amino_acids(mRNA):
for chain, amino in mRNA.items():
    if amino == "---":
        print(chain + " noup")

As you can see, the for-loop is taking two iterators: "chain" (which is the key) and amino, (the value). We are saying that if the value is "---", then the chain is a stop coddon.

Solution 2:[2]

def amino_acids(mrna): # your code here

my_dict = {'CUU': 'Leu', 'UAG': '---', 'ACA': 'Thr', 'AAA': 'Lys', 'AUC': 'Ile',
         'AAC': 'Asn','AUA': 'Ile', 'AGG': 'Arg', 'CCU': 'Pro', 'ACU': 'Thr', 
         'AGC': 'Ser','AAG': 'Lys', 'AGA': 'Arg', 'CAU': 'His', 'AAU': 'Asn',
         'AUU': 'Ile','CUG': 'Leu', 'CUA': 'Leu', 'CUC': 'Leu', 'CAC': 'His', 
         'UGG': 'Trp','CAA': 'Gln', 'AGU': 'Ser', 'CCA': 'Pro', 'CCG': 'Pro',
         'CCC': 'Pro', 'UAU': 'Tyr', 'GGU': 'Gly', 'UGU': 'Cys', 'CGA': 'Arg', 
         'CAG': 'Gln', 'UCU': 'Ser', 'GAU': 'Asp', 'CGG': 'Arg', 'UUU': 'Phe', 
         'UGC': 'Cys', 'GGG': 'Gly', 'UGA':'---', 'GGA': 'Gly', 'UAA': '---', 
         'ACG': 'Thr', 'UAC': 'Tyr', 'UUC': 'Phe', 'UCG': 'Ser', 'UUA': 'Leu', 
         'UUG': 'Leu', 'UCC': 'Ser', 'ACC': 'Thr', 'UCA': 'Ser', 'GCA': 'Ala', 
         'GUA': 'Val', 'GCC': 'Ala', 'GUC': 'Val', 'GGC':'Gly', 'GCG': 'Ala', 
         'GUG': 'Val', 'GAG': 'Glu', 'GUU': 'Val', 'GCU': 'Ala', 'GAC': 'Asp', 
         'CGU': 'Arg', 'GAA': 'Glu', 'AUG': 'Met', 'CGC': 'Arg'}

split_string = [mrna[i:i+3] for i in range(0, len(mrna), 3)]
amino_stopcodon = []
amino = []
for i in split_string:
    for key, value in my_dict.items():
        if i==key:
            amino_stopcodon.append(value)

for index, k in enumerate(amino_stopcodon):
    if k == "---":
        cut_index = index
amino = tuple(amino_stopcodon[:cut_index])

unique_items = []
count = 0
for item in amino:
    if item not in unique_items:
        count += 1
        unique_items.append(item)

amino_acid_list = [amino, count]
return amino_acid_list

END FUNCTION

Solution 3:[3]

def amino_acids(mrna):
    protein = ""
    translation = {'CUU': 'Leu', 'UAG': '---', 'ACA': 'Thr', 'AAA': 'Lys', 'AUC': 'Ile',
 'AAC': 'Asn','AUA': 'Ile', 'AGG': 'Arg', 'CCU': 'Pro', 'ACU': 'Thr', 
 'AGC': 'Ser','AAG': 'Lys', 'AGA': 'Arg', 'CAU': 'His', 'AAU': 'Asn',
 'AUU': 'Ile','CUG': 'Leu', 'CUA': 'Leu', 'CUC': 'Leu', 'CAC': 'His', 
 'UGG': 'Trp','CAA': 'Gln', 'AGU': 'Ser', 'CCA': 'Pro', 'CCG': 'Pro',
 'CCC': 'Pro', 'UAU': 'Tyr', 'GGU': 'Gly', 'UGU': 'Cys', 'CGA': 'Arg', 
 'CAG': 'Gln', 'UCU': 'Ser', 'GAU': 'Asp', 'CGG': 'Arg', 'UUU': 'Phe', 
 'UGC': 'Cys', 'GGG': 'Gly', 'UGA':'---', 'GGA': 'Gly', 'UAA': '---', 
 'ACG': 'Thr', 'UAC': 'Tyr', 'UUC': 'Phe', 'UCG': 'Ser', 'UUA': 'Leu', 
 'UUG': 'Leu', 'UCC': 'Ser', 'ACC': 'Thr', 'UCA': 'Ser', 'GCA': 'Ala', 
 'GUA': 'Val', 'GCC': 'Ala', 'GUC': 'Val', 'GGC':'Gly', 'GCG': 'Ala', 
 'GUG': 'Val', 'GAG': 'Glu', 'GUU': 'Val', 'GCU': 'Ala', 'GAC': 'Asp', 
 'CGU': 'Arg', 'GAA': 'Glu', 'AUG': 'Met', 'CGC': 'Arg'}
    stop_codons = {"UGA","UAG","UAA"}
    while mrna:
        codon = mrna[:3]  
        mrna = mrna[3:]  
        if codon in stop_codons:
            break 
        amino_acid = translation[codon]
        protein = (amino_acid.split((':')))
    return protein

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Martín Schere
Solution 2 Sammy Waiyaki
Solution 3 Shaido