Reverse complement of DNA strand using Python

I have a DNA sequence and would like to get its reverse complement using Python. The sequence is in one column of a CSV file, and I'd like to write the reverse complement to another column in the same file. The tricky part is that a few cells contain characters other than A, T, G and C. I was able to get the reverse complement with this piece of code:

def complement(seq):
    complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
    bases = list(seq)
    bases = [complement[base] for base in bases]
    return ''.join(bases)

def reverse_complement(s):
    return complement(s[::-1])

print("Reverse Complement:")
print(reverse_complement("TCGGGCCC"))

However, when I try to find the item which is not present in the complement dictionary, using the code below, I just get the complement of the last base. It doesn't iterate. I'd like to know how I can fix it.

def complement(seq):
    complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'} 
    bases = list(seq) 
    for element in bases:
        if element not in complement:
            print element  
        letters = [complement[base] for base in element] 
        return ''.join(letters)
def reverse_complement(seq):
    return complement(seq[::-1])

print "Reverse Complement:"
print(reverse_complement("TCGGGCCCCX"))

Solution 1:^[1]

The other answers are perfectly fine, but if you plan to deal with real DNA sequences I suggest using Biopython. What if you encounter a character like "-" or "*", or an ambiguous base? What if you want to do further manipulations of your sequences? Do you want to create a parser for each file format out there?

The code you ask for is as easy as:

from Bio.Seq import Seq

seq = Seq("TCGGGCCC")

print(seq.reverse_complement())
# GGGCCCGA

Now if you want to do other transformations:

print(seq.complement())
print(seq.transcribe())
print(seq.translate())

Outputs

AGCCCGGG
UCGGGCCC
SG

And if you run into strange chars, no need to keep adding code to your program. Biopython deals with it:

seq = Seq("TCGGGCCCX")
print(seq.reverse_complement())
# XGGGCCCGA

Solution 2:^[2]

In general, a generator expression is simpler than the original code and avoids creating extra list objects. If there can be multiple-character insertions, go with the other answers.

complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
seq = "TCGGGCCC"
reverse_complement = "".join(complement.get(base, base) for base in reversed(seq))
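To also report which characters are missing from the dictionary, as the question asks, the lookup and the reporting can be kept separate. The following is a minimal sketch, not the answerer's code: the helper name `reverse_complement_with_report` is hypothetical, and it assumes unknown characters should pass through unchanged.

```python
COMPLEMENT = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}


def reverse_complement_with_report(seq):
    """Return (reverse complement, list of characters not in COMPLEMENT).

    Hypothetical helper: unknown characters are kept as-is in the output
    and collected in their original left-to-right order.
    """
    unknown = [base for base in seq if base not in COMPLEMENT]
    # Build the whole result first and return once, after all bases are
    # processed; a `return` inside a loop would stop at the first base.
    result = "".join(COMPLEMENT.get(base, base) for base in reversed(seq))
    return result, unknown


rc, unknown = reverse_complement_with_report("TCGGGCCCCX")
print(rc)       # XGGGGCCCGA
print(unknown)  # ['X']
```

Returning the unknown characters instead of printing them inside the function leaves it to the caller to decide whether to log them, skip the row, or raise an error.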

Solution 3:^[3]

# Python 3: str.maketrans replaces Python 2's string.maketrans.
old_chars = "ACGT"
replace_chars = "TGCA"
tab = str.maketrans(old_chars, replace_chars)
# Complement every base, then reverse the result.
print("AAAACCCGGT".translate(tab)[::-1])

That will give you the reverse complement: ACCGGGTTTT

Solution 4:^[4]

A fast, compact way to compute the reverse complement is the following (it raises a KeyError for any character other than A, C, G or T):

def rev_compl(st):
    nn = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
    return "".join(nn[n] for n in reversed(st))

Solution 5:^[5]

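# This reverses the sequence; it does not complement it yet.
def ReverseComplement(Pattern):
    revcomp = []
    x = len(Pattern)
    for i in Pattern:
        x = x - 1
        revcomp.append(Pattern[x])
    return ''.join(revcomp)

# This is for the complement; characters other than A, C, G and T are dropped.
def compliment(Nucleotide):
    comp = []
    for i in Nucleotide:
        if i == "T":
            comp.append("A")
        elif i == "A":
            comp.append("T")
        elif i == "G":
            comp.append("C")
        elif i == "C":
            comp.append("G")

    return ''.join(comp)

# Full reverse complement: compliment(ReverseComplement("TCGGGCCC")) gives "GGGCCCGA"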

Solution 6:^[6]

Try the code below:

complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
seq = "TCGGGCCC"
reverse_complement = "".join(complement.get(base, base) for base in reversed(seq))
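Applied to the CSV file from the question, the same expression can fill a new column. This is a sketch under assumptions: the column names `sequence` and `reverse_complement` are hypothetical, and an in-memory `io.StringIO` stands in for the real files so the example runs on its own (Python 3).

```python
import csv
import io

COMPLEMENT = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}


def reverse_complement(seq):
    # Characters outside A/C/G/T are copied through unchanged.
    return "".join(COMPLEMENT.get(base, base) for base in reversed(seq))


def add_reverse_complement_column(infile, outfile,
                                  src="sequence", dst="reverse_complement"):
    """Copy CSV rows from infile to outfile, adding a dst column."""
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=list(reader.fieldnames) + [dst])
    writer.writeheader()
    for row in reader:
        row[dst] = reverse_complement(row[src])
        writer.writerow(row)


# In-memory stand-ins for the real input and output files.
source = io.StringIO("id,sequence\n1,TCGGGCCC\n2,TCGGGCCCCX\n")
target = io.StringIO()
add_reverse_complement_column(source, target)
print(target.getvalue())
```

With real files, open them with `newline=''` as the `csv` module documentation recommends, and write to a separate output path rather than overwriting the input while it is still being read.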

Solution 7:^[7]

Considering also degenerate bases:

def rev_compl(seq):
    # BASES is laid out so that each code's complement sits at the mirrored index.
    BASES = 'NRWSMBDACGTHVKSWY'
    return ''.join([BASES[-j] for j in [BASES.find(i) for i in seq][::-1]])
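The string trick above relies on each code's complement sitting at the mirrored position of `BASES`, which is compact but hard to read. A more explicit sketch of the same idea, assuming Python 3 and the standard IUPAC nucleotide codes, uses a translation table. The name `rev_compl_iupac` is hypothetical, and in this version characters outside the table (such as "-") pass through unchanged.

```python
# IUPAC nucleotide codes and their complements, position by position.
IUPAC = "ACGTRYSWKMBDHVN"
IUPAC_COMPLEMENT = "TGCAYRSWMKVHDBN"

# Cover lowercase input too; str.translate leaves unmapped characters alone.
TABLE = str.maketrans(IUPAC + IUPAC.lower(),
                      IUPAC_COMPLEMENT + IUPAC_COMPLEMENT.lower())


def rev_compl_iupac(seq):
    """Reverse complement of a DNA string that may contain IUPAC codes."""
    return seq.translate(TABLE)[::-1]


print(rev_compl_iupac("TCGGGCCC"))    # GGGCCCGA
print(rev_compl_iupac("ACGTRYKMN-"))  # -NKMRYACGT
```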

Solution 8:^[8]

This may be the quickest way to compute a reverse complement:

def complement(seq):
    complementary = { 'A':'T', 'T':'A', 'G':'C','C':'G' }
    return ''.join(reversed([complementary[i] for i in seq]))
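Speed claims like this one (and the one in Solution 4) depend on the interpreter and the input, so they are worth measuring. Below is a minimal sketch using the standard `timeit` module on Python 3. The function names and the test sequence are made up for illustration, and no particular timings are claimed.

```python
import timeit

COMPLEMENT = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
TABLE = str.maketrans("ACGT", "TGCA")


def rc_join(seq):
    # Dictionary lookup per base, as in the answer above.
    return ''.join(reversed([COMPLEMENT[base] for base in seq]))


def rc_translate(seq):
    # Translation table plus slice reversal, as in Solution 3.
    return seq.translate(TABLE)[::-1]


seq = "TCGGGCCC" * 1000  # arbitrary test input

# Both variants must agree before their timings are comparable.
assert rc_join(seq) == rc_translate(seq)

for func in (rc_join, rc_translate):
    seconds = timeit.timeit(lambda: func(seq), number=200)
    print(func.__name__, round(seconds, 4))
```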

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Trenton McKinney
Solution 2
Solution 3	Nathan M
Solution 4	alphahmed
Solution 5	niksy
Solution 6	Kiran Maniya
Solution 7
Solution 8	Get_Richie
