Edit only specific lines when I find a special character with awk
I have a file like this:
>AX-89948491-minus
CTAACACATTTAGTAGATT
>AX-89940152-plus
cgtcattcagggcaggtggggcaaaA
>AX-89922107-plus
TTATAACTTGTGTATGCTCTCAGGCT
When a line starts with ">" and contains "minus", I need to reverse (rev) and translate (tr) the line that follows it. I should get:
>AX-89948491-minus
AATCTACTAAATGTGTTAG
>AX-89940152-plus
cgtcattcagggcaggtggggcaaaA
>AX-89922107-plus
TTATAACTTGTGTATGCTCTCAGGCT
I would like to do this with awk. I tried the following, but it does not work:
awk '{if(NR%2==1~/"plus"/){print;getline;print} else if (NR%2==1~/"minus"/){system("echo "$0" | rev | tr ATCGatcg TAGCtagc")} else {print;getline;print}}' file
Any help?
Solution 1:[1]
This GNU awk script should work for you:
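awk '
p {
    # Previous line was a "minus" header: reverse and complement this line
    cmd = "rev <<< \047" $0 "\047 | tr ATCGatcg TAGCtagc"
    if ((cmd |& getline var) > 0)
        $0 = var
    close(cmd)   # release the coprocess so descriptors are not exhausted
}
{
    p = /^>/ && /-minus/
} 1' file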
>AX-89948491-minus
AATCTACTAAATGTGTTAG
>AX-89940152-plus
cgtcattcagggcaggtggggcaaaA
>AX-89922107-plus
TTATAACTTGTGTATGCTCTCAGGCT
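Two portability notes on the script above: |& is a gawk extension, and the <<< here-string is a shell extension, so the command fails where the sh that awk invokes has no here-strings (dash, for example). A minimal sketch that keeps the same rev | tr idea with a plain pipe and echo is shown below. The function name revcomp_via_shell is hypothetical, and the quoting assumes that sequence lines never contain a single quote.

```shell
# revcomp_via_shell: hypothetical helper name; reads the FASTA-like text on stdin.
revcomp_via_shell() {
  awk '
    p {
      # Build "echo SEQ | rev | tr ..." with SEQ wrapped in single quotes (\047).
      # Assumption: the sequence itself contains no single quote.
      cmd = "echo \047" $0 "\047 | rev | tr ATCGatcg TAGCtagc"
      if ((cmd | getline var) > 0)
        $0 = var
      close(cmd)   # one pipe per line, so close it each time
    }
    # Remember whether the current line is a minus header.
    { p = /^>/ && /-minus/ }
    1
  '
}

# Usage: revcomp_via_shell < file
```

This still starts a shell pipeline for every target line, so it is only a portability fix, not a performance one.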
Solution 2:[2]
Awk is a tool to manipulate text, not a tool to sequence calls to other tools. The latter is what a shell is for. There are times when you need to call other tools from awk but not when it's simple text manipulation like reversing and translating characters in a string as you want to do.
The following works with any awk in any shell on every Unix box, and it does not spawn a subshell once per target input line to call other Unix tools (including rev, which is not defined by POSIX and won't exist on some Unix boxes):
$ cat tst.awk
BEGIN {
    split("ATCGatcg TAGCtagc",tmp)
    for (i=1; i<=length(tmp[1]); i++) {
        tr[substr(tmp[1],i,1)] = substr(tmp[2],i,1)
    }
}
f {
    out = ""
    for (i=1; i<=length($0); i++) {
        char = substr($0,i,1)
        out = (char in tr ? tr[char] : char) out
    }
    $0 = out
    f = 0
}
/^>.*minus/ { f=1 }
{ print }
$ awk -f tst.awk file
>AX-89948491-minus
AATCTACTAAATGTGTTAG
>AX-89940152-plus
cgtcattcagggcaggtggggcaaaA
>AX-89922107-plus
TTATAACTTGTGTATGCTCTCAGGCT
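The script above transforms only the single line after each minus header, which matches the sample. If a record's sequence were wrapped over several lines (an assumption; the question does not show such input), the lines of each record would have to be joined before reversing. A hedged sketch of that variant, using the hypothetical names revcomp_records and emit, and printing each sequence on one line:

```shell
# revcomp_records: hypothetical helper name; reads the FASTA-like text on stdin.
revcomp_records() {
  awk '
    BEGIN {
      # Build the complement lookup table once.
      from = "ATCGatcg"; to = "TAGCtagc"
      for (i = 1; i <= length(from); i++)
        comp[substr(from, i, 1)] = substr(to, i, 1)
    }
    # Print the buffered sequence, reverse-complemented for minus records.
    function emit(   i, c, out) {
      if (seq == "") return
      if (minus) {
        out = ""
        for (i = 1; i <= length(seq); i++) {
          c = substr(seq, i, 1)
          out = ((c in comp) ? comp[c] : c) out   # prepend to reverse
        }
        seq = out
      }
      print seq
      seq = ""
    }
    # A header ends the previous record and starts a new one.
    /^>/ { emit(); minus = /minus/; print; next }
    # Any other line is part of the current sequence.
    { seq = seq $0 }
    END { emit() }
  '
}

# Usage: revcomp_records < file
```

On the sample file from the question, where every sequence is already on one line, this should give the same output as tst.awk. Unlike tst.awk, it does not preserve the original line wrapping.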
Solution 3:[3]
I'd use Perl, as it has built-in reverse and tr operations:
perl -lpe '
    if (/^>/) {$rev = /minus/; next}
    if ($rev) {$_ = reverse; tr/ATCGatcg/TAGCtagc/}
' file
>AX-89948491-minus
AATCTACTAAATGTGTTAG
>AX-89940152-plus
cgtcattcagggcaggtggggcaaaA
>AX-89922107-plus
TTATAACTTGTGTATGCTCTCAGGCT
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | glenn jackman |
