'Make a column with values borrowed from another column and negative sign from another

I would like to make a copy of a column in a new column but replace all instances of '0' with '1E+308' but borrow the '-' or '+' sign from another column. My data frame looks as follows

Gene    log2FoldChange  pvalue
MAN_08191   2.237973215 6.78E-307
MAN_10213   -4.701066739    7.95E-306
MAN_02441   -5.084348578    0
MAN_02858   -4.200656711    5.33E-304
MAN_03363   -2.580385454    0
MAN_01699   3.003051841 4.72E-303
MAN_02586   5.200458256 0

I would like to get the following new column "Rank"

Gene    log2FoldChange  pvalue  Rank
MAN_08191   2.237973215 6.78E-307   6.78E-307
MAN_10213   -4.701066739    7.95E-306   -7.95E-306
MAN_02441   -5.084348578    0   -1E+308
MAN_02858   -4.200656711    5.33E-304   -5.33E-304
MAN_03363   -2.580385454    0   -1E+308
MAN_01699   3.003051841 4.72E-303   4.72E-303
MAN_02586   5.200458256 0   1E+308


Solution 1:[1]

You can use sign()

library(dplyr)

dt %>% 
  mutate(rank = sign(log2FoldChange)*(if_else(pvalue==0,1E+308,pvalue)))

Output:


       Gene log2FoldChange    pvalue       rank
1 MAN_08191       2.237973 6.78e-307  6.78e-307
2 MAN_10213      -4.701067 7.95e-306 -7.95e-306
3 MAN_02441      -5.084349  0.00e+00 -1.00e+308
4 MAN_02858      -4.200657 5.33e-304 -5.33e-304
5 MAN_03363      -2.580385  0.00e+00 -1.00e+308
6 MAN_01699       3.003052 4.72e-303  4.72e-303
7 MAN_02586       5.200458  0.00e+00  1.00e+308

Solution 2:[2]

If awk is an option

$ awk 'NR == 1 { $( NF+1 ) = "Rank" } NR > 1 {split($2,a,"[0-9]"); ( $NF==0 ? $(NF+1)=a[1]"1E+308" : $( NF+1 ) = $NF) }1' input_file
Gene log2FoldChange pvalue Rank
MAN_08191 2.237973215 6.78E-307 6.78E-307
MAN_10213 -4.701066739 7.95E-306 7.95E-306
MAN_02441 -5.084348578 0 -1E+308
MAN_02858 -4.200656711 5.33E-304 5.33E-304
MAN_03363 -2.580385454 0 -1E+308
MAN_01699 3.003051841 4.72E-303 4.72E-303
MAN_02586 5.200458256 0 1E+308

Solution 3:[3]

awk '
    NR==1 {$(NF+1) = "rank"}
    NR!=1 {
      $(NF+1)=($2 ~ /-/ ? "-" : "")($NF == 0 ? "1E+308" : $NF)
    }1' input_file|column -s ' ' -t

Gene       log2FoldChange  pvalue     rank
MAN_08191  2.237973215     6.78E-307  6.78E-307
MAN_10213  -4.701066739    7.95E-306  -7.95E-306
MAN_02441  -5.084348578    0          -1E+308
MAN_02858  -4.200656711    5.33E-304  -5.33E-304
MAN_03363  -2.580385454    0          -1E+308
MAN_01699  3.003051841     4.72E-303  4.72E-303
MAN_02586  5.200458256     0          1E+308

Solution 4:[4]

INPUT

Gene    log2FoldChange  pvalue
MAN_08191   2.237973215 6.78E-307
MAN_10213   -4.701066739    7.95E-306
MAN_02441   -5.084348578    0
MAN_02858   -4.200656711    5.33E-304
MAN_03363   -2.580385454    0
MAN_01699   3.003051841 4.72E-303
MAN_02586   5.200458256 0

CODE

{m,g}awk 'sub("$","\t"( !_==NR ? "Rank"  \
                      : !+$+NF ? "1E+308" \
                      : /^[^ \t]+[ \t]+[-]/ ? "-" $NF : $NF))'

OUTPUT

Gene    log2FoldChange  pvalue  Rank
MAN_08191   2.237973215 6.78E-307   6.78E-307
MAN_10213   -4.701066739    7.95E-306   -7.95E-306
MAN_02441   -5.084348578    0   1E+308
MAN_02858   -4.200656711    5.33E-304   -5.33E-304
MAN_03363   -2.580385454    0   1E+308
MAN_01699   3.003051841 4.72E-303   4.72E-303
MAN_02586   5.200458256 0   1E+308

if you want it nicely formatted, then it's

CODE

gawk -e '$(_<_)=sprintf("%-15s %+20s %15s %15s", $(_^=_<_), $++_, $++_,
                       ( (+$_) * (($--_)~"^[-]") ? "-" : "")       \
                       (--_==NR ? "Rank" : !+$NF ? "1E+308" : $NF ) )'

OUTPUT

  Gene            log2FoldChange          pvalue            Rank
  MAN_08191          2.237973215       6.78E-307       6.78E-307
  MAN_10213         -4.701066739       7.95E-306      -7.95E-306
  MAN_02441         -5.084348578               0          1E+308
  MAN_02858         -4.200656711       5.33E-304      -5.33E-304
  MAN_03363         -2.580385454               0          1E+308
  MAN_01699          3.003051841       4.72E-303       4.72E-303
  MAN_02586          5.200458256               0          1E+308

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 HatLess
Solution 3
Solution 4 RARE Kpop Manifesto