Apply log2 transformation only to numeric columns of a data.frame

I am trying to run a log2 transformation on my data set, but I keep getting an error that says "non-numeric variable(s) in data frame". My data was loaded with row.names = 1 and header = TRUE and is of class data.frame.

I tried adding lapply(na.strings), but this does not fix the problem.

Shared_DEGs <- cbind(UT.Degs_heatmap[2:11], MT.Degs_heatmap[2:11], HT.Degs_heatmap[2:11])
Shared_DEGs1 <- `row.names<-`(Shared_DEGs, (UT.Degs_heatmap[,1]))
MyData.INF.log2 <- log2(Shared_DEGs1)

The expected output is the log2-transformed data.



Solution 1:[1]

Yet another way using base R's rapply, kindly using the data provided by @r2evans.

rapply(mydf, f = log2, classes = c("numeric", "integer"), how = "replace")
#       num      int chr  lgl
#1 1.651496 2.321928   A TRUE

Solution 2:[2]

I always recommend using the 'tidyverse' to process data frames. Install it with install.packages('tidyverse').

library(tidyverse)
log2_transformed <- mutate_if(your_data, is.numeric, log2)
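In dplyr 1.0.0 and later, mutate_if() is superseded by across(), so the same idea can also be written as below. This is only a sketch: the contents of your_data here are made up as a stand-in for the real data.

```r
library(dplyr)

# Hypothetical stand-in for the real data (column names are invented).
your_data <- data.frame(
  num = c(1, 2, 8),
  int = c(4L, 16L, 32L),
  chr = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Apply log2 to every numeric column; other columns pass through unchanged.
log2_transformed <- mutate(your_data, across(where(is.numeric), log2))
```

Integer columns count as numeric here, so they come back as doubles after the transformation.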

Solution 3:[3]

Running log2 (or other numeric computations) on a data.frame as a whole only works when every column is numeric. With mixed column types, you need to do it per column instead. Since we don't have your data, I'll generate something to fully demonstrate:

mydf <- data.frame(num = pi, int = 5L, chr = "A", lgl = TRUE, stringsAsFactors = FALSE)
mydf
#        num int chr  lgl
# 1 3.141593   5   A TRUE
isnum <- sapply(mydf, is.numeric)
isnum
#   num   int   chr   lgl 
#  TRUE  TRUE FALSE FALSE 
mydf[,isnum] <- lapply(mydf[,isnum], log2)
mydf
#        num      int chr  lgl
# 1 1.651496 2.321928   A TRUE

What I'm doing here:

  • isnum is a logical vector marking which columns are numeric (integer or double). This logical indexing can be extended to include things like "nothing negative" or "no NAs", completely up to you.
  • mydf[,isnum] subsets the data to just those columns. (If only one column is numeric, mydf[,isnum] drops to a plain vector; mydf[isnum] always returns a frame.)
  • lapply(mydf[,isnum], log2) runs the function log2 against each column of the sub-frame individually; what is passed to log2 is a vector of numbers, not a data.frame as in your attempt.
  • mydf[,isnum] <- lapply(...): normally, if we did mydf <- lapply(...), we would be storing a plain list, which overwrites the previous object (losing the non-numeric columns) and is no longer a frame. By using the underlying R function [<- (assignment to a subset) instead, we replace the selected columns of the frame while (a) preserving the other columns and (b) keeping the "class" of the parent frame.
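The first and last points can be sketched as follows. Treat it as an illustration only: df and its column names are made up, and "no NAs and strictly positive" is just one possible rule for extending the mask.

```r
# Hypothetical example data; the column names are invented for illustration.
df <- data.frame(
  pos  = c(1, 2, 4),    # strictly positive numbers
  neg  = c(-1, 2, 4),   # contains a negative value
  miss = c(1, NA, 4),   # contains an NA
  chr  = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Extended version of isnum: numeric, no NAs, and strictly positive.
ok <- vapply(
  df,
  function(x) is.numeric(x) && !anyNA(x) && all(x > 0),
  logical(1)
)

# lapply() on its own returns a plain list, not a data.frame ...
as_list <- lapply(df[ok], log2)

# ... whereas assigning into the subset keeps the frame and the other columns.
df[ok] <- as_list
```

Here only pos is transformed. The columns neg, miss and chr are left exactly as they were, and df is still a data.frame.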

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 markus
Solution 2 lkq
Solution 3