Background subtraction for Luminex data

I have a large data table similar to this:

library(data.table)

mydata <- fread(
  "PID,Stim,Analyte1,Analyte2,Analyte3
  123, SA, 5678, 6578, 4893
  123, UNS, 56, 45, 67
  123, LPS, 4467, 4356, 8573
  234, SA, 6657, 4455, 6589
  234, UNS, 89, 35, 67
  234, LPS, 4578, 7856, 9845
  567, SA, 7888, 5637, 6784
  567, UNS, 85, 67, 87
  567, LPS, 5673, 4834, 4893
")

I would like to perform background subtraction for each PID (patient ID), based on the Stim column: for each patient and each analyte, subtract the unstimulated (UNS) value from each stimulated condition (SA and LPS). I'm fairly new to R, so I have no example of what I've tried so far.

What would be the best way to perform background subtraction when dealing with a large data table?



Solution 1:[1]

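You could join subsets of the data.table on PID, so that each stimulated row (SA, LPS) is matched to the unstimulated (UNS) row of the same patient:

mydata[Stim=='UNS'][mydata[Stim!='UNS'],on=.(PID)]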

     PID   Stim Analyte1 Analyte2 Analyte3 i.Stim i.Analyte1 i.Analyte2 i.Analyte3
   <int> <char>    <int>    <int>    <int> <char>      <int>      <int>      <int>
1:   123    UNS       56       45       67     SA       5678       6578       4893
2:   123    UNS       56       45       67    LPS       4467       4356       8573
3:   234    UNS       89       35       67     SA       6657       4455       6589
4:   234    UNS       89       35       67    LPS       4578       7856       9845
5:   567    UNS       85       67       87     SA       7888       5637       6784
6:   567    UNS       85       67       87    LPS       5673       4834       4893
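
As a minimal sketch of where the `i.` prefix comes from: in `X[Y, on = ...]` the result has one row per row of `Y`, the columns of `X` keep their names, and columns of `Y` whose names collide with `X` are returned with an `i.` prefix. The toy tables `X` and `Y` below are hypothetical and only stand in for the UNS and stimulated subsets.

```r
library(data.table)

# Hypothetical toy tables: X plays the role of the UNS subset,
# Y the role of the stimulated (SA/LPS) subset.
X <- data.table(PID = c(1L, 2L), val = c(10, 20))
Y <- data.table(PID = c(1L, 1L, 2L), val = c(100, 200, 300))

# One output row per row of Y; X's `val` keeps its name,
# Y's colliding `val` comes back as `i.val`.
joined <- X[Y, on = .(PID)]

names(joined)
nrow(joined)
```

This is why, in the join above, `Analyte1` holds the unstimulated value and `i.Analyte1` the stimulated one.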

The join lets you subtract the unstimulated value of each analyte from the stimulated one:

mydata[Stim=='UNS'][mydata[Stim!='UNS'],.(PID, Stim = i.Stim, 
                                          Sub1 = i.Analyte1 - Analyte1,
                                          Sub2 = i.Analyte2 - Analyte2,
                                          Sub3 = i.Analyte3 - Analyte3),
                    on=.(PID)]

     PID   Stim  Sub1  Sub2  Sub3
   <int> <char> <int> <int> <int>
1:   123     SA  5622  6533  4826
2:   123    LPS  4411  4311  8506
3:   234     SA  6568  4420  6522
4:   234    LPS  4489  7821  9778
5:   567     SA  7803  5570  6697
6:   567    LPS  5588  4767  4806

To generalize over more analytes, you could build the j expression as a string and evaluate it:

nbAnalytes <- 3
expr<- paste("list(PID, Stim = i.Stim,",paste(sapply(1:nbAnalytes,function(i) paste0("Sub", i, "= i.Analyte",i," - Analyte",i)),collapse=','),")")
expr
# [1] "list(PID, Stim = i.Stim, Sub1= i.Analyte1 - Analyte1,Sub2= i.Analyte2 - Analyte2,Sub3= i.Analyte3 - Analyte3 )"


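# evaluate the generated expression in j of the same join as above
mydata[Stim=='UNS'][mydata[Stim!='UNS'], eval(rlang::parse_expr(expr)), on=.(PID)]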

     PID   Stim  Sub1  Sub2  Sub3
   <int> <char> <int> <int> <int>
1:   123     SA  5622  6533  4826
2:   123    LPS  4411  4311  8506
3:   234     SA  6568  4420  6522
4:   234    LPS  4489  7821  9778
5:   567     SA  7803  5570  6697
6:   567    LPS  5588  4767  4806
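
As an alternative sketch that avoids building code as a string, the analyte columns can be selected by name pattern and subtracted column by column. This is a different technique from the join above: it uses a `match()` lookup instead. It assumes that every analyte column name starts with `Analyte` and that each patient has at most one `UNS` row. The names `analytes`, `uns`, `stim`, `bg` and `res` are illustrative only.

```r
library(data.table)

# Same example data as in the question, built directly.
mydata <- data.table(
  PID      = rep(c(123L, 234L, 567L), each = 3),
  Stim     = rep(c("SA", "UNS", "LPS"), times = 3),
  Analyte1 = c(5678, 56, 4467, 6657, 89, 4578, 7888, 85, 5673),
  Analyte2 = c(6578, 45, 4356, 4455, 35, 7856, 5637, 67, 4834),
  Analyte3 = c(4893, 67, 8573, 6589, 67, 9845, 6784, 87, 4893)
)

# Assumption: analyte columns are identified by their name prefix.
analytes <- grep("^Analyte", names(mydata), value = TRUE)

uns  <- mydata[Stim == "UNS"]   # background rows, one per patient
stim <- mydata[Stim != "UNS"]   # rows to correct

# For each stimulated row, look up the background row of the same patient.
bg <- uns[match(stim$PID, uns$PID), ..analytes]

# Subtract the background column by column, overwriting the analyte columns.
res <- copy(stim)
res[, (analytes) := Map(`-`, .SD, bg), .SDcols = analytes]
res
```

A patient without a `UNS` row would get `NA` in the subtracted columns, because `match()` returns `NA` for that PID.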

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow