+ operator vs * operator in DESeq2 design to account for batch effects in unbalanced-group RNA-seq data

First post, extremely new to R. Please forgive my naivete, and any mistakes in making my question clear.

I have RNA-seq data from whole blood of patients and healthy subjects. Unfortunately, I sequenced the samples before understanding how serious batch effects can be, and the unbalanced groups make the problem even worse.

Batch 1: 41 samples total = 19 healthy + 22 patients

Batch 2: 23 samples total = 2 healthy + 21 patients
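To make the imbalance concrete, here is a minimal sketch in base R that lays the counts above out as a Batch by Class table. The object names (`design_counts`, `healthy_share`) are only illustrative and are not part of my actual script.

```r
# Counts copied from the batch breakdown above; object names are illustrative.
design_counts <- matrix(
  c(19, 22,
     2, 21),
  nrow = 2, byrow = TRUE,
  dimnames = list(Batch = c("1", "2"), Class = c("healthy", "patient"))
)

# Cross-tabulation with row and column totals.
print(addmargins(design_counts))

# Fraction of healthy samples within each batch.
healthy_share <- design_counts[, "healthy"] / rowSums(design_counts)
print(round(healthy_share, 2))
```

With the real sample table, `table(coldata2$Batch, coldata2$Class)` should give the same layout, assuming the columns are named `Batch` and `Class` as in the design formula.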

As you can see, there is a very large discrepancy in the number of healthy subjects between the two batches. I have tried including batch as part of the design:

    library(DESeq2)
    # "Class" distinguishes patients from healthy subjects
    dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata2,
                                  design = ~ Batch + Class)

This approach yielded a suspiciously large number of DE genes (~20,000 of the ~25,000 genes measured).

I read a related post that suggested using design = ~Batch * Class instead.

I tried that and got a more "reasonable" number of DE genes (~3000), but I do not understand how the use of the * operator affects the design. Can anyone explain or link something that can help me understand?
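For reference, here is a minimal sketch of how the two formulas expand, using base R's `model.matrix` on a hypothetical toy sample table (`coldata_toy` is made up for illustration; it is not my real `coldata2`). It only shows which coefficients each design contains, not what DESeq2 reports for my data.

```r
# Hypothetical toy sample table; the real coldata2 is not shown in this post.
coldata_toy <- data.frame(
  Batch = factor(c("1", "1", "1", "1", "2", "2", "2", "2")),
  Class = factor(c("healthy", "patient", "healthy", "patient",
                   "healthy", "patient", "patient", "patient"),
                 levels = c("healthy", "patient"))
)

# Additive design: a batch offset plus a single Class effect
# that is assumed to be the same in both batches.
mm_add <- model.matrix(~ Batch + Class, data = coldata_toy)

# Interaction design: ~ Batch * Class is shorthand for
# ~ Batch + Class + Batch:Class, so the Class effect may differ by batch.
mm_int <- model.matrix(~ Batch * Class, data = coldata_toy)

print(colnames(mm_add))
print(colnames(mm_int))
```

The interaction design has one extra column, the `Batch:Class` term. In that design the main `Class` coefficient describes patient vs healthy in the reference batch only, and as far as I understand, `results()` called without `name` or `contrast` returns the last coefficient, which here would be the interaction term rather than the overall patient vs healthy comparison. That might explain the different gene counts, but this is a guess.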



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
