Different ways to subset specific columns and rows to extract specific information from the data
In my data, I have to extract metadata (mouse sex, age, phase, etc.) from within the data itself. The metadata is in the first two columns of the first 38 rows. The lab uses a script that the primary coder wrote years ago, and unfortunately he no longer works in the lab. As an intermediate R user, I am trying to learn the structure behind his code and get better at R. However, I am finding it hard to understand one particular line ("metaDataRange") from the primary coder's version, shown below.
# Primary coder's version - what is this code really doing?
metaDataRange = c(1: as.numeric(colnames(Trial)[2]))
# My version - gives the same output. What is the purpose of
# "as.numeric" in the coder's version above?
metaDataRange = c(1:colnames(Trial)[2])
The code below just extracts the metadata from the data:
# This says that the metadata is in the first 2 columns
# of the rows given by metaDataRange.
meta <- Trial[metaDataRange, c(1,2)]
# My version - gives the same output as the primary coder's
# version above. What is the purpose of "metaDataRange"
# when "meta" can be extracted directly from "Trial"?
meta <- Trial[ c(1:38), c(1,2)]
# Identify the sex
sex <- meta[grepl("Sex", meta[, 1]), 2]
# Identify the phase
phase <- meta[grepl("Phase", meta[, 1]), 2]
# Identify the state of fasting
State <- meta[grepl("State", meta[, 1]), 2]
# Get the mouse ID
mouse <- meta[33, 2]
The output of "metaDataRange" is the same for both my version and the coder's version:
> metaDataRange
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
I do not understand the purpose of "metaDataRange" in the primary coder's version. It just returns a vector of numbers from 1 to 42, which I assume is the number of rows? But the [2] in the metaDataRange code looks to me like it is subsetting column 2. Am I right? If so, then there are only 38 columns, not 42. I am confused about whether 42 is the row length or the column length. I also do not understand why "as.numeric" is important in metaDataRange, since my version of metaDataRange above gives the same output. And, as mentioned in the comments above, "meta" can extract the metadata from the data file (Trial) directly, so why is "metaDataRange" even necessary?

So, in summary, my main confusion is the purpose of "metaDataRange". One important detail is that the code I have shown is nested inside for loops. I wonder whether that has something to do with "metaDataRange", since in some data files the metadata is not always in the first 38 rows but ends somewhere in rows 38-41. Any answers will be very helpful!
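For reference, the two variants can be reproduced on a toy data frame. Since metaDataRange comes out as 1 to 42, colnames(Trial)[2] must be the string "42" in the real file; the sketch below assumes a smaller stand-in where that column name is "4". The data frame contents, the first column name, and the helper names n_meta, range_explicit and range_implicit are all hypothetical and only there for illustration.

```r
# Toy stand-in for the real file (hypothetical values): the NAME of the
# second column holds a row count as a string, here "4".
Trial <- data.frame(
  key   = c("Sex", "Phase", "State", "Mouse ID", "time", "t1"),
  value = c("F", "Dark", "Fed", "M01", "reading", "0.5"),
  stringsAsFactors = FALSE
)
colnames(Trial) <- c("Metadata rows", "4")

# [2] picks the second column NAME, not the second column's data.
n_meta <- colnames(Trial)[2]   # a character string

# Explicit conversion, as in the primary coder's version.
range_explicit <- 1:as.numeric(n_meta)

# Implicit conversion: the `:` operator coerces the string to a number.
range_implicit <- 1:n_meta

# Either range selects the same metadata rows.
meta <- Trial[range_explicit, c(1, 2)]
sex  <- meta[grepl("Sex", meta[, 1]), 2]
```

In this sketch both ranges are the same because `:` coerces its arguments to numeric, so as.numeric only makes the conversion explicit. If the header of each real file does hold that file's own row count, reading it from colnames rather than hard-coding 38 would let the same loop body handle files whose metadata block has a different length, but that is an assumption about the files, not something the code shown confirms.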
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
