How to subset a pandas dataframe based on column names of another dataframe that may be in random order?
I want to subset the rows of the raw_clin dataframe, identified by PATIENT_ID, using the column names of the common dataframe.
common dataframe example
common = pd.DataFrame([["PPP1R15A", -0.5880, 1.3980, -0.9402, -0.3741], ["AVPR1A", 1.5472, -0.8588, -0.1703, -0.5198], ["RGR", -0.3225, 0.8372, 0.2006, -0.0271]], columns=['Hugo_Symbol', 'TCGA-02-0010-01', 'TCGA-41-2571-01', 'TCGA-14-1821-01', 'TCGA-32-2632-01'])
raw_clin dataframe example
raw_clin = pd.DataFrame([["TCGA-02-0010-01", "I", "want", "to", "subset"], ["TCGA-14-1821-01", "clin_var", "rownames", "by", "common"], ["TCGA-41-2571-01", "colnames", "where", "the", "latter"], ["TCGA-32-2632-01", "may", "be", "random", "order"]], columns=['PATIENT_ID', 'Something1', 'something2', 'something3', 'something4'])
desired output
raw_clin = pd.DataFrame([["TCGA-02-0010-01", "I", "want", "to", "subset"], ["TCGA-41-2571-01", "colnames", "where", "the", "latter"], ["TCGA-14-1821-01", "clin_var", "rownames", "by", "common"], ["TCGA-32-2632-01", "may", "be", "random", "order"]], columns=['PATIENT_ID', 'Something1', 'something2', 'something3', 'something4'])
My attempt yielded no matches:
raw_clin = raw_clin[raw_clin.index.isin(common.columns)]
Solution 1:[1]
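If I understand correctly, the row names you mention are the index, so you first need to call set_index('PATIENT_ID') on that dataframe. Without it, raw_clin.index is the default integer index (0, 1, 2, 3), which is why isin found no matches.
After that, your code raw_clin = raw_clin[raw_clin.index.isin(common.columns)] will select the matching rows. Note that a boolean mask keeps the rows in raw_clin's own order; it does not reorder them to follow common.columns.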
raw_clin = pd.DataFrame([["TCGA-02-0010-01", "I", "want", "to", "subset"], ["TCGA-14-1821-01", "clin_var", "rownames", "by", "common"], ["TCGA-41-2571-01", "colnames", "where", "the", "latter"], ["TCGA-32-2632-01", "may", "be", "random", "order"]], columns=['PATIENT_ID', 'Something1', 'something2', 'something3', 'something4']).set_index('PATIENT_ID')
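If the rows also need to follow the column order of common, as in the desired output, one possible approach is to select by label with .loc instead of a boolean mask. This is only a sketch under a few assumptions: Hugo_Symbol is the only non-sample column in common, PATIENT_ID values are unique, and the names sample_ids, indexed, ordered_ids and result are mine, not from the question.

```python
import pandas as pd

common = pd.DataFrame(
    [["PPP1R15A", -0.5880, 1.3980, -0.9402, -0.3741],
     ["AVPR1A", 1.5472, -0.8588, -0.1703, -0.5198],
     ["RGR", -0.3225, 0.8372, 0.2006, -0.0271]],
    columns=["Hugo_Symbol", "TCGA-02-0010-01", "TCGA-41-2571-01",
             "TCGA-14-1821-01", "TCGA-32-2632-01"],
)
raw_clin = pd.DataFrame(
    [["TCGA-02-0010-01", "I", "want", "to", "subset"],
     ["TCGA-14-1821-01", "clin_var", "rownames", "by", "common"],
     ["TCGA-41-2571-01", "colnames", "where", "the", "latter"],
     ["TCGA-32-2632-01", "may", "be", "random", "order"]],
    columns=["PATIENT_ID", "Something1", "something2", "something3", "something4"],
)

# Assumption: every column of common except Hugo_Symbol is a sample ID.
sample_ids = common.columns.drop("Hugo_Symbol")

# Make PATIENT_ID the index so rows can be looked up by label.
indexed = raw_clin.set_index("PATIENT_ID")

# Keep only IDs present in both frames, in the order they appear in common.
ordered_ids = [s for s in sample_ids if s in indexed.index]

# .loc with a list of labels returns the rows in exactly that order;
# reset_index turns PATIENT_ID back into a regular column.
result = indexed.loc[ordered_ids].reset_index()
print(result)
```

Filtering the IDs first means .loc is never asked for a label that is missing from raw_clin, which would otherwise raise a KeyError.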
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Vignesh |
