How to subset a pandas dataframe based on column names of another dataframe that may be in random order?

I want to subset the row names of the raw_clin dataframe by the column names of the common dataframe.

common dataframe example

import pandas as pd

common = pd.DataFrame(
    [
        ["PPP1R15A", -0.5880, 1.3980, -0.9402, -0.3741],
        ["AVPR1A", 1.5472, -0.8588, -0.1703, -0.5198],
        ["RGR", -0.3225, 0.8372, 0.2006, -0.0271],
    ],
    columns=['Hugo_Symbol', 'TCGA-02-0010-01', 'TCGA-41-2571-01', 'TCGA-14-1821-01', 'TCGA-32-2632-01'],
)

raw_clin dataframe example

raw_clin = pd.DataFrame([["TCGA-02-0010-01", "I", "want", "to", "subset"], ["TCGA-14-1821-01", "clin_var", "rownames", "by", "common"], ["TCGA-41-2571-01", "colnames", "where", "the", "latter"], ["TCGA-32-2632-01", "may", "be", "random", "order"]], columns=['PATIENT_ID', 'Something1', 'something2', 'something3', 'something4'])

desired output

raw_clin = pd.DataFrame([["TCGA-02-0010-01", "I", "want", "to", "subset"], ["TCGA-41-2571-01", "colnames", "where", "the", "latter"], ["TCGA-14-1821-01", "clin_var", "rownames", "by", "common"], ["TCGA-32-2632-01", "may", "be", "random", "order"]], columns=['PATIENT_ID', 'Something1', 'something2', 'something3', 'something4'])

My attempt yielded no matches:

raw_clin = raw_clin[raw_clin.index.isin(common.columns)]


Solution 1:[1]

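If I understand correctly, by "row names" you mean the index. As constructed, raw_clin has a default integer index (0, 1, 2, 3), so raw_clin.index.isin(common.columns) matches nothing. You need to make PATIENT_ID the index with set_index first.

Then your code raw_clin = raw_clin[raw_clin.index.isin(common.columns)] will select the matching rows. Note that isin only filters: the rows stay in raw_clin's original order, so matching the column order of common requires an additional reordering step (for example with .loc).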

raw_clin = pd.DataFrame([["TCGA-02-0010-01", "I", "want", "to", "subset"], ["TCGA-14-1821-01", "clin_var", "rownames", "by", "common"], ["TCGA-41-2571-01", "colnames", "where", "the", "latter"], ["TCGA-32-2632-01", "may", "be", "random", "order"]], columns=['PATIENT_ID', 'Something1', 'something2', 'something3', 'something4']).set_index('PATIENT_ID')
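The desired output in the question lists the rows in the same order as the columns of common, which a boolean isin mask alone does not produce. One possible way to get that order, as a minimal sketch using the example data from the question, is to build the list of IDs from common.columns and pass it to .loc. The names sample_ids and subset are illustrative and not from the original post.

```python
import pandas as pd

common = pd.DataFrame(
    [
        ["PPP1R15A", -0.5880, 1.3980, -0.9402, -0.3741],
        ["AVPR1A", 1.5472, -0.8588, -0.1703, -0.5198],
        ["RGR", -0.3225, 0.8372, 0.2006, -0.0271],
    ],
    columns=["Hugo_Symbol", "TCGA-02-0010-01", "TCGA-41-2571-01",
             "TCGA-14-1821-01", "TCGA-32-2632-01"],
)

raw_clin = pd.DataFrame(
    [
        ["TCGA-02-0010-01", "I", "want", "to", "subset"],
        ["TCGA-14-1821-01", "clin_var", "rownames", "by", "common"],
        ["TCGA-41-2571-01", "colnames", "where", "the", "latter"],
        ["TCGA-32-2632-01", "may", "be", "random", "order"],
    ],
    columns=["PATIENT_ID", "Something1", "something2", "something3", "something4"],
).set_index("PATIENT_ID")

# Walk common's columns in order and keep only the names that are
# actually present in raw_clin's index (this drops 'Hugo_Symbol').
sample_ids = [col for col in common.columns if col in raw_clin.index]

# .loc with a list of labels returns the rows in the order of that list.
subset = raw_clin.loc[sample_ids]

print(subset)
```

If PATIENT_ID should remain an ordinary column afterwards, calling subset.reset_index() restores it.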

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Vignesh