Get specific values from multiple columns through pandas

I have 8 populations in VCF files.

I just want to extract the AD and DP values from the NEN_001, NEN_003, NEN_200, NEN_300 and LAB_004 columns. For example, the first AD value is 23,2 and the first DP value is 25.

I have written this function:

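def extract_AD(info):
    # Take the second colon-separated field (AD, e.g. "23,2") and
    # return only its first comma-separated number as an int (23).
    AD = int(info.split(':')[1].split(',')[0])
    return AD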


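# `file` is the DataFrame already loaded from the VCF file.
pop1 = file[["FORMAT","NEN_001","NEN_003","NEN_200","NEN_300","LAB_004","LAB_300","LAB_400","LAB_500"]]

tst1pop1 = pd.DataFrame(pop1)

# Apply the function to every value of a single column.
AD = tst1pop1["NEN_001"].apply(extract_AD)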

But this only works for a single column, "NEN_001" in the example above.

How can I extract the desired values from multiple columns?



Solution 1:[1]

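You could select all the columns you want and use the applymap method, which calls the function on every individual value of the selection. In pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map, which is used the same way.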

cols_to_apply = ["NEN_001","NEN_003","NEN_200","NEN_300","LAB_004","LAB_300","LAB_400","LAB_500"]
AD= tst1pop1[cols_to_apply].applymap(extract_AD)
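
The question also asks for DP and for the full AD value (23,2), which extract_AD does not return. A minimal sketch of one way to get both for several columns at once is shown here. It assumes every sample field has the layout GT:AD:DP; the sample rows and the helper name field are hypothetical, since the real data is not shown in the question.

```python
import pandas as pd

# Hypothetical sample data: the real VCF rows are not shown in the question.
# Assumes every sample field looks like "GT:AD:DP", e.g. "0/1:23,2:25".
tst1pop1 = pd.DataFrame({
    "FORMAT": ["GT:AD:DP", "GT:AD:DP"],
    "NEN_001": ["0/1:23,2:25", "0/0:30,0:30"],
    "NEN_003": ["0/0:18,0:18", "0/1:10,5:15"],
    "LAB_004": ["1/1:0,12:12", "0/1:7,9:16"],
})

cols = ["NEN_001", "NEN_003", "LAB_004"]

def field(column, position):
    """Return the colon-separated field at `position` for every row of a column."""
    return column.str.split(":").str[position]

# DataFrame.apply calls the function once per selected column.
AD = tst1pop1[cols].apply(field, args=(1,))              # strings such as "23,2"
DP = tst1pop1[cols].apply(field, args=(2,)).astype(int)  # integers such as 25
```

Both results keep the original column names, so AD["NEN_001"] and DP["NEN_001"] line up row by row. If the FORMAT column lists the fields in a different order, the positions 1 and 2 would need to change accordingly.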

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 pelelter