'Grep columns by values in different dataframe in python

I have this df1:

CHR   SNP          Pos  Ref Min
1   rs3094315   113934  A   G
1   rs12124819  126070  A   G
1   rs28765502  135853  C   T
1   rs9419478   158202  C   T
1   rs4881551   159076  G   A

and this df2:

CHR     SNP     A1  A2  MAF     NCHROBS
1   rs3094315   G   A   0.1402  214
1   rs12124819  G   A   0.1887  212
1   rs28765502  C   T   0.3113  212
1   rs7419119   G   T   0.2243  214
1   rs950122    C   G   0.1944  216

The first three rows have the same SNP names, so what I want to do is something like merge d2 with df1 based on coincidence in "SNP" and if they match evaluate if "A1" in df2 and "Ref" in df1 are the same, if not just Change the position letters and then subtract 1-MAF value as this at the end:

CHR     SNP     A1  A2  MAF     NCHROBS
1   rs3094315   A   G   0.8598  214
1   rs12124819  A   G   0.8113  212
1   rs28765502  C   T   0.3113  212
1   rs7419119   G   T   0.2243  214
1   rs950122    C   G   0.1944  216

What I tried is

import pandas as pd
import numpy as pd
df3=pd.merge(df2,df1, on=['SNP'])
df3['subtract']=np.where(df3['A1']!=df3['Ref']###print the subtract result 1-MAF= in 'subtract col')

But I don't want to lose those values that don't match in merge and print the subtract result of 1-MAF's value.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source