'python- How do i remove a rows from pandas dataframe by 2 columns value (The values should be a combination of 2 strings )?

I have data frame like this for example:

      Owner   dogs     cats
0       Tim   Snow     Loki
1  Jennifer  Daisy    Bella
2       Bob   Snow  Charlie
3     Alice  Oscar   Willow
4       Leo   Snow     Loki

I have json file:

{ 
"owner": [
{
  "dog": "Snow",
  "cat": "Loki",
},
{
  "dog": "Oscar",
  "cat": "Willow",
}]
}

What is the best way to delete the lines that have both a dog and a cat that appear in the json file.

I need to remove the row by 2 columns values that I take from the json file.

One can also suggest another way to keep the cat and dog combination that if they show up together in a row I want to delete that row from df.

thanks.



Solution 1:[1]

You can do:

owner_json = {
      "owner": [
      {
         "dog": "Snow",
         "cat": "Loki",
      },
      { 
         "dog": "Oscar",
         "cat": "Willow",
      }]
}
owner_df = pd.json_normalize(owner_json['owner'])
owner_df = owner_df.set_index(['dog', 'cat'])
owner_df.index.names = ['dogs', 'cats']


df = pd.DataFrame({
    'Owner':['Tim', 'Jennifer', 'Bob', 'Alice', 'Leo'],
    'dogs':['Snow','Daisy','Snow','Oscar','Snow'],
    'cats':['Loki','Bella','Charlie','Willow','Loki']
})
df = df.set_index(['dogs', 'cats'])
df = df.loc[df.index.difference(owner_df.index)].copy()
df.reset_index(inplace=True)

Output:

    dogs       cats    Owner
0   Daisy     Bella Jennifer
1    Snow   Charlie      Bob

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 SomeDude