'python- How do i remove a rows from pandas dataframe by 2 columns value (The values should be a combination of 2 strings )?
I have data frame like this for example:
Owner dogs cats
0 Tim Snow Loki
1 Jennifer Daisy Bella
2 Bob Snow Charlie
3 Alice Oscar Willow
4 Leo Snow Loki
I have json file:
{
"owner": [
{
"dog": "Snow",
"cat": "Loki",
},
{
"dog": "Oscar",
"cat": "Willow",
}]
}
What is the best way to delete the lines that have both a dog and a cat that appear in the json file.
I need to remove the row by 2 columns values that I take from the json file.
One can also suggest another way to keep the cat and dog combination that if they show up together in a row I want to delete that row from df.
thanks.
Solution 1:[1]
You can do:
owner_json = {
"owner": [
{
"dog": "Snow",
"cat": "Loki",
},
{
"dog": "Oscar",
"cat": "Willow",
}]
}
owner_df = pd.json_normalize(owner_json['owner'])
owner_df = owner_df.set_index(['dog', 'cat'])
owner_df.index.names = ['dogs', 'cats']
df = pd.DataFrame({
'Owner':['Tim', 'Jennifer', 'Bob', 'Alice', 'Leo'],
'dogs':['Snow','Daisy','Snow','Oscar','Snow'],
'cats':['Loki','Bella','Charlie','Willow','Loki']
})
df = df.set_index(['dogs', 'cats'])
df = df.loc[df.index.difference(owner_df.index)].copy()
df.reset_index(inplace=True)
Output:
dogs cats Owner
0 Daisy Bella Jennifer
1 Snow Charlie Bob
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | SomeDude |