pandas compare two data frames and highlight the differences

I'm trying to compare two DataFrames and highlight the differences in the second one.

I have tried using concat and drop_duplicates, but I am not sure how to check specific cells or how to highlight them at the end.



Solution 1:[1]

A possible solution is the following:

import numpy as np
import pandas as pd

# set test data
data1 = {"A": [10, 11, 23, 44], "B": [22, 23, 56, 55], "C": [31, 21, 34, 66], "D": [25, 45, 21, 45]}
data2 = {"A": [10, 11, 23, 44, 56, 23], "B": [44, 223, 56, 55, 73, 56], "C": [31, 21, 45, 66, 22, 22], "D": [25, 45, 26, 45, 34, 12]}

# create dataframes
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)


# define function to highlight differences in dataframes
# data.ne(other) aligns on index and columns, so cells missing from `other` count as different
def highlight_diff(data, other, color='yellow'):
    attr = 'background-color: {}'.format(color)
    return pd.DataFrame(np.where(data.ne(other), attr, ''),
                        index=data.index, columns=data.columns)

# apply style using function (axis=None passes the whole dataframe to the function at once)
df2.style.apply(highlight_diff, axis=None, other=df1)

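Returns

A styled view of df2 in which every cell that differs from df1 is highlighted in yellow. Rows 4 and 5 have no counterpart in df1, so they are highlighted entirely.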
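To check which specific cells differ, without any styling, the same comparison can be inspected as a plain boolean mask. This is only a minimal sketch built on the test data above; the names `mask` and `changed` are illustrative and not part of the original answer.

```python
import pandas as pd

# same test data as in the answer
data1 = {"A": [10, 11, 23, 44], "B": [22, 23, 56, 55], "C": [31, 21, 34, 66], "D": [25, 45, 21, 45]}
data2 = {"A": [10, 11, 23, 44, 56, 23], "B": [44, 223, 56, 55, 73, 56], "C": [31, 21, 45, 66, 22, 22], "D": [25, 45, 26, 45, 34, 12]}
df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

# True where a cell of df2 differs from df1; ne() aligns on labels,
# so rows that exist only in df2 are compared against NaN and come out True
mask = df2.ne(df1)

# collect the differing cells as (row label, column label) pairs
changed = [(row, col) for row in mask.index for col in mask.columns if mask.at[row, col]]

print(mask)
print(changed)
```

The mask has the same shape as df2, which is what `highlight_diff` relies on when it turns each `True` into a CSS string.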

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 gremur