Pandas->Styler: Hide specific cells when two or more rows have cells with the same value
The following data frame has three rows describing the same person, each with a different phone number:
import pandas as pd
data = {'Name': ['tom', 'tom', 'tom', 'nick', 'krish', 'jack'],
        'Age': [20, 20, 20, 21, 19, 18],
        'Phone': [1234, 2345, 4576, 7890, 6767, 7676]}
df = pd.DataFrame(data)
In [16]: df
Out[16]:
    Name  Age  Phone
0    tom   20   1234
1    tom   20   2345
2    tom   20   4576
3   nick   21   7890
4  krish   19   6767
5   jack   18   7676
I would like to generate HTML whose style hides the duplicated cells in the subsequent matching rows, leaving only the values that differ:
table_output
 Name  Age  Phone
  tom   20   1234
             2345
             4576
 nick   21   7890
krish   19   6767
 jack   18   7676
How can I:
- identify the duplicate values (I am using the approach shown below, but perhaps there is a better way), and
- take those values and hide them in a Styler object so that it renders like the table_output above?
In [22]: df.duplicated("Name")
Out[22]:
0 False
1 True
2 True
3 False
4 False
5 False
dtype: bool
In [23]: df.duplicated("Age")
Out[23]:
0 False
1 True
2 True
3 False
4 False
5 False
dtype: bool
So far I've managed to create this:
df_dup = df[df["Name"].duplicated()]
df_dup.style.hide_columns([0,1])
But I couldn't combine df with df_dup style-wise.
Thanks.
Solution 1:[1]
With the following dataframe in a Jupyter notebook:
import pandas as pd

df = pd.DataFrame(
    data={
        "Name": ["tom", "tom", "tom", "nick", "krish", "jack"],
        "Age": [20, 20, 20, 21, 19, 18],
        "Phone": [1234, 2345, 4576, 7890, 6767, 7676],
    }
)
df
You can do this:
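def mask_values(val):
    # Make the cell fully transparent; the value itself is left untouched.
    return "opacity: 0"

# Note: on pandas >= 2.1, Styler.applymap is deprecated in favour of Styler.map.
df.style.applymap(
    mask_values,
    subset=(
        # Rows that repeat an earlier (Name, Age) pair; the first occurrence stays visible.
        df[df.duplicated(subset=["Name", "Age"], keep="first")].index,
        # Only these columns are masked, so Phone is always shown.
        ["Name", "Age"],
    ),
)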
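Output: the Name and Age cells of rows 1 and 2 are rendered fully transparent, so only their Phone values remain visible.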
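One caveat that may matter here: df.duplicated flags a row whenever it matches any earlier row, even one that is not directly above it. If the frame is not sorted and only runs of adjacent repeats should be hidden, a comparison against the previous row is one possible alternative. The sketch below is an assumption about what is wanted, and consecutive_repeats is a hypothetical helper name, not a pandas function.

```python
import pandas as pd


def consecutive_repeats(df, cols):
    """Return a boolean Series that is True where a row repeats, in every
    column of `cols`, the row directly above it."""
    block = df[cols]
    # shift() moves each row down by one position, so every row is compared
    # with its immediate predecessor; the first row is compared with missing
    # values and therefore never counts as a repeat.
    return block.eq(block.shift()).all(axis=1)
```

The resulting mask could be used in place of the duplicated call in the subset, for example df[consecutive_repeats(df, ["Name", "Age"])].index. For the sample frame in the question both approaches select the same rows, since the repeated rows are adjacent.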
You can check that the underlying dataframe is unchanged:
df.loc[0, "Name"] # Output: 'tom'
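Since the question asks for HTML output, here is a minimal sketch of a variant that builds the whole CSS table in one step and passes it to Styler.apply with axis=None, which avoids the applymap/map naming difference between pandas versions. The helper names duplicate_css and hide_repeats are hypothetical, and the default "opacity: 0" rule is simply carried over from the answer above.

```python
import pandas as pd


def duplicate_css(df, cols, css="opacity: 0"):
    """Return a frame of CSS strings with the same index and columns as `df`.
    Cells in `cols` that repeat an earlier row get `css`; all others get ""."""
    # True for every row whose values in `cols` already appeared in an earlier row.
    repeated = df.duplicated(subset=cols, keep="first")
    styles = pd.DataFrame("", index=df.index, columns=df.columns)
    styles.loc[repeated, cols] = css
    return styles


def hide_repeats(df, cols):
    """Return a Styler in which repeated cells of `cols` are made invisible."""
    # With axis=None, Styler.apply hands the whole frame to the function and
    # expects a frame of CSS strings with identical labels in return.
    return df.style.apply(lambda frame: duplicate_css(frame, cols), axis=None)
```

In a notebook, hide_repeats(df, ["Name", "Age"]) should display like the answer's output, and on pandas 1.3 or later calling .to_html() on the result should return the markup as a string. The hidden values are still present in that HTML; only their rendering is suppressed.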
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Laurent |


