Python plotly treemap / sunburst with too many items
Sometimes you have a lot of data at the same level and cannot display all of it. One solution is to do what is common in pie charts: show only the items that make up the top 75-90% of the total, plus one sector the size of the remainder, labeled 'Other'.
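The pie-chart style collapse described above can be sketched with pandas as follows. This is only an illustration under stated assumptions: the function name `collapse_small_items`, the column names, and the sample values are hypothetical and do not come from the question.

```python
import pandas as pd


def collapse_small_items(df, label_column, value_column, keep_share=0.9):
    """Keep the largest rows covering roughly `keep_share` of the total
    and sum everything else into a single 'Other' row."""
    ordered = df.sort_values(value_column, ascending=False).reset_index(drop=True)
    # Share of the grand total accumulated *before* each row is added.
    share_before = (
        ordered[value_column].cumsum().shift(fill_value=0)
        / ordered[value_column].sum()
    )
    keep = share_before < keep_share
    top = ordered.loc[keep, [label_column, value_column]]
    rest_total = ordered.loc[~keep, value_column].sum()
    if rest_total > 0:
        other = pd.DataFrame({label_column: ["Other"], value_column: [rest_total]})
        top = pd.concat([top, other], ignore_index=True)
    return top


# Hypothetical sample data, not taken from the question.
sample = pd.DataFrame({
    "symbol": ["A", "B", "C", "D", "E"],
    "value": [50, 30, 12, 5, 3],
})
collapsed = collapse_small_items(sample, "symbol", "value", keep_share=0.9)
print(collapsed)
```

The drawback of this flat version is that the individual small items are lost, which is what motivates the question below.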
Could we take advantage of the tree representation so that the "Other" portion is clickable, expandable, and explorable?
The idea is to have, in the bottom right of the following sample diagram, one big box labeled "Rest" or "Other" (or similar) instead of many little boxes.
The following graph was created using a DataFrame and Plotly Express:

It should end up looking something like this (image made with GIMP):

Solution 1:[1]
I wrote a function that builds the data to pass to go.Treemap, based on https://plotly.com/python/treemaps/#treemap-chart-with-a-continuous-colorscale
In my implementation, "Other" is not a leaf but a container as well, so the small items stay reachable inside it.
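To make the "container, not leaf" idea concrete, here is a small hand-written node table. The network and symbol names (`net1`, `BIG`, `s1`, `s2`) and the values are hypothetical. Branch nodes carry a value of 0, which, as far as I understand plotly's default `branchvalues='remainder'` setting, means a branch is sized by the sum of its descendants.

```python
# Hypothetical node table: 'Others' is a branch that holds the small items.
nodes = [
    # (id, label, parent, value)
    ("total", "All Networks", "", 0),
    ("total/net1", "net1", "total", 0),
    ("total/net1/BIG", "BIG", "total/net1", 90),
    ("total/net1/other", "Others", "total/net1", 0),
    ("total/net1/other/s1", "s1", "total/net1/other", 6),
    ("total/net1/other/s2", "s2", "total/net1/other", 4),
]

# Split the rows into the four parallel lists that a treemap trace expects.
ids, labels, parents, values = (list(col) for col in zip(*nodes))


def subtree_total(node_id):
    """Sum the value of a node and of all its descendants."""
    own = values[ids.index(node_id)]
    children = [i for i, p in zip(ids, parents) if p == node_id]
    return own + sum(subtree_total(child) for child in children)


print(subtree_total("total/net1/other"))  # size of the 'Others' box
print(subtree_total("total"))             # size of the whole chart
```

The four lists can be passed as `ids`, `labels`, `parents` and `values` to `go.Treemap`; clicking the "Others" box then zooms into its children.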
import pandas as pd


def _build_hierarchical_dataframe(df, levels, value_column):
    """Build the id/label/parent/value table to pass to go.Treemap."""

    def _value(df, k=1):
        # Sum of value_column grouped by the first k levels
        return df.groupby(levels[:k])[value_column].sum()

    def _build_id(dfg, i):
        # A node id is its path from the root: 'total/<level 1>/<level 2>'
        if i > 0:
            return 'total/' + dfg[levels[:i]].agg('/'.join, axis=1)
        else:
            return 'total'

    # _top_rest_daos is not shown in this answer; it is expected to return
    # two DataFrames: the rows shown individually and the remaining rows.
    top, rest = _top_rest_daos(df, 0.15)

    df_trees = []
    for i, level in enumerate(levels):
        df_tree = pd.DataFrame(columns=['id', 'label', 'parent', 'value'])
        dfg = _value(top, i+1).reset_index()
        df_tree['label'] = dfg[level].copy()
        df_tree['parent'] = _build_id(dfg, i)
        df_tree['id'] = _build_id(dfg, i+1)
        # Only leaves carry a value; branches are left at 0
        if i == len(levels) - 1:
            df_tree['value'] = dfg[value_column]
        else:
            df_tree['value'] = 0
        df_trees.append(df_tree)

    # Now to append the 'Rest' value (symbols parents).
    # The column names below assume levels == ['network', 'symbol'].
    df_tree = pd.DataFrame(columns=['id', 'label', 'parent', 'value'])
    dfg = rest.groupby(['network'])[value_column].sum().reset_index()
    df_tree['parent'] = _build_id(dfg, 1)
    df_tree['id'] = df_tree['parent'] + '/other'
    df_tree['label'] = 'Others'
    df_tree['value'] = 0
    df_trees.append(df_tree)

    # Now to append the SYMBOLS into the 'Rest'
    df_tree = pd.DataFrame(columns=['id', 'label', 'parent', 'value', 'color', 'customData'])
    dfg = rest.groupby(['network', 'symbol'])[value_column].sum().reset_index()
    df_tree['parent'] = _build_id(dfg, 1) + '/other'
    df_tree['id'] = df_tree['parent'] + '/' + dfg['symbol']
    df_tree['label'] = dfg['symbol']
    df_tree['value'] = dfg[value_column]
    df_trees.append(df_tree)

    # Root node
    df_trees.append(pd.DataFrame({
        'id': ['total'],
        'label': ['All Networks'],
        'parent': [''],
        'value': [0]
    }))

    return pd.concat(df_trees, ignore_index=True, axis=0)
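The helper `_top_rest_daos` called in the function is not included in the answer. A possible stand-in is sketched below. It is an assumption, not the author's code: the name `split_top_rest`, its parameters, and the reading of `0.15` as "minimum share of the group's total" are all guesses, and the sample data is hypothetical.

```python
import pandas as pd


def split_top_rest(df, group_column, value_column, min_share=0.15):
    """Split rows into (top, rest) by each row's share of its group's total.

    Assumption: a row is 'top' when it holds at least `min_share` of the
    summed value of its group; the original helper may work differently.
    """
    group_total = df.groupby(group_column)[value_column].transform("sum")
    share = df[value_column] / group_total
    is_top = share >= min_share
    return df[is_top], df[~is_top]


# Hypothetical sample data
sample = pd.DataFrame({
    "network": ["n1", "n1", "n1", "n2", "n2"],
    "symbol": ["A", "B", "C", "D", "E"],
    "value": [80, 16, 4, 95, 5],
})
top, rest = split_top_rest(sample, "network", "value", min_share=0.15)
print(top["symbol"].tolist(), rest["symbol"].tolist())
```

Note that a network whose rows all fall below the threshold would have no entry in `top`, so its "Others" node would point to a parent that was never created; that case would need extra handling.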
Example usage:
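import plotly.graph_objects as go

# df_values, group_path and value_col are defined elsewhere (not shown)
hdf = _build_hierarchical_dataframe(df_values.reset_index(), group_path, value_col)
fig = go.Figure(go.Treemap(
    ids=hdf['id'],
    labels=hdf['label'],
    parents=hdf['parent'],
    values=hdf['value']
))
fig.show()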
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | David Davó |

