Creating Pandas DataFrame from the data points selected on the Plotly Scatter Plot
Plotly's FigureWidget lets me create an interactive scatter plot, i.e., I can select data points on the scatter plot and, based on the selection, my table widget shows the corresponding records. I would like help converting this table to a pandas DataFrame.
import plotly.graph_objs as go
import plotly.offline as py
import pandas as pd
import numpy as np
from ipywidgets import interactive, HBox, VBox
py.init_notebook_mode()
df = pd.read_csv('https://raw.githubusercontent.com/jonmmease/plotly_ipywidget_notebooks/master/notebooks/data/cars/cars.csv')
f = go.FigureWidget([go.Scatter(y = df['City mpg'], x = df['City mpg'], mode = 'markers')])
scatter = f.data[0]
N = len(df)
scatter.x = scatter.x + np.random.rand(N)/10 *(df['City mpg'].max() - df['City mpg'].min())
scatter.y = scatter.y + np.random.rand(N)/10 *(df['City mpg'].max() - df['City mpg'].min())
scatter.marker.opacity = 0.5
def update_axes(xaxis, yaxis):
    scatter = f.data[0]
    scatter.x = df[xaxis]
    scatter.y = df[yaxis]
    with f.batch_update():
        f.layout.xaxis.title = xaxis
        f.layout.yaxis.title = yaxis
        scatter.x = scatter.x + np.random.rand(N)/10 *(df[xaxis].max() - df[xaxis].min())
        scatter.y = scatter.y + np.random.rand(N)/10 *(df[yaxis].max() - df[yaxis].min())
axis_dropdowns = interactive(update_axes, yaxis = df.select_dtypes('int64').columns, xaxis = df.select_dtypes('int64').columns)
# Create a table FigureWidget that updates on selection from points in the scatter plot of f
t = go.FigureWidget([go.Table(
    header=dict(values=['ID','Classification','Driveline','Hybrid'],
                fill = dict(color='#C2D4FF'),
                align = ['left'] * 5),
    cells=dict(values=[df[col] for col in ['ID','Classification','Driveline','Hybrid']],
               fill = dict(color='#F5F8FF'),
               align = ['left'] * 5))])
def selection_fn(trace,points,selector):
    t.data[0].cells.values = [df.loc[points.point_inds][col] for col in ['ID','Classification','Driveline','Hybrid']]
scatter.on_selection(selection_fn)
# Put everything together
VBox((HBox(axis_dropdowns.children),f,t))
I just want the table that is created after selecting points on the scatter plot to be converted to a pandas DataFrame.
Solution 1:[1]
Probably not the most elegant way to solve it, but after you select your points, you can type:
d = t.to_dict()
# Use a new name so that the original df, which selection_fn still reads, is not overwritten
selected_df = pd.DataFrame(d['data'][0]['cells']['values'], index=d['data'][0]['header']['values']).T
Here t is of type plotly.graph_objs._figurewidget.FigureWidget.
I use Jupyter Notebook, so I ran these lines in a cell below your code, and I got a new DataFrame, selected_df, containing the selected records.
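To make the transpose step in Solution 1 easier to follow, here is a minimal sketch that does not need plotly. The dictionary below is a hand-written stand-in for what t.to_dict() is assumed to return after a selection (one list of cell values per column, plus the header labels); the car names and values are made up for illustration.

```python
import pandas as pd

# Hand-written stand-in for t.to_dict() after a selection. The shape is an
# assumption: one list of cell values per table column, plus header labels.
d = {
    "data": [
        {
            "header": {"values": ["ID", "Classification", "Driveline", "Hybrid"]},
            "cells": {
                "values": [
                    ["car_a", "car_b"],                        # ID column
                    ["Automatic", "Manual"],                   # Classification column
                    ["All-wheel drive", "Front-wheel drive"],  # Driveline column
                    [False, True],                             # Hybrid column
                ]
            },
        }
    ]
}

table = d["data"][0]

# Each inner list is a column, so the constructor yields one row per column
# (labelled by the header); transposing gives one row per selected point.
selected_df = pd.DataFrame(
    table["cells"]["values"], index=table["header"]["values"]
).T

print(selected_df.shape)          # (2, 4): two selected rows, four columns
print(list(selected_df.columns))  # ['ID', 'Classification', 'Driveline', 'Hybrid']
```

After the transpose every column has dtype object, so if numeric or boolean dtypes matter, something like selected_df.infer_objects() may be worth trying.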
Solution 2:[2]
Assuming the following piece of code highlights the points you care about:
def selection_fn(trace,points,selector):
    t.data[0].cells.values = [df.loc[points.point_inds][col] for col in ['ID','Classification','Driveline','Hybrid']]
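Change it to return a DataFrame:
def selection_fn(trace,points,selector):
    # each Series in the list becomes a row, so transpose to get one column per field
    return pd.DataFrame([df.loc[points.point_inds][col] for col in ['ID','Classification','Driveline','Hybrid'] if col in {selection}]).T
Here {selection} is a placeholder for the set of columns you want to keep; change the list comprehension so that it loops over only the data you want to return. Keep in mind that a callback registered with on_selection is called by plotly, which does not hand its return value back to your code, so the DataFrame also has to be stored somewhere you can reach it. Example list comprehension from the documentation: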
[(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
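As a side note on picking out the selected rows: the values in points.point_inds are positions of points within the trace, so df.loc only matches them when the DataFrame has the default 0..N-1 index, as it does right after read_csv. A small sketch of the positional alternative, using a made-up stand-in for the cars data and hypothetical selected positions:

```python
import pandas as pd

# Made-up stand-in for the cars data; the index is deliberately not 0..N-1.
cars = pd.DataFrame(
    {
        "ID": ["car_a", "car_b", "car_c", "car_d"],
        "Classification": ["Automatic", "Manual", "Automatic", "Manual"],
        "City mpg": [18, 22, 25, 30],
    },
    index=[10, 11, 12, 13],
)

point_inds = [1, 3]                  # hypothetical positions of the selected points
columns = ["ID", "Classification"]   # columns to keep in the result

# iloc selects by position, so it works regardless of the index labels.
selected = cars.iloc[point_inds][columns]

print(selected)
```

With the default index the two approaches give the same rows; iloc only matters once the DataFrame has been filtered, sorted, or re-indexed before plotting.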
Solution 3:[3]
A better solution:
def selection_fn(trace, points, selector):
    t.data[0].cells.values = [
        df.loc[points.point_inds][col]
        for col in ["ID", "Classification", "Driveline", "Hybrid"]]
    selection_fn.df1 = df.loc[points.point_inds]
    print(selection_fn.df1)
This stores the selected rows as an attribute of the function, selection_fn.df1, which lets you access a function variable outside the function without using "global".
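The function-attribute pattern from Solution 3 can be tried without plotly. In this sketch the callback is invoked by hand with a SimpleNamespace standing in for the points object that plotly would pass; the data and the selected positions are made up for illustration.

```python
from types import SimpleNamespace

import pandas as pd

# Made-up stand-in for the cars data.
cars = pd.DataFrame(
    {
        "ID": ["car_a", "car_b", "car_c"],
        "Hybrid": [False, True, False],
    }
)


def selection_fn(trace, points, selector):
    # Store the selected rows as an attribute on the function object itself,
    # so they stay reachable after the callback has finished.
    selection_fn.df1 = cars.iloc[points.point_inds]


# Before any selection the attribute does not exist yet, so read it defensively.
print(getattr(selection_fn, "df1", None))  # None

# Simulate the call plotly would make (hypothetical selected positions).
fake_points = SimpleNamespace(point_inds=[0, 2])
selection_fn(None, fake_points, None)

print(selection_fn.df1)
```

In the notebook, selection_fn.df1 can then be read from any later cell once at least one selection has been made.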
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | user88484 |
| Solution 2 | |
| Solution 3 | Atmani Saad |
