Using a parallel coordinates plot (e.g. HiPlot) with a lazily loaded big dataframe (e.g. Dask)?
I have a large dataframe that I am working with in Dask. Its size is 27,000,000 rows by 13 columns. I tried to use HiPlot on this Dask dataframe as follows:
hip.Experiment.from_dataframe(a).display()
But it raises the error: AttributeError: 'DataFrame' object has no attribute 'to_dict'. When I copy part of it into a pandas dataframe (say 200,000 rows), it works fine. I guess this is because a Dask DataFrame differs from a pandas one: it is lazily evaluated and does not expose the same methods. Is there a way around this?
Thank you very much for your answer
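One possible workaround, sketched here under assumptions rather than offered as a confirmed fix: since the pandas copy of about 200,000 rows works, the Dask dataframe can be randomly downsampled and materialized as pandas before it is passed to HiPlot. The helper names sample_fraction and plot_sample and the default target of 200,000 rows are hypothetical choices for illustration; `a` is the Dask dataframe from the question.

```python
def sample_fraction(total_rows: int, target_rows: int) -> float:
    """Fraction of rows to keep so that roughly target_rows survive."""
    if total_rows <= 0 or target_rows <= 0:
        raise ValueError("row counts must be positive")
    return min(1.0, target_rows / total_rows)


def plot_sample(ddf, target_rows: int = 200_000, seed: int = 0):
    """Downsample a Dask dataframe to pandas and wrap it in a HiPlot experiment.

    Hypothetical helper: target_rows and seed are illustrative defaults.
    """
    # Imported here so sample_fraction stays usable without HiPlot installed.
    import hiplot as hip

    total_rows = len(ddf)  # forces a pass over the data to count rows
    frac = sample_fraction(total_rows, target_rows)
    # Dask's sample() accepts a fraction, not a row count;
    # compute() turns the lazy result into a regular pandas DataFrame.
    pdf = ddf.sample(frac=frac, random_state=seed).compute()
    return hip.Experiment.from_dataframe(pdf)
```

In a notebook this would be called as plot_sample(a).display(). The sample is approximate in size, and a random subset may hide rare patterns, so it is worth checking the plot against more than one seed.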
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
