Filter/join on a larger-than-memory Dask dataframe and obtaining results
I am trying to read multiple parquet files (approximately 100 GB in total) from disk using Dask on a machine with 16 GB of RAM, and then perform filters or joins whose results are themselves larger than memory.
What is the way to obtain the results without raising a memory error? Calling `dask.dataframe.DataFrame.compute()` loads the entire result into memory, which is not desirable here.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
