Use dask for an out-of-core conversion of itertools.product into a numpy/dask array (create a matrix of every permutation with repetition)
I want to create a matrix (a numpy array of numpy arrays, i.e. a 2D array) of every permutation with repetition, which I will use for matrix multiplication later on. Currently I first create a list of lists, pass it to itertools.product, and then convert the result to a numpy array of numpy arrays. However, as R (the length of each permutation) increases, the size of the numpy array grows exponentially and causes a memory error. So I want to generate the matrix in dask instead. I went through the dask tutorials but haven't worked out how to do this yet.
For example, every 5-number combination of the numbers from -1 to 1 (inclusive) with a step size of 0.1 (R = 5, N = 21):
import itertools
import numpy as np

# Create 5 lists, each with 21 elements
lst = []
for i in range(0, 5):
    lst.append(np.linspace(-1, 1, 21).tolist())
# Convert to a list of tuples; each tuple is a permutation, e.g. -1,-1,-1,-1,-1 or -1,-1,-1,-1,-0.9
lst = list(itertools.product(*lst))
# Convert to a numpy array of numpy arrays for matrix multiplication later on
mat = np.array(lst)
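The memory pressure in the snippet above comes largely from `list(itertools.product(...))`, which materializes every tuple before numpy sees any of them. A minimal sketch of one way around that, without dask, is to consume the iterator in fixed-size blocks. The names `product_chunks`, `rows_per_chunk`, and `weights` are hypothetical and only serve the illustration.

```python
import itertools
import numpy as np

def product_chunks(values, r, rows_per_chunk=100_000):
    """Yield the rows of itertools.product(values, repeat=r) as 2D arrays."""
    rows = itertools.product(values, repeat=r)  # lazy iterator, nothing stored yet
    while True:
        # Pull at most rows_per_chunk tuples from the iterator
        chunk = list(itertools.islice(rows, rows_per_chunk))
        if not chunk:
            break
        yield np.array(chunk)

# Usage: multiply each block by a (hypothetical) weight vector,
# keeping only the per-block results instead of the full matrix.
values = np.linspace(-1, 1, 21)
weights = np.ones(3)
results = [block @ weights
           for block in product_chunks(values, r=3, rows_per_chunk=1000)]
total_rows = sum(len(part) for part in results)
```

Only one block lives in memory at a time, so peak usage is set by `rows_per_chunk` rather than by N^R. The list of results still holds one value per row, so for large R the per-block output would need to be reduced further (for example to a running maximum or sum).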
Creating permutations of length 5 is already the maximum my laptop can handle with N = 21; I get a memory error as soon as I try a length of 6.
I've tried creating a function and using dask.delayed together with a list comprehension, and also dask.array.from_array(), but I am still really new to dask and haven't found a solution yet.
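For reference, one possible shape of the `dask.delayed` approach is sketched below. It assumes each block of rows can be computed directly from its row index: row `i` of the product is the number `i` written in base N, with each digit used as an index into the value list. The helpers `product_block` and `lazy_product` and the parameter `rows_per_chunk` are hypothetical names, and the index arithmetic is only valid while N^R fits in a 64-bit integer.

```python
import itertools
import numpy as np
import dask
import dask.array as da

def product_block(values, r, start, stop):
    """Rows start..stop-1 of itertools.product(values, repeat=r) as an array."""
    values = np.asarray(values)
    n = len(values)
    idx = np.arange(start, stop, dtype=np.int64)
    # Exponents r-1 ... 0: the last column varies fastest, as in itertools.product
    exponents = np.arange(r - 1, -1, -1, dtype=np.int64)
    digits = (idx[:, None] // n ** exponents) % n
    return values[digits]

def lazy_product(values, r, rows_per_chunk=1_000_000):
    """Build a lazy dask array of shape (n**r, r), one delayed task per chunk."""
    total = len(values) ** r
    blocks = []
    for start in range(0, total, rows_per_chunk):
        stop = min(start + rows_per_chunk, total)
        block = dask.delayed(product_block)(values, r, start, stop)
        blocks.append(da.from_delayed(block, shape=(stop - start, r), dtype=float))
    return da.concatenate(blocks, axis=0)

# Usage: nothing is computed until .compute() is called
values = np.linspace(-1, 1, 21)
mat = lazy_product(values, r=3, rows_per_chunk=2000)
row_sums = (mat @ np.ones(3)).compute()
```

Because every block is computed independently from its index range, dask can evaluate blocks in parallel and discard them after the downstream reduction. The task graph still contains one task per chunk, so this sketch is unlikely to stay practical once N^R reaches many billions of rows.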
Ideally I would be able to increase the length of the permutations (R) from 5 to somewhere around 10-20 (using the same N = 21 or decreasing it all the way to N = 5); anything above that would be awesome to have but is not necessary.
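To put those targets in perspective, the dense matrix has N^R rows and R columns. A quick back-of-the-envelope sketch, assuming 8-byte float64 entries (the helper name `dense_size_bytes` is hypothetical): N = 21 with R = 5 gives 4,084,101 rows and roughly 163 MB, while R = 6 gives 85,766,121 rows and roughly 4.1 GB, before counting the much larger intermediate Python list of tuples.

```python
def dense_size_bytes(n, r, itemsize=8):
    """Bytes needed for an (n**r, r) array, assuming float64 entries."""
    return n ** r * r * itemsize

# Print the row count and dense size for a few (N, R) pairs from the question
for n, r in [(21, 5), (21, 6), (5, 10), (21, 10), (5, 20)]:
    gigabytes = dense_size_bytes(n, r) / 1e9
    print(f"N={n:2d} R={r:2d} rows={n ** r:.3e} dense size={gigabytes:.3g} GB")
```

The larger combinations come out far beyond what fits on a laptop disk, so for those cases the rows would probably have to be generated and reduced block by block rather than stored at all.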
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
