What is the most efficient way of changing Cartesian topology to one that supports MKL Cluster FFT?

I have a 3-dimensional array of size = [Nx, Ny, Nz] currently distributed among nprocs = nprocs_y * nprocs_z processes as subarrays of local_size = [Nx, Ny/nprocs_y, Nz/nprocs_z] with the data stored in column-major (Fortran) order.

I wish to Fourier transform this data concurrently. However, according to Intel's documentation on MKL Cluster FFT, the distribution of data has to be such that local_size_new = [Nx, Ny, Nz/nprocs]. The documentation does not seem to suggest that the cluster FFT technology can work with arbitrary topologies.

This forces me to attempt a redistribution of data according to the topology supported by the cluster FFT functions provided by Intel. Could you please suggest some ideas as to how this could be done most efficiently? Thank you.

c mpi fft intel-mkl

Solution 1:^[1]

The order of the FFT dimensions is the same as the order of the array dimensions in the programming language. For example, a 3-dimensional FFT with Lengths=(m,n,l) can be computed over an array Ar[m][n][l]. You can redistribute the data across the processes as your task requires. For details, see Intel's documentation on distributing data among processes: https://www.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/fourier-transform-functions/cluster-fft-functions/distributing-data-among-processes.html
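As a rough illustration of what such a redistribution involves, the sketch below computes how many elements each process would have to send to every other process when going from the [Nx, Ny/nprocs_y, Nz/nprocs_z] blocks in the question to slabs of [Nx, Ny, Nz/nprocs]. It is not taken from the answer or from Intel's documentation. The helper names (`block_range`, `overlap`, `slab_send_counts`) are hypothetical, and the sketch assumes an even block split along each axis and a rank numbering of `rank = py * Pz + pz`. In a real program the slab boundaries should be queried from the Cluster FFT descriptor (for example via `DftiGetValueDM` with `CDFT_LOCAL_NX` and `CDFT_LOCAL_X_START`) rather than assumed.

```c
#include <assert.h>

/* Half-open index range [lo, hi). */
typedef struct { int lo, hi; } range_t;

/* Assumed even block split: part i of p gets [i*n/p, (i+1)*n/p). */
static range_t block_range(int n, int p, int i)
{
    range_t r;
    r.lo = (int)((long long)i * n / p);
    r.hi = (int)((long long)(i + 1) * n / p);
    return r;
}

/* Number of indices shared by two half-open ranges. */
static int overlap(range_t a, range_t b)
{
    int lo = a.lo > b.lo ? a.lo : b.lo;
    int hi = a.hi < b.hi ? a.hi : b.hi;
    return hi > lo ? hi - lo : 0;
}

/*
 * Fill sendcounts[dst] (length Py*Pz) with the number of elements that
 * `rank` sends to `dst` when moving from the block layout
 * [Nx, Ny/Py, Nz/Pz] to the slab layout [Nx, Ny, Nz/(Py*Pz)].
 * Assumption: rank = py * Pz + pz (hypothetical numbering).
 */
static void slab_send_counts(int Nx, int Ny, int Nz, int Py, int Pz,
                             int rank, int *sendcounts)
{
    assert(Py > 0 && Pz > 0 && rank >= 0 && rank < Py * Pz);

    int P  = Py * Pz;
    int py = rank / Pz;
    int pz = rank % Pz;

    range_t my_y = block_range(Ny, Py, py);  /* y-range owned now */
    range_t my_z = block_range(Nz, Pz, pz);  /* z-range owned now */

    for (int dst = 0; dst < P; ++dst) {
        /* z-planes that dst owns in the slab layout */
        range_t dst_z = block_range(Nz, P, dst);
        sendcounts[dst] = Nx * (my_y.hi - my_y.lo) * overlap(my_z, dst_z);
    }
}
```

Counts like these could feed an `MPI_Alltoallv` call after packing each outgoing piece into a contiguous buffer. Because each piece covers only part of the y-extent of the receiver's slab, it is not contiguous on the receiving side, so the receiver either unpacks by hand or describes the pieces with `MPI_Type_create_subarray` and exchanges them with `MPI_Alltoallw`. Which variant is faster depends on the MPI implementation and the problem size.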

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Shanmukh-Intel
