Mixed AMD and Intel nodes in a cluster... considerations?
I am setting up a small, 256-core compute cluster at my university for fluid dynamics simulations. The code we use is written in a mix of C and Fortran and currently runs on a large supercomputer just fine.
In our cluster development, we have 16 compute nodes with 16 AMD CPUs each. We also have an 8-core Dell box that we would like to use as a "head" or "login" node. This box, however, is an Intel Xeon.
We would like to NFS mount the home directory of each user to the login node and restrict their access to the compute nodes. This would require the users to compile and run their programs via mpirun on the login node. Our questions are:
- Is this possible with a mixed CPU system like this? Or would we run into problems with compiling on Intel and executing on AMD?
- If this is a problem, is there a workaround? Could we somehow have the user transparently compile their code on a compute node while only logged into a login node?
- In a cluster with a head node, should only the home directory be shared via NFS mount? Or are there other directories which we should also share between compute and head node(s)?
If there's a good resource out there that could help, we'd appreciate that, too. We've found so many suggestions and ideas on various pages... It'd be nice to be pointed towards one that the community considers reputable. (Disclaimer... we aren't computer scientists, we are just regular scientists.)
Solution 1:[1]
I had the same question. Come to think of it, though, heterogeneity is the norm: a GPU, for example, is a different processor architecture from a CPU. When cross-compiling a program, the exact target architecture has to be specified, and the compiler then generates a binary for exactly that architecture.
When compiling for a GPU, I have seen compiler flags that list the target architectures explicitly.
For example:
/usr/local/cuda/bin/nvcc -ccbin /opt/anaconda3/bin/x86_64-conda_cos6-linux-gnu-gcc -I../../../Common -m64 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o deviceQuery.o -c deviceQuery.cpp
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Jeremy Caney |
