Array operations in parallel

This is the problem: I have two arrays of the same dimensions, A and B, and I am updating every element of array A using array B. For simplicity, let us take "updating" every element of A using B to mean just copying the data from B to A.
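To make the setup concrete, here is a minimal serial sketch of what I mean by "update". The function name `update_array`, the `float` element type, and the use of `restrict` (i.e. that A and B never overlap) are my own assumptions for illustration, not requirements of the problem.

```c
#include <stddef.h>

/* Hypothetical helper: "update" A from B, here just an elementwise copy.
 * restrict encodes the assumption that a and b do not overlap in memory. */
void update_array(float *restrict a, const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        a[i] = b[i];  /* no iteration depends on any other iteration */
    }
}
```

Since no iteration reads anything written by another iteration, the loop body could in principle run in any order, which is why I am asking about parallelism.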

What kind of parallelism should we use here? Vectorization (SIMD)? Thread-level parallelism?
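As a rough sketch of the two levels I have in mind on a CPU, written with OpenMP directives: the function names are hypothetical, `float` elements and non-overlapping arrays are assumed, and I am not claiming either variant is the right choice. The code needs an OpenMP-enabled build (for example `-fopenmp` with GCC or Clang); without it the pragmas are ignored and both functions run as plain serial loops.

```c
#include <stddef.h>

/* Vectorization only: ask the compiler to use SIMD lanes on one thread. */
void update_simd(float *restrict a, const float *restrict b, size_t n)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        a[i] = b[i];
    }
}

/* Thread level plus vectorization: split the index range across threads,
 * then vectorize each thread's chunk. */
void update_threads_simd(float *restrict a, const float *restrict b, size_t n)
{
    #pragma omp parallel for simd
    for (size_t i = 0; i < n; ++i) {
        a[i] = b[i];
    }
}
```

Both variants produce the same result as the serial loop; they differ only in how the independent iterations are mapped onto the hardware.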

Based on the level of parallelism, should we use a CPU or GPU?

Is code written for a CPU portable to a GPU, or vice versa? Suppose I use OpenMP or Boost: would that work on GPUs?

Would I need to tailor the arrays and variables to exploit parallelism, and if so, how can I do this?

I know these are difficult questions. Links and references to longer texts are welcome, and I will go through all of them. Thank you. Let us assume we are using a compiled language such as C. Suppose also that we have the freedom to build our own machine, so we can choose a CPU or a GPU as we please.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
