OpenMP: do I have false sharing or a race condition?
I'm learning OpenMP and working on compressed sparse row (CSR) matrix multiplication with element type std::complex<int>. I get a different execution time each time I run the following function:

    #include <complex>
    #include <vector>

    typedef std::vector<std::vector<std::complex<int>>> matrix;

    struct CSR {
        std::vector<std::complex<int>> values;  // non-zero values
        std::vector<int> row_ptr;               // pointers of rows
        std::vector<int> cols_index;            // indices of columns
        int rows;                               // number of rows
        int cols;                               // number of columns
        int NNZ;                                // number of non-zero elements
    };

    // sparse_transpose() is defined elsewhere (not shown).
    const matrix multiply_omp(const CSR& A, const CSR& B) {
        if (A.cols != B.rows)
            throw "Error";
        CSR B_t = sparse_transpose(B);
        // The product is A.rows x B.cols; note that B_t.rows == B.cols.
        matrix result(A.rows, std::vector<std::complex<int>>(B.cols, 0));
        #pragma omp parallel
        {
            #pragma omp for
            for (int i = 0; i < A.rows; i++) {
                for (int j = A.row_ptr[i]; j < A.row_ptr[i + 1]; j++) {
                    int Ai = A.cols_index[j];
                    std::complex<int> Avalue = A.values[j];
                    for (int k = 0; k < B_t.rows; k++) {
                        std::complex<int> sum(0, 0);
                        for (int l = B_t.row_ptr[k]; l < B_t.row_ptr[k + 1]; l++)
                            if (Ai == B_t.cols_index[l]) {
                                sum += Avalue * B_t.values[l];
                                break;
                            }
                        if (sum != std::complex<int>(0, 0)) {
                            result[i][k] += sum;
                        }
                    }
                }
            }
        }
        return result;
    }

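To make the CSR layout used above concrete, here is a minimal sketch that builds the three arrays from a small dense matrix. The helper `dense_to_csr` and the alias `cplx` are hypothetical names introduced only for illustration; they are not part of the original code.

```cpp
#include <cassert>
#include <complex>
#include <vector>

using cplx = std::complex<int>;
using matrix = std::vector<std::vector<cplx>>;

// Same fields as the CSR struct in the question.
struct CSR {
    std::vector<cplx> values;     // non-zero values, stored row by row
    std::vector<int> row_ptr;     // row i occupies [row_ptr[i], row_ptr[i + 1])
    std::vector<int> cols_index;  // column index of each stored value
    int rows = 0;
    int cols = 0;
    int NNZ = 0;
};

// Hypothetical helper (an assumption, not from the post): dense -> CSR.
CSR dense_to_csr(const matrix& dense) {
    CSR out;
    out.rows = static_cast<int>(dense.size());
    out.cols = dense.empty() ? 0 : static_cast<int>(dense[0].size());
    out.row_ptr.push_back(0);
    for (const auto& row : dense) {
        assert(static_cast<int>(row.size()) == out.cols);  // rectangular input
        for (int c = 0; c < out.cols; ++c) {
            if (row[c] != cplx(0, 0)) {
                out.values.push_back(row[c]);
                out.cols_index.push_back(c);
            }
        }
        // After each row, record where the next row starts in `values`.
        out.row_ptr.push_back(static_cast<int>(out.values.size()));
    }
    out.NNZ = static_cast<int>(out.values.size());
    return out;
}
```

An empty row shows up as two equal consecutive entries in `row_ptr`, which is why the inner loop `for (j = row_ptr[i]; j < row_ptr[i + 1]; j++)` simply does nothing for such a row.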
I called the function 10 times in a loop on 1000×1000 matrices and timed each call with omp_get_wtime(). Here are the results:
iteration 1 : 0.751642 s
iteration 2 : 0.911264 s
iteration 3 : 1.553695 s
iteration 4 : 0.761839 s
iteration 5 : 0.603688 s
iteration 6 : 0.423919 s
iteration 7 : 0.423114 s
iteration 8 : 0.445878 s
iteration 9 : 0.892305 s
iteration 10 : 0.918682 s
Is that normal, or do I have false sharing or a race condition?
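The measurement described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `wall_seconds`, `time_runs` and `median` are hypothetical helper names, the untimed warm-up call is an addition that the original loop may not have had, and the `std::chrono` fallback exists only so the sketch also builds without OpenMP enabled.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Wall-clock time in seconds: omp_get_wtime() when OpenMP is enabled,
// otherwise std::chrono::steady_clock (assumed fallback).
inline double wall_seconds() {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    using clock = std::chrono::steady_clock;
    return std::chrono::duration<double>(clock::now().time_since_epoch()).count();
#endif
}

// Hypothetical helper: one untimed warm-up call, then `repeats` timed calls.
std::vector<double> time_runs(const std::function<void()>& work, int repeats) {
    assert(repeats > 0);
    work();  // warm-up, so first-call costs are not part of the samples
    std::vector<double> seconds;
    seconds.reserve(repeats);
    for (int r = 0; r < repeats; ++r) {
        const double start = wall_seconds();
        work();
        seconds.push_back(wall_seconds() - start);
    }
    return seconds;
}

// Median of the samples; less sensitive to a single slow run than the mean.
double median(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    const std::size_t n = samples.size();
    return n % 2 ? samples[n / 2] : 0.5 * (samples[n / 2 - 1] + samples[n / 2]);
}
```

With the function from the question, a call might look like `time_runs([&] { multiply_omp(A, B); }, 10)`. Reporting the minimum and the median of the returned samples, rather than each run on its own, may make it easier to tell ordinary run-to-run noise apart from a real problem in the parallel loop.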
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
