MPI and memory leaks and MPI_Wait() with async send and recv
I am new to MPI programming and I am trying to create a program that performs two-way communication between processes in a ring.
I was getting memory leak errors at the MPI_Finalize() statement. Later I found out that I could use the -fsanitize=address -fno-omit-frame-pointer flags to help me track down where the leaks could be.
Now I get a very bizarre (at least for me) error.
Here's my code:
MPI_Request request_s1, request_s2, request_r1, request_r2;
// receiving 2 elems from the left neighbor, which I shall be needing
if (0 > MPI_Irecv(lefties, EXTENT, MPI_DOUBLE, my_left, 1, MPI_COMM_WORLD, &request_r1)) {
return 2;
}
// receiving 2 elems from my right neighbor, which I will be appending at the end of my input
if (0 > MPI_Irecv(righties, EXTENT, MPI_DOUBLE, my_right, 1, MPI_COMM_WORLD, &request_r2)) {
return 2;
}
// sending the first 2 elems which will be required by the left neighbor
if (0 > MPI_Isend(my_output_buffer, EXTENT, MPI_DOUBLE, my_left, 1, MPI_COMM_WORLD, &request_s1)) {
return 2;
}
// sending the last 2 elems to my right neighbor
if (0 > MPI_Isend(&my_output_buffer[displacement - EXTENT], EXTENT, MPI_DOUBLE, my_right, 1, MPI_COMM_WORLD, &request_s2)) {
return 2;
}
MPI_Wait(&request_r2, MPI_STATUS_IGNORE);
MPI_Wait(&request_r1, MPI_STATUS_IGNORE);
The error I get is
[my_machine:18353] *** An error occurred in MPI_Wait
[my_machine:18359] *** reported by process [204079105,1]
[my_machine:18359] *** on communicator MPI_COMM_WORLD
[my_machine:18359] *** MPI_ERR_TRUNCATE: message truncated
[my_machine:18359] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[my_machine:18359] *** and potentially your MPI job)
[my_machine:18353] 1 more process has sent help message help-mpi-btl-base.txt / btl:no-nics
[my_machine:18353] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
and I have no clue how to progress from here.
Solution 1:[1]
- You seem to be reusing your request variables. Don't: once a request has been created, you have to wait for it.
- It wouldn't hurt to initialize the request variables with MPI_REQUEST_NULL, in case you wind up waiting on a request that was never created.
- The 0 > MPI_whatever idiom is strange. Instead: MPI_SUCCESS != MPI_Whatever.
- But even that may not work, because by default MPI routines do not return on error; they abort the program.
- And it may be something else entirely, which I can't tell without seeing the rest of the code.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Victor Eijkhout |
