How to avoid poorly optimized code for Fortran array pointers?
I get rather poorly optimized assembly with gfortran 11.2 when using array pointers, even in seemingly very simple cases. For example, the optimized code (https://godbolt.org/z/eWo1oWzW3) for this function
subroutine test_ptr(x, y, n)
  implicit none
  integer, intent(in) :: n
  integer, target, intent(in) :: x(n)
  integer, target, intent(out) :: y(n)
  integer, dimension(:), pointer, contiguous :: x_ptr, y_ptr
  x_ptr => x
  y_ptr => y
  y_ptr = x_ptr
end subroutine
uses malloc for temporary storage in the assignment y_ptr = x_ptr, at every optimization level. Instead, I was hoping for basically the same assembly as the few lines (https://godbolt.org/z/rqsno6cjK) that are generated for the analogous version without array pointers:
subroutine test(x, y, n)
  implicit none
  integer, intent(in) :: n
  integer, intent(in) :: x(n)
  integer, intent(out) :: y(n)
  y = x
end subroutine
To me it seems like the compiler should be able to see that x_ptr and y_ptr are both trivial pointers to a range of contiguous memory, such that no temporary is necessary. But as can be seen in the original pass of the GCC Tree/RTL view (https://godbolt.org/z/Tajaezffs), a temporary is allocated and no optimization pass seems to be able to get rid of it.
A workaround I found is to call test(x_ptr, y_ptr, n) from test_ptr instead of using = directly (https://godbolt.org/z/YTx9hr9E3). That way, no malloc is generated in the original pass and the generated code is basically the same as without any array pointers.
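The workaround can be sketched as follows, reusing the two subroutines from above; only the final assignment in test_ptr is replaced by a call. This mirrors the description rather than the exact code behind the link. The call uses an implicit interface, which is permitted here because test takes plain explicit-shape arrays.

```fortran
! Plain-array version: the dummy arguments x and y are assumed not to alias.
subroutine test(x, y, n)
  implicit none
  integer, intent(in) :: n
  integer, intent(in) :: x(n)
  integer, intent(out) :: y(n)
  y = x
end subroutine

! Pointer version: delegate the copy to test instead of assigning directly.
subroutine test_ptr(x, y, n)
  implicit none
  integer, intent(in) :: n
  integer, target, intent(in) :: x(n)
  integer, target, intent(out) :: y(n)
  integer, dimension(:), pointer, contiguous :: x_ptr, y_ptr
  x_ptr => x
  y_ptr => y
  ! The pointer targets are passed as ordinary arrays of size n.
  call test(x_ptr, y_ptr, n)
end subroutine
```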
Of course this all looks pretty silly in these toy examples: why would you even use array pointers in this case? But I actually encounter pretty much this setting when having to interop with C. To work with the data from the C world in Fortran, I have to call C_F_POINTER, for which I need an array pointer. So I have functions that look a bit like this
subroutine test_c(x_cptr, y_cptr, n)
  use, intrinsic :: iso_c_binding
  implicit none
  integer(c_size_t), intent(in) :: n
  type(c_ptr), value :: x_cptr
  type(c_ptr), value :: y_cptr
  integer(c_int), dimension(:), pointer, contiguous :: x, y
  call c_f_pointer(x_cptr, x, [n])
  call c_f_pointer(y_cptr, y, [n])
  y = x
end subroutine
and the generated code is basically the same as in the test_ptr case above (https://godbolt.org/z/zK9vG3zMG). As I have to work with a lot of C data, the workaround mentioned above, writing a separate subroutine that takes plain arrays for every operation I need to perform on the data, is not really feasible for me. Therefore, I wonder whether there is a simpler way to help the compiler optimize away these array pointers.
Edit: I have since learned that this is indeed an aliasing issue combined with using the assignment operator for arrays. This comment in the gfortran source makes that pretty clear. An easy way to detect these cases (among others) is the warning option -Warray-temporaries. Unfortunately I do not see a way to convey that two array pointers never alias, short of hacking gfc_could_be_alias to always return 0 (which does remove the temporary in this case, but that really isn't a solution).
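To illustrate how the array assignment itself enters the picture: an array assignment must behave as if the whole right-hand side were evaluated before anything is stored, whereas an explicit loop copies element by element in order. The sketch below is a variant of test_c written that way. The name test_c_loop is hypothetical, the loop is only equivalent to y = x under the assumption that x and y never overlap, and whether gfortran compiles it to the same code as test is not verified here.

```fortran
subroutine test_c_loop(x_cptr, y_cptr, n)
  use, intrinsic :: iso_c_binding
  implicit none
  integer(c_size_t), intent(in) :: n
  type(c_ptr), value :: x_cptr
  type(c_ptr), value :: y_cptr
  integer(c_int), dimension(:), pointer, contiguous :: x, y
  integer(c_size_t) :: i
  call c_f_pointer(x_cptr, x, [n])
  call c_f_pointer(y_cptr, y, [n])
  ! Element-wise copy; assumes x and y refer to non-overlapping memory.
  do i = 1, n
    y(i) = x(i)
  end do
end subroutine
```

Compiling both variants with -Warray-temporaries is one way to check whether a temporary is still reported for a given formulation.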
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow