'Should the use of the INTENT keyword speed the code up?
This question is based on an answer to the post Fortran intent(inout) versus omitting intent, namely the one by user Vladimyr, @Vladimyr.
He says that "<...> Fortran copies that data into a contiguous section of memory, and passes the new address to the routine. Upon returning, the data is copied back into its original location. By specifying INTENT, the compiler can know to skip one of the copying operations."
I did not know this at all, I thought Fortran passes by reference exactly as C. The first question is, why would Fortran do so, what is the rationale behind this choice?
As a second point, I put this behaviour to the test. If I understood correctly, use of INTENT(IN) would save the time spent in copying back the data to th original location, as the compiler is sure the data has not been changed.
I tried this little piece of code
function funco(inp) result(j)
!! integer, dimension(:), intent (in) :: inp
integer, dimension(:):: inp
integer, dimension(SIZE(inp)) :: j ! output
j = 0.0 !! clear whole vector
N = size(inp)
DO i = 1, N
j(i) = inp(i)
END DO
end function
program main
implicit none
interface
function funco(inp) result(j)
!! integer, dimension(:), intent (in) :: inp
integer, dimension(:) :: inp
integer, dimension(SIZE(inp)) :: j ! output
end function
end interface
integer, dimension(3000) :: inp , j
!! integer, dimension(3000) :: funco
integer :: cr, cm , c1, c2, m
real :: rate, t1, t2
! Initialize the system_clock
CALL system_clock(count_rate=cr)
CALL system_clock(count_max=cm)
CALL CPU_TIME(t1)
rate = REAL(cr)
WRITE(*,*) "system_clock rate ",rate
inp = 2
DO m = 1,1000000
j = funco(inp) + 1
END DO
CALL SYSTEM_CLOCK(c2)
CALL CPU_TIME(t2)
WRITE(*,*) "system_clock : ",(c2 - c1)/rate
WRITE(*,*) "cpu_time : ",(t2-t1)
end program
The function copies an array, and in the main body this is repeated many times.
According to the claim above, the time spent in copying back the array should somehow show up.
system_clock rate 1000.00000
system_clock : 2068.07910
cpu_time : 9.70935345
but the results are pretty much the same independently from whether INTENT is use or not.
Could anybody share some light on these two points, why does Fortran performs an additional copy (which seems ineffective at first, efficiency-wise) instead of passing by reference, and does really INTENT save the time of a copying operation?
Solution 1:[1]
The answer you are referring to speaks about passing some specific type of subsection, not of the whole array. In that case a temporary copy might be necessary, depending on the function. Your function uses and assumed shape array and a temporary array will not be necessary even if you try quite hard.
An example of what the author (it wasn't me) might have had in mind is
module functions
implicit none
contains
function fun(a, n) result(res)
real :: res
! note the explicit shape !!!
integer, intent(in) :: n
real, intent(in) :: a(n, n)
integer :: i, j
do j = 1, n
do i = 1, n
res = res + a(i,j) *i + j
end do
end do
end function
end module
program main
use functions
implicit none
real, allocatable :: array(:,:)
real :: x, t1, t2
integer :: fulln
fulln = 400
allocate(array(1:fulln,1:fulln))
call random_number(array)
call cpu_time(t1)
x = fun(array(::2,::2),(fulln/2))
call cpu_time(t2)
print *,x
print *, t2-t1
end program
This program is somewhat faster with intent(in) when compared to intent(inout) in Gfortran (not so much in Intel). However, it is even much faster with an assumed shape array a(:,:). Then no copy is performed.
I am also getting some strange uninitialized accesses in gfortran when running without runtime checks. I do not understand why.
Of course this is a contrived example and there are real cases in production programs where array copies are made and then intent(in) can make a difference.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
