Data alignment inside a structure in Intel Fortran

I'm trying to align in memory the following type of data:

type foo
   real, allocatable, dimension(:) :: bar1, bar2
   !dir$ attributes align:64 :: bar1
   !dir$ attributes align:64 :: bar2
end type foo

type(foo), allocatable, dimension(:) :: my_foo
allocate(my_foo(1))
allocate(my_foo(1)%bar1(100))
allocate(my_foo(1)%bar2(100))

! somewhere here I need to tell the compiler that data is aligned
!    for a simple array with name `bar` I would just do:
!dir$ assume_aligned bar1: 64
!dir$ assume_aligned bar2: 64
!    but what do I do for the data type I have, something like this?
!dir$ assume_aligned my_foo(1)%bar1: 64
!dir$ assume_aligned my_foo(1)%bar2: 64

do i = 1, 100
   my_foo(1)%bar1(i) = 10.
   my_foo(1)%bar2(i) = 10.
end do

As you can see, my_foo is an array of structures of type foo, each of which holds two large arrays, bar1 and bar2, that I need aligned to cache-line boundaries in memory.

I kind of know how to do that for simple arrays, but I have no idea how to do it for this sort of nested data structure. And what if my_foo had size 100, say, instead of 1? Would I have to loop over its elements?



Solution 1:[1]

OK, case semi-closed. The solution turned out to be pretty straightforward: associate contiguous pointers with the components and apply assume_aligned to the pointers. That should take care of it.

type foo
   real, allocatable, dimension(:) :: bar1, bar2
   !dir$ attributes align:64 :: bar1
   !dir$ attributes align:64 :: bar2
end type foo

type(foo), target, allocatable, dimension(:) :: my_foo
real, pointer, contiguous :: pt_bar1(:)
real, pointer, contiguous :: pt_bar2(:)
allocate(my_foo(1))
allocate(my_foo(1)%bar1(100))
allocate(my_foo(1)%bar2(100))

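! pointer association (=>), not assignment (=): the pointers must be
! associated with the components before they can be used
pt_bar1 => my_foo(1)%bar1
pt_bar2 => my_foo(1)%bar2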
!dir$ assume_aligned pt_bar1:64, pt_bar2:64

pt_bar1 = 10.
pt_bar2 = 10.
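
Since assume_aligned is a promise to the compiler rather than something it verifies, it can be worth checking the addresses at run time before relying on it. Below is a minimal sketch of such a check using the standard iso_c_binding module; the program name and the addr1/addr2 variables are made up for illustration. A compiler that ignores the !dir$ lines may well report F here.

```fortran
program check_alignment
   use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t
   implicit none

   type foo
      real, allocatable, dimension(:) :: bar1, bar2
      !dir$ attributes align:64 :: bar1
      !dir$ attributes align:64 :: bar2
   end type foo

   type(foo), target, allocatable, dimension(:) :: my_foo
   integer(c_intptr_t) :: addr1, addr2   ! hypothetical helper variables

   allocate(my_foo(1))
   allocate(my_foo(1)%bar1(100))
   allocate(my_foo(1)%bar2(100))

   ! reinterpret the C address of each array as an integer
   addr1 = transfer(c_loc(my_foo(1)%bar1), addr1)
   addr2 = transfer(c_loc(my_foo(1)%bar2), addr2)

   ! an address is 64-byte aligned when it is a multiple of 64
   print *, 'bar1 64-byte aligned: ', mod(addr1, 64_c_intptr_t) == 0
   print *, 'bar2 64-byte aligned: ', mod(addr2, 64_c_intptr_t) == 0
end program check_alignment
```

If either line prints F, the assume_aligned directive above would be making a false claim about that array.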

However, explicit do loops are still not vectorized. If I write the same thing as a loop,

do i = 1, 100
   pt_bar1(i) = 10.
   pt_bar2(i) = 10.
end do

it won't be vectorized.

UPD. OK, this does the job (you also need to pass the -qopenmp-simd flag to the compiler):

!$omp simd
!dir$ vector aligned
do i = 1, 100
   pt_bar1(i) = 10.
   pt_bar2(i) = 10.
end do

Also, if you're looping through my_foo(j)%..., make sure to disassociate the pointers after each iteration with pt_bar1 => null() etc. This only resets the pointers; it does not deallocate the arrays they pointed to.
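
For the case where my_foo has more than one element, the pieces above could be combined roughly as follows. This is only a sketch: the sizes n_foo and n and the program name are assumptions for illustration, and whether the inner loop actually gets vectorized still depends on the compiler and its flags.

```fortran
program loop_over_foo
   implicit none

   type foo
      real, allocatable, dimension(:) :: bar1, bar2
      !dir$ attributes align:64 :: bar1
      !dir$ attributes align:64 :: bar2
   end type foo

   integer, parameter :: n_foo = 100, n = 100   ! assumed sizes
   type(foo), target, allocatable, dimension(:) :: my_foo
   real, pointer, contiguous :: pt_bar1(:), pt_bar2(:)
   integer :: i, j

   allocate(my_foo(n_foo))
   do j = 1, n_foo
      allocate(my_foo(j)%bar1(n))
      allocate(my_foo(j)%bar2(n))
   end do

   do j = 1, n_foo
      ! re-associate the pointers with the current element
      pt_bar1 => my_foo(j)%bar1
      pt_bar2 => my_foo(j)%bar2
      !dir$ assume_aligned pt_bar1:64, pt_bar2:64

      !$omp simd
      !dir$ vector aligned
      do i = 1, n
         pt_bar1(i) = 10.
         pt_bar2(i) = 10.
      end do

      ! disassociate before moving on; the arrays stay allocated
      pt_bar1 => null()
      pt_bar2 => null()
   end do

   print *, my_foo(1)%bar1(1), my_foo(n_foo)%bar2(n)
end program loop_over_foo
```

With the Intel compiler this would be built with -qopenmp-simd as mentioned above; other compilers treat the !dir$ lines as plain comments, so the program still runs but without the alignment hints.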

PS. Thanks to BW from our department for this help. :) Sometimes personal communication > stackoverflow (not always, only sometimes).

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
