'Optimization defeats robustness measures
I am currently investigating robust methods for the summation of arrays, and implemented the algorithm published by Shewchuk in "Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates". While the implemented algorithm works as expected in gfortran, ifort optimizes the countermeasures away.
To give some context, here is my code:
module test_mod
contains
function shewchukSum( array ) result(res)
implicit none
real,intent(in) :: array(:)
real :: res
integer :: xIdx, yIdx, i, nPartials
real :: partials(100), hi, lo, x, y
nPartials = 0
do xIdx=1,size(array)
i = 0
x = array(xIdx)
! Calculate the partial sums
do yIdx=1,nPartials
y = partials(yIdx)
hi = x + y
if ( abs(x) < abs(y) ) then
lo = x - (hi - y)
else
lo = y - (hi - x)
endif
x = hi
! If a round-off error occured, store it. Exact comparison intended
if ( lo == 0. ) cycle
i = i + 1 ; partials(i) = lo
enddo ! yIdx
nPartials = i + 1 ; partials( nPartials ) = x
enddo ! xIdx
res = sum( partials(:nPartials) )
end function
end module
And the calling test program is
program test
use test_mod
implicit none
print *, sum([1.e0, 1.e16, 1.e0, -1.e16])
print *,shewchukSum([1.e0, 1.e16, 1.e0, -1.e16])
end program
Compilation using gfortran with produces the correct results for all optimization levels:
./a.out
0.00000000
2.00000000
ifort, however, produces zeros for all optimizations above -O0:
./a.out
0.00000000
0.00000000
I tried to debug the code and went down to the assembly level and figured out that ifort is optimizing away the calculation of lo and the operations after if ( lo == 0. ) cycle .
Is there a possibility to force ifort to perform the complete operation for all levels of optimization? This addition is a critical part of the calculations, and I want it to run as fast as possible.
For comparison, gfortran at -O2 executes this code approximately eight to ten times faster than ifort at -O0 (measured for arrays of length >100k).
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
