Altivec vec_all_gt equivalent on arm neon
I am porting an application from Altivec to Neon.
I see a lot of intrinsics in Altivec which return scalar values.
Do we have any such intrinsics on ARM?
For instance, vec_all_gt.
Solution 1:[1]
There are no intrinsics that give scalar comparison results. This is because the common pattern for SIMD comparisons is to use branchless lane-masking and conditional selects to multiplex results, not branch-based control flow.
You can build them if you need them though ...
// Do a comparison of e.g. two vectors of floats (use vcgtq_f32 for a strict greater-than)
uint32x4_t compare = vcgeq_f32(a, b);
// Shift all compares down to a single bit in the LSB of each lane, other bits zero
uint32x4_t tmp = vshrq_n_u32(compare, 31);
// Shift compare results up so lane 0 = bit 0, lane 1 = bit 1, etc.
static const int32_t shifta[4] = { 0, 1, 2, 3 };
const int32x4_t shift = vld1q_s32(shifta);
tmp = vshlq_u32(tmp, shift);
// Horizontal add across the vector to merge the result into a scalar (vaddvq_u32 is AArch64-only)
return vaddvq_u32(tmp);
... at which point you can define any() (mask is non-zero) and all() (mask is 0xF) comparisons if you need branchy logic.
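As a rough sketch of how the any() and all() wrappers could look when built on that mask: the helper names gt_mask_f32x4, all_gt_f32x4 and any_gt_f32x4 are made up for illustration and are not part of any ARM header, and the sketch uses a strict greater-than (vcgtq_f32) to mirror vec_all_gt. The scalar fallback is an assumption added only so the code also builds on hosts without AArch64 NEON.

```c
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_A64_NEON 1
#else
#define HAVE_A64_NEON 0
#endif

/* Hypothetical helper: bit i of the result is set when a[i] > b[i]. */
static inline uint32_t gt_mask_f32x4(const float a[4], const float b[4])
{
#if HAVE_A64_NEON
    static const int32_t shifts[4] = { 0, 1, 2, 3 };
    /* Each lane becomes all-ones (true) or zero (false). */
    uint32x4_t cmp = vcgtq_f32(vld1q_f32(a), vld1q_f32(b));
    /* Reduce each lane to a single 0/1 bit. */
    uint32x4_t bits = vshrq_n_u32(cmp, 31);
    /* Move lane i's bit up to bit position i. */
    bits = vshlq_u32(bits, vld1q_s32(shifts));
    /* Horizontal add merges the four lanes (AArch64-only intrinsic). */
    return vaddvq_u32(bits);
#else
    /* Scalar fallback so the sketch builds on non-AArch64 hosts. */
    uint32_t mask = 0;
    for (int i = 0; i < 4; ++i)
        mask |= (uint32_t)(a[i] > b[i]) << i;
    return mask;
#endif
}

/* Rough analogue of vec_all_gt: true when every lane compares greater. */
static inline int all_gt_f32x4(const float a[4], const float b[4])
{
    return gt_mask_f32x4(a, b) == 0xFu;
}

/* Rough analogue of vec_any_gt: true when at least one lane compares greater. */
static inline int any_gt_f32x4(const float a[4], const float b[4])
{
    return gt_mask_f32x4(a, b) != 0u;
}
```

With a = {2, 3, 4, 5} and b = {1, 2, 3, 4}, every lane of a is greater, so the mask is 0xF and all_gt_f32x4(a, b) is true. In real code you would probably pass float32x4_t values directly rather than pointers; the array parameters here only keep the fallback path type-compatible.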
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
