'Fastest way of resizing (rescaling) a 1D vector by an arbitrary factor
I have the following code that does the resizing of a 1D vector with nearest neighbor interpolation in a similar fashion you'd also resize an image. Another term would be resampling, but there seems to be a lot of confusion around these terms (resampling is also a technique in statistics), so I prefer to be more descriptive.
Currently the code looks like this and I need to optimize it:
inline void resizeNearestNeighbor(const int16_t* current, uint32_t currentSize, int16_t* out, uint32_t newSize, uint32_t offset = 0u)
{
if(currentSize == newSize)
{
return;
}
const float scaleFactor = static_cast<float>(currentSize) / static_cast<float>(newSize);
for(uint32_t outIdx = 0; outIdx<newSize; ++outIdx)
{
const int currentIdx = static_cast<uint32_t>(outIdx * scaleFactor);
out[outIdx] = current[(currentIdx + offset)%currentSize];
}
}
This of course is not hugely efficient because the operation to take the integer part of a float by downcasting is expensive and I don't think it can take any benefit of vectorization in this case. The platform is Cortex M7, so if you're familiar with any vectorization techniques on this platform, it would be also very helpful.
The use case of this code is a sound effect that allows for smoothly changing the length of a delay line (hence the additional offset parameter, since it's a ring buffer). Being able to smoothly change the length of a delay line sounds like slowing down or speeding up playback in a tape recorder, only it's in a loop. Without this scaling, there are lots of clicking noises and artifacts. Currently the hardware struggles with all the DSP and this code on top of that and it can't rescale long delay lines in real time.
Solution 1:[1]
If you look at currentIdx, you'll note that it is incremented by scaleFactor every time outIdx is incremented by one. Hence, you can replace outIdx * scaleFactor with currentIdx += scaleFactor.
You'd initialize currentIdx to offset, so that's hoisted from the loop as well.
%currentSize is an expensive operation as well, and one that appears to exist only for the non-zero offset case. You might want to treat that differently, and split the loop in two loops (before/after wrap-around point).
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
