How to use a 3d array (image-like) as an index-map to slice a 4d array (video) (a generalization of the rolling shutter effect)
I've been fooling around lately with taking the webcam's video stream and giving it a pixel-dependent time delay.
A very simple example of that idea is the famous rolling shutter, but when applied on the order of seconds instead of 1/100ths, it looks like this: https://youtu.be/mQ0hS7l9ckY
Now, rolling shutter is fun and all, but I want something more general. I want a delay map, a (height, width, 3) shaped array that tells me how far back to go in the video. Pseudo-code for this would be
output_image[y, x, c] = video_cache[delay_map[y,x,c], y, x, c]
where the first index of the video cache is time, y and x are the pixel row and column, and c is the color channel (BGR, because OpenCV is weird).
In essence, each pixel of the output is a pixel of the video at the same position, but at a time determined by the delay map at the very same position.
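For reference, the pseudo-code above can be written in NumPy without any flattening. This is only a minimal sketch, assuming the cache is a plain (time, height, width, 3) array indexed by frame number and that delay_map already holds integers; the function name delayed_frame and the toy data are made up for illustration.

```python
import numpy as np

def delayed_frame(video_cache: np.ndarray, delay_map: np.ndarray) -> np.ndarray:
    """Pick, for every pixel, the frame named by delay_map (integer time indices)."""
    # delay_map[None] has shape (1, H, W, C); take_along_axis matches it against
    # the (T, H, W, C) cache along axis 0 and returns an array of shape (1, H, W, C).
    return np.take_along_axis(video_cache, delay_map[None], axis=0)[0]

# Toy data (hypothetical): 4 frames of a 2x3 image with 3 channels,
# where frame t is filled with the value t.
video_cache = np.stack([np.full((2, 3, 3), t) for t in range(4)])
delay_map = np.array([[0, 1, 2], [3, 2, 1]])[:, :, None].repeat(3, axis=2)
output_image = delayed_frame(video_cache, delay_map)
```

Because frame t contains only the value t, output_image comes out equal to delay_map here, which makes the lookup easy to check by eye. The same result can be obtained with broadcasting index arrays, e.g. `video_cache[delay_map, y, x, c]` with `y, x, c = np.ogrid[0:2, 0:3, 0:3]`.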
Here's the solution I have now: I flattened everything, I access the video cache similar to how you unravel multi-index nonsense, and once I'm done I reshape the result into an image.
This solution works pretty fast, and I'm pretty proud of it. It almost keeps up with my webcam's frame rate (I think I average about 20 of these per second).
I think the flattening and reshaping of each frame costs me some time, and if I could get rid of those I'd get much better results.
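To make the flattened lookup concrete, here is a small sketch showing that "frame offset plus running index" is the same thing as raveling the 4D multi-index. The sizes and the random data are arbitrary assumptions chosen only for the check, and the delay map is treated as a plain frame number.

```python
import numpy as np

cache_size, height, width, channels = 5, 4, 6, 3
image_shape = (height, width, channels)
image_size = height * width * channels

rng = np.random.default_rng(0)
delay_map = rng.integers(0, cache_size, size=image_shape)  # integer frame numbers

# Flattened form: start of the chosen frame plus a running index over one frame.
flat_index = image_size * delay_map.ravel() + np.arange(image_size)

# Full-shaped form: the same lookup written as a 4D multi-index.
y, x, c = np.indices(image_shape)
multi_index = np.ravel_multi_index((delay_map, y, x, c), (cache_size,) + image_shape)

cache = rng.random((cache_size,) + image_shape)
from_flat = cache.ravel()[flat_index].reshape(image_shape)
from_multi = cache[delay_map, y, x, c]
```

Both index arrays address the same elements, so from_flat and from_multi hold identical pixels; the only difference is whether the cache is stored flat or in its natural 4D shape.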
Link to the whole file at the bottom.
Here's a skeleton of my implementation.
I have a class called CircularCacheDelayAccess. It stores a cache of video frames (with given number of frames, called cache_size in my implementation). It enables you to store frames, and get the delay-mapped frame.
Instead of pushing all the frames around each time I store a new one, I keep an index that goes around in a circle, and video[delay=3] would be found via something like cache[index-3]. Thanks to python's funny negative index tricks, I don't even have to get the positive modulo.
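The circular index can be illustrated on a toy cache that stores one number per "frame" instead of a whole image. This is a simplified sketch of the idea, not the class itself; the loop and variable names are only for illustration.

```python
import numpy as np

cache_size = 4
cache = np.zeros(cache_size, dtype=int)  # one number per "frame" for brevity
cache_index = 0

# Store frames 0..9: advance the write position by one slot each time and
# overwrite whatever was there, so only the last 4 frames survive.
for frame_number in range(10):
    cache_index = (cache_index + 1) % cache_size
    cache[cache_index] = frame_number

# delay 0 is the newest frame; cache_index - delay may be negative and wraps around.
recent = [int(cache[cache_index - delay]) for delay in range(cache_size)]
# recent == [9, 8, 7, 6]
```

The negative-index trick only gives the right frame while the delay stays below cache_size; a larger delay silently wraps around to a newer frame instead of raising an error, which is why the range check on delay_map matters.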
The delay_map is actually a float array; when I use circ_cache.getFrame I input the integer part of delay_map.flatten(), and then I use the fractional part to interpolate between frames.
import numpy as np


class CircularCacheDelayAccess:
    def __init__(self, img_shape: tuple, cache_size: int):
        self.image_shape = img_shape
        self.cache_size = cache_size
        # some useful stuff
        self.multi_index_shape = (cache_size,) + img_shape
        self.image_size = int(np.prod(img_shape))
        self.size = cache_size * self.image_size
        # the index, going around in circles
        self.cache_index = 0
        self.cache = np.empty(self.size)
        # raveled_image_indices is a running index over a frame; it is the same thing as writing
        #   y, x, c = np.mgrid[0:height, 0:width, 0:3]
        #   raveled_image_indices = c + 3 * (x + width * y)
        # but it's a lot easier
        self.raveled_image_indices = np.arange(self.image_size)

    def store(self, image: np.ndarray):
        # (in my implementation I check that the shape matches and raise a ValueError if it does not)
        self.cache_index = (self.cache_index + 1) % self.cache_size
        # since cache holds entire image frames, the start of each frame is index * image size
        cIndex = self.image_size * self.cache_index
        self.cache[cIndex: cIndex + self.image_size] = image.flatten()

    def getFrame(self, delay_map: np.ndarray):
        # delay_map may either have shape == self.image_shape, or shape == (self.image_size,)
        # (more asserts, for the shape of delay_map, and to check its values do not exceed the cache size)
        # (if delay_map.shape == image_shape, I flatten it. If we were already given a flattened version,
        # there's no need to do so)
        # NOTE: as written here, delay_map must already be a flat array of integers
        frame = (
            self.cache[self.image_size * (self.cache_index - delay_map) + self.raveled_image_indices]
            .reshape(self.image_shape)
        )
        return frame
As I've already stated, this works pretty well, but I think I could get it to work better if I could just side-step the flatten and reshape steps.
Also, keeping a flattened version of an array that makes sense in its full-shaped form is pretty awkward.
I've also mentioned the interpolation part. It felt wrong to do that inside CircularCacheDelayAccess, but doing the interpolation after calling getFrame twice means I need the fractional part of delay_map in its full-shaped form and the integer part flattened, which is pretty silly.
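For illustration, here is roughly what the interpolation looks like when everything stays in its full shape. It is a sketch under assumptions, not my actual code: the cache is taken to be a (time, height, width, 3) array whose first axis is already ordered by delay, and the name interpolated_frame and the toy data are hypothetical.

```python
import numpy as np

def interpolated_frame(video_cache: np.ndarray, delay_map: np.ndarray) -> np.ndarray:
    """Linearly blend the two cached frames that bracket each fractional delay."""
    whole = np.floor(delay_map).astype(np.intp)  # integer part, full image shape
    frac = delay_map - whole                     # fractional part, full image shape
    # Clamp the second lookup so a delay on the last frame does not run off the cache.
    upper = np.minimum(whole + 1, video_cache.shape[0] - 1)
    near = np.take_along_axis(video_cache, whole[None], axis=0)[0]
    far = np.take_along_axis(video_cache, upper[None], axis=0)[0]
    return (1.0 - frac) * near + frac * far

# Toy data: frame t is filled with the value t, so the blend should reproduce delay_map.
video_cache = np.stack([np.full((2, 2, 3), float(t)) for t in range(5)])
delay_map = np.array([[0.0, 0.25], [1.5, 3.75]])[:, :, None].repeat(3, axis=2)
blended = interpolated_frame(video_cache, delay_map)
```

With this layout the integer and fractional parts never need different shapes, at the cost of two gathers per output frame.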
Here are some fun examples which would probably be pretty hard to understand without seeing the video, but are still fun to look at. It looks even better with a face, but I don't think I should show my face here, so sorry about that: horizontal rolling shutter, color delay psychedelia, my weirdest effect so far
And here is a link to the entire code, with capture and stuff if you wanna mess around with it and read the entire code.
Thanks in advance!
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
