Interpolate position given three images

Let's say I have three consecutive frames from a dashcam video of a car driving in a straight line. Suppose the positions of the first and last frames are known (e.g. because a GPS signal was available there).

Now I would like to estimate the position of the middle image, given the positions of the images taken before and afterwards. How do I do this?

In the literature, this problem is known as "visual localization". However, I could not find any work on exactly this problem or a similar one. Most approaches are far more advanced and use point clouds, which I would like to avoid. Can someone point me to literature on this topic?

My current approach is to detect SIFT features, calculate the fundamental matrix, and use singular value decomposition to get the translation vector. However, the results are very unstable. I thought this would be a comparatively easy problem, given that the car's movement does not involve any rotation.
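For illustration, here is a minimal sketch of one way the no-rotation assumption could be used directly, without a point cloud. It is not the approach described above but a swapped-in idea: under pure translation in a constant direction, every static scene point moves radially away from the focus of expansion (the epipole), and a point at depth Z satisfies 1 - r1/r2 = t12/Z, where r is its image distance from the focus of expansion. Dividing the same quantity for frames 1→2 by that for frames 1→3 cancels the unknown depth and leaves the fraction of the path travelled. The function names, the `foe` input, and the assumption that features have already been matched across all three frames (e.g. with SIFT) are hypothetical; estimating the focus of expansion itself is not shown.

```python
import math
from statistics import median


def radial_distance(point, foe):
    """Image-plane distance of a feature from the focus of expansion."""
    return math.hypot(point[0] - foe[0], point[1] - foe[1])


def interpolation_fraction(tracks, foe, min_growth=1e-6):
    """Estimate s = t12 / t13, the fraction of the path covered at frame 2.

    tracks: iterable of (p1, p2, p3), the pixel coordinates of the same
            static scene point in frames 1, 2 and 3.
    foe:    (x, y) focus of expansion in pixels, assumed known and identical
            for both frame pairs (pure translation, constant direction).
    """
    fractions = []
    for p1, p2, p3 in tracks:
        r1, r2, r3 = (radial_distance(p, foe) for p in (p1, p2, p3))
        if min(r1, r2, r3) <= 0.0:
            continue  # a feature sitting on the FOE carries no information
        total = 1.0 - r1 / r3  # equals t13 / Z for this point
        if abs(total) < min_growth:
            continue  # almost no parallax, so the ratio would be unstable
        fractions.append((1.0 - r1 / r2) / total)  # (t12 / Z) / (t13 / Z)
    if not fractions:
        raise ValueError("no usable feature tracks")
    # The median limits the influence of mismatched features.
    return median(fractions)


def interpolate_position(first, last, fraction):
    """Linear interpolation between the two known positions."""
    return tuple(a + fraction * (b - a) for a, b in zip(first, last))
```

With `s = interpolation_fraction(tracks, foe)`, the middle position would then be `interpolate_position(gps_first, gps_last, s)`. Taking the median over many tracks is a design choice to reduce sensitivity to outliers; moving objects and distant points with little parallax would still need to be filtered out.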

opencv computer-vision

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
