'Calculate a 2D homogeneous perspective transformation matrix from 4 points in MATLAB

I've got coordinates of 4 points in 2D that form a rectangle and their coordinates after a perspective transformation has been applied.

enter image description here

The perspective transformation is calculated in homogeneous coordinates and defined by a 3x3 matrix M. If the matrix is not known, how can I calculate it from the given points?

The calculation for one point would be:

| M11 M12 M13 |   | P1.x |   | w*P1'.x |
| M21 M22 M23 | * | P1.y | = | w*P1'.y |
| M31 M32 M33 |   | 1    |   | w*1     |

To calculate all points simultaneously I write them together in one matrix A and analogously for the transformed points in a matrix B:

    | P1.x P2.x P3.x P4.x |
A = | P1.y P2.y P3.y P4.y |
    | 1    1    1    1    |

So the equation is M*A=B and this can be solved for M in MATLAB by M = B/A or M = (A'\B')'.

But it's not that easy. I know the coordinates of the points after transformation, but I don't know the exact B, because there is the factor w and it's not necessary 1 after a homogeneous transformation. Cause in homogeneous coordinates every multiple of a vector is the same point and I don't know which multiple I'll get.

To take account of these unknown factors I write the equation as M*A=B*W where W is a diagonal matrix with the factors w1...w4 for every point in B on the diagonal. So A and B are now completely known and I have to solve this equation for M and W.

If I could rearrange the equation into the form x*A=B or A*x=B where x would be something like M*W I could solve it and knowing the solution for M*W would maybe be enough already. However despite trying every possible rearrangement I didn't managed to do so. Until it hit me that encapsulating (M*W) would not be possible, since one is a 3x3 matrix and the other a 4x4 matrix. And here I'm stuck.

Also M*A=B*W does not have a single solution for M, because every multiple of M is the same transformation. Writing this as a system of linear equations one could simply fix one of the entries of M to get a single solution. Furthermore there might be inputs that have no solution for M at all, but let's not worry about this for now.

What I'm actually trying to achieve is some kind of vector graphics editing program where the user can drag the corners of a shape's bounding box to transform it, while internally the transformation matrix is calculated.

And actually I need this in JavaScript, but if I can't even solve this in MATLAB I'm completely stuck.



Solution 1:[1]

Should have been an easy question. So how do I get M*A=B*W into a solvable form? It's just matrix multiplications, so we can write this as a system of linear equations. You know like: M11*A11 + M12*A21 + M13*A31 = B11*W11 + B12*W21 + B13*W31 + B14*W41. And every system of linear equations can be written in the form Ax=b, or to avoid confusion with already used variables in my question: N*x=y. That's all.

An example according to my question: I generate some input data with a known M and W:

M = [
    1 2 3;
    4 5 6;
    7 8 1
];
A = [
    0 0 1 1;
    0 1 0 1;
    1 1 1 1
];
W = [
    4 0 0 0;
    0 3 0 0;
    0 0 2 0;
    0 0 0 1
];
B = M*A*(W^-1);

Then I forget about M and W. Meaning I now have 13 variables I'm looking to solve. I rewrite M*A=B*W into a system of linear equations, and from there into the form N*x=y. In N every column has the factors for one variable:

N = [
    A(1,1) A(2,1) A(3,1)      0      0      0      0      0      0 -B(1,1)       0       0       0;
         0      0      0 A(1,1) A(2,1) A(3,1)      0      0      0 -B(2,1)       0       0       0;
         0      0      0      0      0      0 A(1,1) A(2,1) A(3,1) -B(3,1)       0       0       0;
    A(1,2) A(2,2) A(3,2)      0      0      0      0      0      0       0 -B(1,2)       0       0;
         0      0      0 A(1,2) A(2,2) A(3,2)      0      0      0       0 -B(2,2)       0       0;
         0      0      0      0      0      0 A(1,2) A(2,2) A(3,2)       0 -B(3,2)       0       0;
    A(1,3) A(2,3) A(3,3)      0      0      0      0      0      0       0       0 -B(1,3)       0;
         0      0      0 A(1,3) A(2,3) A(3,3)      0      0      0       0       0 -B(2,3)       0;
         0      0      0      0      0      0 A(1,3) A(2,3) A(3,3)       0       0 -B(3,3)       0;
    A(1,4) A(2,4) A(3,4)      0      0      0      0      0      0       0       0       0 -B(1,4);
         0      0      0 A(1,4) A(2,4) A(3,4)      0      0      0       0       0       0 -B(2,4);
         0      0      0      0      0      0 A(1,4) A(2,4) A(3,4)       0       0       0 -B(3,4);
         0      0      0      0      0      0      0      0      1       0       0       0       0
];

And y is:

y = [ 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 1 ];

Notice the equation described by the last row in N whose solution is 1 according to y. That's what I mentioned in my question, you have to fix one of the entries of M to get a single solution. (We can do this because every multiple of M is the same transformation.) And with this equation I'm saying M33 should be 1.

We solve this for x:

x = N\y

and get:

x = [ 1.00000; 2.00000; 3.00000; 4.00000; 5.00000; 6.00000; 7.00000; 8.00000; 1.00000; 4.00000; 3.00000; 2.00000; 1.00000 ]

which are the solutions for [ M11, M12, M13, M21, M22, M23, M31, M32, M33, w1, w2, w3, w4 ]

W is not needed after M has been calculated. For a generic point (x, y), the corresponding w is calculated while solving x' and y'.

| M11 M12 M13 |   | x |   | w * x' |
| M21 M22 M23 | * | y | = | w * y' |
| M31 M32 M33 |   | 1 |   | w * 1  |

When solving this in JavaScript I could use the Numeric JavaScript library which has the needed function solve to solve Ax=b.

Solution 2:[2]

OpenCV has a neat function that does this called getPerspectiveTransform. The source code for this function is available on github with this description:

/* Calculates coefficients of perspective transformation
 * which maps (xi,yi) to (ui,vi), (i=1,2,3,4):
 *
 *      c00*xi + c01*yi + c02
 * ui = ---------------------
 *      c20*xi + c21*yi + c22
 *
 *      c10*xi + c11*yi + c12
 * vi = ---------------------
 *      c20*xi + c21*yi + c22
 *
 * Coefficients are calculated by solving linear system:
 * / x0 y0  1  0  0  0 -x0*u0 -y0*u0 \ /c00\ /u0\
 * | x1 y1  1  0  0  0 -x1*u1 -y1*u1 | |c01| |u1|
 * | x2 y2  1  0  0  0 -x2*u2 -y2*u2 | |c02| |u2|
 * | x3 y3  1  0  0  0 -x3*u3 -y3*u3 |.|c10|=|u3|,
 * |  0  0  0 x0 y0  1 -x0*v0 -y0*v0 | |c11| |v0|
 * |  0  0  0 x1 y1  1 -x1*v1 -y1*v1 | |c12| |v1|
 * |  0  0  0 x2 y2  1 -x2*v2 -y2*v2 | |c20| |v2|
 * \  0  0  0 x3 y3  1 -x3*v3 -y3*v3 / \c21/ \v3/
 *
 * where:
 *   cij - matrix coefficients, c22 = 1
 */

This system of equations is smaller as it avoids solving for W and M33 (called c22 by OpenCV). So how does it work? The linear system can be recreated by the following steps:

Start with the equation for one point:

| c00 c01 c02 |   | xi |   | w*ui |
| c10 c11 c12 | * | yi | = | w*vi |
| c20 c21 c22 |   | 1  |   | w*1  |

Convert this to a system of equations, solve ui and vi, and eliminate w. You get the formulas for projection transformation:

     c00*xi + c01*yi + c02
ui = ---------------------
     c20*xi + c21*yi + c22

     c10*xi + c11*yi + c12
vi = ---------------------
     c20*xi + c21*yi + c22

Multiply both sides with the denominator:

(c20*xi + c21*yi + c22) * ui = c00*xi + c01*yi + c02
(c20*xi + c21*yi + c22) * vi = c10*xi + c11*yi + c12

Distribute ui and vi:

c20*xi*ui + c21*yi*ui + c22*ui = c00*xi + c01*yi + c02
c20*xi*vi + c21*yi*vi + c22*vi = c10*xi + c11*yi + c12

Assume c22 = 1:

c20*xi*ui + c21*yi*ui + ui = c00*xi + c01*yi + c02
c20*xi*vi + c21*yi*vi + vi = c10*xi + c11*yi + c12

Collect all cij on the left hand side:

c00*xi + c01*yi + c02 - c20*xi*ui - c21*yi*ui = ui
c10*xi + c11*yi + c12 - c20*xi*vi - c21*yi*vi = vi

And finally convert to matrix form for four pairs of points:

/ x0 y0  1  0  0  0 -x0*u0 -y0*u0 \ /c00\ /u0\
| x1 y1  1  0  0  0 -x1*u1 -y1*u1 | |c01| |u1|
| x2 y2  1  0  0  0 -x2*u2 -y2*u2 | |c02| |u2|
| x3 y3  1  0  0  0 -x3*u3 -y3*u3 |.|c10|=|u3|
|  0  0  0 x0 y0  1 -x0*v0 -y0*v0 | |c11| |v0|
|  0  0  0 x1 y1  1 -x1*v1 -y1*v1 | |c12| |v1|
|  0  0  0 x2 y2  1 -x2*v2 -y2*v2 | |c20| |v2|
\  0  0  0 x3 y3  1 -x3*v3 -y3*v3 / \c21/ \v3/

This is now in the form of Ax=b and the solution can be obtained with x = A\b. Remember that c22 = 1.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 VLL
Solution 2 VLL