1. Projective Geometry Problems

(a) Prove that parallel lines in the world reference system are still parallel in the camera reference system.

pf) Define a line in the CRS(camera reference system) to be 3-dimensional plane that pass through $A’_1=[x’_1, y’_1, 1]$ and $A’_2=[x’_2, y’_2, 1]$. Then the parallel planes are defined to be 2 planes such that $\overrightarrow{A_1 A_2}$ and $\overrightarrow{B_1 B_2}$ are parallel. In WRS case, parallel can be defined in exactly the same way. That is,

\( (A^1_i-A^2_i)_{i}=k(B^1_i-B^2_i)_{i}, i=1,2,3 \)

for some real $k$.

Performing projective transformation yields the following.

\( K\begin{bmatrix}R&T\end{bmatrix}\begin{bmatrix}A^1\\ 1\end{bmatrix}=K[RA^1+T]\\ \overrightarrow{A’^1 A’^2}_{x,y} = K[RA^1-RA^2]_{x,y}=kKR[B^1-B^2]=\overrightarrow{B’^1 B’^2}_{x,y} \)

Notice that matrix linearity is preserved since translation term cancels out when taking subtraction.

Hence projected lines are parallel.

(b) Consider a unit square $pqrs$ in the world reference system where $p, q, r, s$ are points. Will the same square in the camera reference system always have unit area? Prove or provide a counterexample.

pf) Simply consider a case where camera is too much skewed so that $\theta=0$. In that case, the area is always 0.

(c) Now let’s consider affine transformations, which are any transformations that preserve parallelism. Affine transformations include not only rotations and translations, but also scaling and shearing. Given some vector p, an affine transformation is defined as $A(p) = Mp + b$ where $M$ is an invertible matrix. Prove that under any affine transformation, the ratio of parallel line segments is invariant, but the ratio of non-parallel line segments is not invariant.

pf) Without loss of generality, since we are considering only the ratio of line segments, we may assume that both lines start at origin $O$. Then $\overrightarrow{O P}$ and $\overrightarrow{O (rP)}$ are parallel and have ratio $r$.
We may also assume that $b=0$, since we are only considering ratio and translation term is canceled out when computing the length of the line segment.
Now the ratio after the affine map is easily computed since it is the ratio between $\overrightarrow{O (MP)}$ and $\overrightarrow{O (rMP)}$
For non-parallel case, consider 2-d affine map $A$ that is defined by $2\times 2$ matrix which acts on 2 unit vectors having orign as starting point and having $(1,0)$ and $(0,1)$ as endpoint whose ratio is 1. If

\( M=\begin{bmatrix}1&2\\ 3&5\end{bmatrix} \)

Then the output endpoint is $(1,3)$ and $(2,5)$ whose ratio is obviously not 1.

(d) You have explored whether these three properties hold for affine transformations. Do these properties hold under any projective transformation? Justify briefly in one or two sentences (no proof needed).

Affine transformation is the transformation on Euclidean cooridnate, and since this can be translated into projective transformation in projective coordinate, it seems that all properties hold for projective case.

2. Affine Camera Calibration

  • Affine camera는 요약하면 이미지 평면에 평행하게 투사되는 카메라 모델이며, 카메라 행렬 맨 아랫줄이 $(0,0,0,1)$인 행렬이다.
  • 좌표 계산에 나눗셈이 빠지게 되어 계산이 쉬워지며, 정확도를 약간 희생한다.
  • 수업에서 배운 weak perspective나 orthogonal projection 모델 등이 해당되며, intrinsic extrinsic도 다룰 수 있지만 어려우니 생략

1

  • scene coordinate system을 위와 같이 설정, 상자 크기는 50mm, 상자 간격은 30mm

(a) Given correspondences for the calibrating grid, solve for the camera parameters using Eq.2. Note that each measurement $(x_i, y_i) ↔ (X_i, Y_i, Z_i)$ yields two linear equations for the 8 unknown camera parameters. Given $N$ corner measurements, we have $2N$ equations and 8 unknowns. Using the given corner correspondences as inputs, complete the method $\textrm{compute_camera_matrix()}$. You will construct a linear system of equations and solve for the camera parameters to minimize the least-squares error. After doing so, you will return the 3 × 4 affine camera matrix composed of these computed camera parameters. In your written report, submit your code as well as the camera matrix that you compute.

We may have the following relationship. \( \begin{bmatrix}P^\top_i&0 \\ 0&P^\top_i\end{bmatrix}\begin{bmatrix}m_1^\top\\ m_2^\top\end{bmatrix} = -\begin{bmatrix}u_i \\ v_i\end{bmatrix} \)

So we construct 48 equations into matrix and conduct SVD.

For the rest, refer to colab notebook.

(b) After finding the calibrated camera matrix, you will compute the RMS error between the given N image corner coordinates and N corresponding calculated corner locations in $\textrm{rms_error()}$. Recall that $\text{RMS}_\text{total} =\sqrt{((x − x′)^2 + (y − y′)^2)/N}$ Please submit your code and the RMS error for the camera matrix that you found in part (a).

Refer to colab notebook.

(c) Could you calibrate the matrix with only one checkerboard image? Explain briefly in one or two sentences.

Yes, I can. The method above for calculating affine matrix does not depend on the number of images, but for accuracy, the more images are preferred.

3. Single View Geometry

2

(a) In Figure 2, we have identified a set of pixels to compute vanishing points in each image. Please complete $\textrm{compute_vanishing_point()}$, which takes in these two pairs of points on parallel lines to find the vanishing point. You can assume that the camera has zero skew and square pixels, with no distortion.

Refer to colab notebook.

(b) Using three vanishing points, we can compute the intrinsic camera matrix used to take the image. Do so in $\textrm{compute_K_from_vanishing_points()}$.

$v_1^\top\omega v_2=0$ can be translated into

\( \begin{bmatrix}x_1x_2+y_1y_2&x1+x_2&y_1+y_2&1\end{bmatrix}\begin{bmatrix}w1\\w4\\w5\\w6\end{bmatrix}=0 \)

If we assume $w_6=1$ here to obtain 3 equations and combining 3 equation into 1, we can make use of inverse matrix.

For the rest, refer to colab notebook.

(c) Is it possible to compute the camera intrinsic matrix for any set of vanishing points? Similarly, is three vanishing points the minimum required to compute the intrinsic camera matrix? Justify your answer.

No. If lines corresponding to vanishing points are not parallel, then at least we should know about the angles between them in order to compute the camera intrinsic matrix. 3 points are minimum because 3 points yields 3 equations, and $\omega$ matrix has DoF 4 under the 0 skewness and square pixel assumption and unique up to scale.

(d) The method used to obtain vanishing points is approximate and prone to noise. Discuss approaches to refine this process.

  1. Collect $N>3$ vanishing points that satisfies perpendicular condition between corresponding parallel lines.
  2. Construct the matrix $\omega$ that satisfies $v_i\omega v_j, i\neq j$. Then we have $\frac{N(N-1)}{2}$ equations. use this relationship to compute $\omega$ via using SVD.
  3. Compute $K$ via Cholesky decomposition.

(e) This process gives the camera internal matrix under the specified constraints. For the remainder of the computations, use the following internal camera matrix:

\(K =\begin{bmatrix}2448&0&1253\\0&2438&986\\0&0&1\end{bmatrix}\)

Identify a sufficient set of vanishing lines on the ground plane and the plane on which the letter A exists, written on the side of the cardboard box, (plane-A). Use these vanishing lines to verify numerically that the ground plane is orthogonal to the plane-A. Fill out the method $\textrm{compute_angle_between_planes()}$ and submit your code and the computed angle. Refer to colab notebook.

(f) Assume the camera rotates but no translation takes place. Assume the internal camera parameters remain unchanged. An Image 2 of the same scene is taken. Use vanishing points to estimate the rotation matrix between when the camera took Image 1 and Image 2. Fill out the method $\textrm{compute_rotation_matrix_between_cameras()}$ and submit your code and your results.

From the $v=Kd$ induction, we can notice that for rotated camera, $v=KRd$. This means that for image 2, every equation involving camera 2 should have camera matrix $K’=KR$. So in the code, I compute $K’K^{-1}$

For the rest, refer to colab notebook.