[SLAM][En] Errors and Jacobian Derivations for SLAM Part 1

1. Introduction

This post explains the definition of various errors used in SLAM and the Jacobian derivations for nonlinear optimization.

The errors covered in this post are:

Reprojection error $\begin{matrix} (1) & \begin{aligned} e & = p - \hat{p} \in R^{2} \end{aligned} \end{matrix}$
Photometric error $\begin{matrix} (2) & \begin{aligned} e & = I_{1} (p_{1}) - I_{2} (p_{2}) \in R^{1} \end{aligned} \end{matrix}$
Relative pose error (PGO) $\begin{matrix} (3) & \begin{array}{r} e_{i j} = Log (z_{i j}^{- 1} {\hat{z}}_{i j}) \in R^{6} \end{array} \end{matrix}$
Line reprojection error $\begin{matrix} (4) & \begin{array}{r} e_{l} = [\begin{matrix} \frac{x_{s}^{⊺} l_{c}}{\sqrt{l_{1}^{2} + l_{2}^{2}}} & \frac{x_{e}^{⊺} l_{c}}{\sqrt{l_{1}^{2} + l_{2}^{2}}} \end{matrix}] \in R^{2} \end{array} \end{matrix}$
IMU measurement error : TBA $\begin{matrix} (5) & \begin{array}{r} e_{B} = [\begin{array}{c} δ α_{b_{k + 1}}^{b_{k}} \\ δ θ_{b_{k + 1}}^{b_{k}} \\ δ β_{b_{k + 1}}^{b_{k}} \\ δ b_{a} \\ δ b_{g} \end{array}] = [\begin{array}{c} R_{w}^{b_{k}} (p_{b_{k + 1}}^{w} - p_{b_{k}}^{w} - v_{b_{k}}^{w} Δ t_{k} + \frac{1}{2} g^{w} Δ t_{k}^{2}) - {\hat{α}}_{b_{k + 1}}^{b_{k}} \\ 2 [({\hat{γ}}_{b_{k + 1}}^{b_{k}})^{- 1} \otimes (q_{b_{k}}^{w})^{- 1} \otimes q_{b_{k + 1}}^{w}]_{x y z} \\ R_{w}^{b_{k}} (v_{b_{k + 1}}^{w} - v_{b_{k}}^{w} + g^{w} Δ t_{k}) - {\hat{β}}_{b_{k + 1}}^{b_{k}} \\ b_{a_{k + 1}} - b_{a_{k}} \\ b_{g_{k + 1}} - b_{g_{k}} \end{array}] \end{array} \end{matrix}$

Different Jacobians are derived depending on whether the camera pose is expressed as a rotation matrix $R \in S O (3)$ or a transformation matrix $T \in S E (3)$ . To cover both cases, we derive the SO(3)-based Jacobian for the reprojection error and the SE(3)-based Jacobian for the photometric error. Points in 3D space also have different Jacobians depending on whether they are expressed as $X = [X, Y, Z, W]^{⊺}$ or as an inverse depth $ρ$ . The Jacobian derivation for both cases is also described.

The Jacobians covered in this post are:

Camera pose (SO(3)-based) $\begin{matrix} (6) & \begin{array}{r} \frac{\partial e}{\partial R} \to \frac{\partial e}{\partial Δ w} \end{array} \end{matrix}$
Camera pose (SE(3)-based) $\begin{matrix} (7) & \begin{array}{r} \frac{\partial e}{\partial T} \to \frac{\partial e}{\partial Δ ξ} \end{array} \end{matrix}$
Map point $\begin{matrix} (8) & \begin{array}{r} \frac{\partial e}{\partial X} \end{array} \end{matrix}$
Relative pose (SE(3)-based) $\begin{matrix} (9) & \begin{aligned} \frac{\partial e_{i j}}{\partial Δ ξ_{i}}, \frac{\partial e_{i j}}{\partial Δ ξ_{j}} \end{aligned} \end{matrix}$
3D Plücker line $\begin{matrix} (10) & \begin{aligned} \frac{\partial e_{l}}{\partial l}, \frac{\partial l}{\partial L_{c}}, \frac{\partial L_{c}}{\partial L_{w}}, \frac{\partial L_{w}}{\partial δ_{θ}} \end{aligned} \end{matrix}$
Quaternion representation $\begin{matrix} (11) & \begin{aligned} \frac{\partial X^{'}}{\partial q} \end{aligned} \end{matrix}$
Camera intrinsics $\begin{matrix} (12) & \begin{aligned} \frac{\partial e}{\partial f_{x}}, \frac{\partial e}{\partial f_{y}}, \frac{\partial e}{\partial c_{x}}, \frac{\partial e}{\partial c_{y}} \end{aligned} \end{matrix}$
Inverse depth $\begin{matrix} (13) & \begin{array}{r} \frac{\partial e}{\partial ρ} \end{array} \end{matrix}$
IMU error-state system kinematics : TBA $\begin{matrix} (14) & \begin{array}{r} J_{b_{k + 1}}^{b_{k}}, P_{b_{k + 1}}^{b_{k}}, F, G \end{array} \end{matrix}$
IMU measurement : TBA $\begin{matrix} (15) & \begin{array}{r} \frac{\partial e_{B}}{\partial [p_{b_{k}}^{w}, q_{b_{k}}^{w}]}, \frac{\partial e_{B}}{\partial [v_{b_{k}}^{w}, b_{a_{k}}, b_{g_{k}}]}, \frac{\partial e_{B}}{\partial [p_{b_{k + 1}}^{w}, q_{b_{k + 1}}^{w}]}, \frac{\partial e_{B}}{\partial [v_{b_{k + 1}}^{w}, b_{a_{k + 1}}, b_{g_{k + 1}}]} \end{array} \end{matrix}$

2. Optimization formulation

2.1. Error derivation

In SLAM, error is defined as the difference between an observed value (measurement) $z$ based on sensor data and an estimated value $\hat{z}$ based on mathematical modeling.

$\begin{matrix} (16) & \begin{array}{r} e (x) = z - \hat{z} (x) \end{array} \end{matrix}$

- $x$ : state variable for the model

SLAM sets this difference between the observed and predicted values as the error and formulates an optimization problem that finds the state variable $x$ minimizing it. Because SLAM state variables generally include non-linear terms related to rotation, non-linear least squares is mainly used.

2.2. Error function derivation

In general, a large amount of sensor data yields dozens to hundreds of errors, each in vector form. The errors are assumed to follow a normal distribution and are then converted into an error function.

$\begin{matrix} (17) & \begin{array}{r} e (x) = z - \hat{z} \sim N (0, Σ) \end{array} \end{matrix}$

The multivariate normal distribution for modeling the error function is

$\begin{matrix} (18) & \begin{array}{r} p (x) = \frac{1}{\sqrt{(2 π)^{n} | Σ |}} \exp (- \frac{1}{2} (x - μ)^{⊺} Ω (x - μ)) \sim N (μ, Σ) \end{array} \end{matrix}$

- $Ω = Σ^{- 1}$ : information matrix (inverse of covariance matrix)

We can model the measurement $z$ as a multivariate normal distribution with mean $\hat{z}$ and covariance $Σ$ (equivalently, the error $e = z - \hat{z}$ has zero mean and covariance $Σ$ ). Taking the logarithm of this likelihood gives $\ln p (z)$ as follows.

$\begin{matrix} (19) & \begin{aligned} \ln p (z) & \propto - \frac{1}{2} (z - \hat{z})^{⊺} Ω (z - \hat{z}) \\ & = - \frac{1}{2} e^{⊺} Ω e \end{aligned} \end{matrix}$

Finding the $x^{*}$ that maximizes the log-likelihood $\ln p (z)$ maximizes the probability of the multivariate normal distribution. This is called Maximum Likelihood Estimation (MLE). Since $\ln p (z)$ has a negative sign in front of it, maximizing it is equivalent to minimizing the negative log-likelihood, which gives:

$\begin{matrix} (20) & \begin{array}{r} x^{*} = \arg \max_{x} p (z) = \arg \min_{x} e^{⊺} Ω e \end{array} \end{matrix}$

When all errors are added together, instead of a single error, it is expressed as follows, and this is called the error function $E$ . In an actual optimization problem, we find $x^{*}$ that minimizes the error function $E$ rather than a single error $e_{i}$ .

$\begin{matrix} (21) & \begin{aligned} E (x) & = \sum_{i} e_{i}^{⊺} Ω_{i} e_{i} \\ x^{*} & = \arg \min_{x} E (x) \end{aligned} \end{matrix}$
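As a small illustration of (21), the sketch below sums $e_{i}^{⊺} Ω_{i} e_{i}$ over two made-up 2D errors. The function names (`weighted_sq_error`, `total_error`) and the numbers are assumptions chosen for the example, not taken from any SLAM library.

```python
def weighted_sq_error(e, omega):
    """Return e^T * Omega * e for an error vector e and information matrix Omega."""
    n = len(e)
    return sum(e[i] * omega[i][j] * e[j] for i in range(n) for j in range(n))


def total_error(errors, omegas):
    """Return E = sum_i e_i^T * Omega_i * e_i."""
    return sum(weighted_sq_error(e, om) for e, om in zip(errors, omegas))


# Two hypothetical 2D errors; the second one is trusted more (larger information).
errors = [[1.0, -2.0], [0.5, 0.0]]
omegas = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[4.0, 0.0], [0.0, 4.0]],
]
E = total_error(errors, omegas)
print(E)  # (1 + 4) + 4 * 0.25 = 6.0
```

A larger $Ω_{i}$ makes the same residual count more, which is how measurements with smaller covariance dominate the optimum.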

2.3. Non-linear least squares

The final optimization equation to be solved is:

$\begin{matrix} (22) & \begin{array}{r} x^{*} = \arg \min_{x} E (x) = \arg \min_{x} \sum_{i} e_{i}^{⊺} Ω_{i} e_{i} \end{array} \end{matrix}$

The above formula requires finding the optimal parameter $x^{*}$ that minimizes the error. However, it has no closed-form solution because in SLAM it usually includes non-linear terms for rotation. It must therefore be solved with non-linear optimization methods such as Gauss-Newton (GN) or Levenberg-Marquardt (LM). In many actual SLAM implementations, the information matrix $Ω_{i}$ is simply set to the identity matrix $I$ , so that $e_{i}^{⊺} e_{i}$ is minimized.

For example, suppose we solve the problem using the GN method. The order of solving the problem is as follows.

Define the error function.
Linearize it with a first-order Taylor expansion.
Differentiate and set the first derivative to 0.
Solve for the increment and apply it to the state.
Repeat until the values converge.

The error $e$ is, written out in full, $e (x)$ : its value depends on the robot's pose vector $x$ . The GN method iteratively finds the increment $Δ x$ that decreases the error and applies it to $x$ . The quantity to be minimized with respect to $Δ x$ is
$\begin{matrix} (23) & \begin{array}{r} e (x + Δ x)^{⊺} Ω e (x + Δ x) \end{array} \end{matrix}$

Expanding $e (x + Δ x)$ to first order around $x$ gives the following approximation.
$\begin{matrix} (24) & \begin{aligned} e (x + Δ x) |_{x} & \approx e (x) + J (x + Δ x - x) \\ & = e (x) + J Δ x \end{aligned} \end{matrix}$

Here $J = \frac{\partial e (x)}{\partial x}$ is the Jacobian of the error evaluated at the current $x$ . Applying this to the entire error function gives:
$\begin{matrix} (25) & \begin{array}{r} e (x + Δ x)^{⊺} Ω e (x + Δ x) \approx (e + J Δ x)^{⊺} Ω (e + J Δ x) \end{array} \end{matrix}$

Expanding the above expression gives:

$\begin{matrix} (26) & \begin{aligned} (e + J Δ x)^{⊺} Ω (e + J Δ x) & = \underbrace{e^{⊺} Ω e}_{c} + 2 \underbrace{e^{⊺} Ω J}_{b^{⊺}} Δ x + Δ x^{⊺} \underbrace{J^{⊺} Ω J}_{H} Δ x \\ & = c + 2 b^{⊺} Δ x + Δ x^{⊺} H Δ x \end{aligned} \end{matrix}$

Applying this to the total error gives:

$\begin{matrix} (27) & \begin{array}{r} E (x + Δ x) = \sum_{i} e_{i} (x + Δ x)^{⊺} Ω_{i} e_{i} (x + Δ x) \approx c + 2 b^{⊺} Δ x + Δ x^{⊺} H Δ x \end{array} \end{matrix}$

where $c = \sum_{i} e_{i}^{⊺} Ω_{i} e_{i}$ , $b = \sum_{i} J_{i}^{⊺} Ω_{i} e_{i}$ and $H = \sum_{i} J_{i}^{⊺} Ω_{i} J_{i}$ are now accumulated over all errors. Since $E (x + Δ x)$ is quadratic in $Δ x$ and $H = J^{⊺} Ω J$ is positive definite (assuming $J$ has full column rank), the $Δ x$ at which the first derivative of $E (x + Δ x)$ is zero is its minimizer.

$\begin{matrix} (28) & \begin{array}{r} \frac{\partial E (x + Δ x)}{\partial Δ x} \approx 2 b + 2 H Δ x = 0 \end{array} \end{matrix}$

Summarizing this, the following formula is derived:
$\begin{matrix} (29) & \begin{array}{r} H Δ x = - b \end{array} \end{matrix}$

The increment $Δ x = - H^{- 1} b$ obtained in this way is added to $x$ .

$\begin{matrix} (30) & \begin{array}{r} x \leftarrow x + Δ x \end{array} \end{matrix}$

The algorithm that repeats this process until convergence is called the Gauss-Newton method. The LM method follows the same overall process, but adds a damping factor $λ$ to the equation for the increment.

$\begin{matrix} (31) & \begin{aligned} (\text{GN}) \quad & H Δ x = - b \\ (\text{LM}) \quad & (H + λ I) Δ x = - b \end{aligned} \end{matrix}$
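The GN and LM updates in (31) can be sketched on a toy problem. The example below is an illustration under stated assumptions, not SLAM code: the state is a single scalar $a$ , the model is $\hat{z} = \exp (a x)$ , $Ω = 1$ , and the LM damping $λ$ is kept fixed, whereas practical LM implementations adapt it at every iteration. The function name `solve_scalar_nls` is hypothetical.

```python
import math


def solve_scalar_nls(xs, ys, a0, lam=0.0, iters=50, tol=1e-12):
    """Minimise sum_i e_i^2 with e_i = y_i - exp(a * x_i) over the scalar a.

    lam = 0 gives the Gauss-Newton step  H * da = -b;
    lam > 0 gives the damped step        (H + lam) * da = -b.
    """
    a = a0
    for _ in range(iters):
        H, b = 0.0, 0.0
        for x, y in zip(xs, ys):
            pred = math.exp(a * x)
            e = y - pred      # error e_i(a)
            J = -x * pred     # Jacobian de_i/da
            H += J * J        # H = sum J^T * Omega * J  (Omega = 1)
            b += J * e        # b = sum J^T * Omega * e
        da = -b / (H + lam)   # solve the (scalar) linear system
        a += da               # x <- x + dx
        if abs(da) < tol:
            break
    return a


xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [math.exp(0.7 * x) for x in xs]            # noise-free data, true a = 0.7

a_gn = solve_scalar_nls(xs, ys, a0=0.0)           # Gauss-Newton
a_lm = solve_scalar_nls(xs, ys, a0=0.0, lam=1.0)  # damped, LM-style step
```

Both runs should recover a value close to 0.7; the damped version takes slightly shorter steps, which trades speed for robustness when the linearization is poor.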

3. Reprojection error

Reprojection error is mainly used in feature-based Visual SLAM, typically when performing visual odometry (VO) or bundle adjustment (BA) with feature-based methods. For more information about BA, see the post [SLAM] Bundle Adjustment 개념 리뷰 (Bundle Adjustment concept review).

NOMENCLATURE of reprojection error

$\tilde{p} = π_{h} (\cdot)$
- Dehomogenization of the point $X^{'}$ in 3-dimensional space (division by its depth $Z^{'}$ ) for projection onto the image plane.
- $\tilde{p} : [\begin{matrix} X^{'} \\ Y^{'} \\ Z^{'} \\ 1 \end{matrix}] \to \frac{1}{Z^{'}} {\tilde{X}}^{'} = [\begin{matrix} X^{'} / Z^{'} \\ Y^{'} / Z^{'} \\ 1 \end{matrix}] = [\begin{matrix} \tilde{u} \\ \tilde{v} \\ 1 \end{matrix}]$
$\hat{p} = π_{k} (\cdot)$
- A point projected onto the image plane after correcting for lens distortion. If distortion correction has already been performed at the input stage, $π_{k} (\cdot) = \tilde{K} (\cdot)$ .
- $\hat{p} = \tilde{K} \tilde{p} = [\begin{matrix} f & 0 & c_{x} \\ 0 & f & c_{y} \end{matrix}] [\begin{matrix} \tilde{u} \\ \tilde{v} \\ 1 \end{matrix}] = [\begin{matrix} f \tilde{u} + c_{x} \\ f \tilde{v} + c_{y} \end{matrix}] = [\begin{matrix} u \\ v \end{matrix}]$
$K = [\begin{matrix} f & 0 & c_{x} \\ 0 & f & c_{y} \\ 0 & 0 & 1 \end{matrix}]$ : Camera intrinsic matrix
$\tilde{K} = [\begin{matrix} f & 0 & c_{x} \\ 0 & f & c_{y} \end{matrix}]$ : Intrinsic matrix with its last row omitted, so that it projects $P^{2} \to R^{2}$ .
$X = [T_{1}, \dots, T_{m}, X_{1}, \dots, X_{n}]^{⊺}$ : State variable of model
$m$ : number of camera poses
$n$ : number of 3D points
$T_{i} = [R_{i}, t_{i}]$
$e_{i j} = e_{i j} (X)$ : $X$ is sometimes omitted for brevity.
$p_{i j}$ : Pixel coordinates of the observed feature point
${\hat{p}}_{i j}$ : Pixel coordinates of the estimated feature point
$T_{i} X_{j}$ : Transform the 3D point $X_{j}$ into the camera coordinate system ${i}$ , $(T_{i} X_{j} = [\begin{matrix} R_{i} X_{j} + t_{i} \\ 1 \end{matrix}] \in R^{4 \times 1})$
- $X^{'} = TX = [X^{'}, Y^{'}, Z^{'}, 1]^{⊺} = [{\tilde{X}}^{'}, 1]^{⊺}$
$\oplus$ : Operator that can update the rotation matrix $R$ and the three-dimensional vectors $t$ and $X$ at once.
$J = \frac{\partial e}{\partial X} = \frac{\partial e}{\partial [T, X]}$
$w = {[\begin{matrix} w_{x} & w_{y} & w_{z} \end{matrix}]}^{⊺}$ : Angular velocity
$[w]_{\times} = [\begin{matrix} 0 & - w_{z} & w_{y} \\ w_{z} & 0 & - w_{x} \\ - w_{y} & w_{x} & 0 \end{matrix}]$ : Skew-symmetric matrix of angular velocity $w$

$X_{j}$ is projected onto the image plane through the following transformation.

$\begin{matrix} (32) & \begin{array}{r} \text{projection model:} \quad {\hat{p}}_{i j} = π (T_{i}, X_{j}) \end{array} \end{matrix}$

A model that uses the intrinsic/extrinsic parameters of the camera in this way is called a projection model. The reprojection error based on it is defined as follows.

$\begin{matrix} (33) & \begin{aligned} e_{i j} & = p_{i j} - {\hat{p}}_{i j} \\ & = p_{i j} - π (T_{i}, X_{j}) \\ & = p_{i j} - π_{k} (π_{h} (T_{i} X_{j})) \end{aligned} \end{matrix}$
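A minimal sketch of the projection model (32) and the reprojection error (33), assuming an undistorted pinhole camera with a single focal length $f$ as in the nomenclature above. The function names and the numeric values are hypothetical.

```python
def transform(R, t, X):
    """X' = R X + t (3x3 rotation R, translation t, 3D point X)."""
    return [sum(R[i][k] * X[k] for k in range(3)) + t[i] for i in range(3)]


def project(R, t, X, f, cx, cy):
    """p_hat = pi_k(pi_h(T X)) for a pinhole camera without distortion."""
    Xc, Yc, Zc = transform(R, t, X)
    u_n, v_n = Xc / Zc, Yc / Zc           # pi_h: divide by the depth Z'
    return [f * u_n + cx, f * v_n + cy]   # pi_k: apply the intrinsics


def reprojection_error(p_obs, R, t, X, f, cx, cy):
    """e = p - p_hat, a 2-vector in pixels."""
    p_hat = project(R, t, X, f, cx, cy)
    return [p_obs[0] - p_hat[0], p_obs[1] - p_hat[1]]


R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.1, 0.0, 0.0]
X = [0.4, 0.2, 2.0]
p_hat = project(R, t, X, f=500.0, cx=320.0, cy=240.0)                 # about [445, 290]
e = reprojection_error([447.0, 289.0], R, t, X, 500.0, 320.0, 240.0)  # about [2, -1]
```

The point must lie in front of the camera ( $Z^{'} > 0$ ); real pipelines usually discard observations that violate this before computing the error.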

The error function for all camera poses and 3D points is defined as

$\begin{matrix} (34) & \begin{aligned} E (X) & = \sum_{i} \sum_{j} {‖ e_{i j} ‖}^{2} \\ & = \sum_{i} \sum_{j} e_{i j}^{⊺} e_{i j} \\ & = \sum_{i} \sum_{j} (p_{i j} - {\hat{p}}_{i j})^{⊺} (p_{i j} - {\hat{p}}_{i j}) \\ X^{*} & = \arg \min_{X} E (X) \end{aligned} \end{matrix}$

The optimal state $X^{*}$ that minimizes $E (X)$ can be computed iteratively through non-linear least squares, by repeatedly updating $X$ with small increments $Δ X$ .
$\begin{matrix} (35) & \begin{aligned} Δ X^{*} & = \arg \min_{Δ X} E (X + Δ X) = \arg \min_{Δ X} \sum_{i} \sum_{j} {‖ e_{i j} (X + Δ X) ‖}^{2} \end{aligned} \end{matrix}$

Strictly speaking, since the state increment $Δ X$ includes an SO(3) rotation, it should be applied to the existing state $X$ through the $\oplus$ operator, but the $+$ operator is used for convenience of notation.

$\begin{matrix} (36) & \begin{array}{r} e (X \oplus Δ X) \to e (X + Δ X) \end{array} \end{matrix}$

The above equation can be expressed as follows through Taylor's first-order approximation.

$\begin{matrix} (37) & \begin{aligned} e (X + Δ X) & \approx e (X) + J Δ X \\ & = e (X) + J_{c} Δ T + J_{p} Δ X \\ & = e (X) + \frac{\partial e}{\partial T} Δ T + \frac{\partial e}{\partial X} Δ X \end{aligned} \end{matrix}$

$\begin{matrix} (38) & \begin{aligned} E (X + Δ X) & \approx \sum_{i} \sum_{j} {‖ e_{i j} (X) + J Δ X ‖}^{2} \end{aligned} \end{matrix}$

Differentiating this and setting the result to zero gives the optimal increment $Δ X^{*}$ as follows. The detailed derivation is omitted here; it is the same as the one in the previous section (Non-linear least squares).
$\begin{matrix} (39) & \begin{aligned} J^{⊺} J Δ X^{*} = - J^{⊺} e \\ H Δ X^{*} = - b \end{aligned} \end{matrix}$

Since the above equation is a linear system of the form $Ax = b$ , linear algebra techniques such as the Schur complement and Cholesky decomposition can be used to find $Δ X^{*}$ . Among the existing states, $t$ and $X$ live in a linear vector space, so it makes no difference whether the increment is added on the right or on the left. The rotation matrix $R$ , however, belongs to the non-linear group SO(3), so the side of the multiplication matters: right multiplication updates the pose as seen in the local coordinate system, while left multiplication updates the pose as seen in the global coordinate system. Since the reprojection error updates the transformation matrix in the global coordinate system, the left multiplication is usually used.

$\begin{matrix} (40) & \begin{array}{r} X \leftarrow X \oplus Δ X^{*} \end{array} \end{matrix}$

Since $X$ is composed of $[T, X]$ , it can be written as follows.

$\begin{matrix} (41) & \begin{aligned} T \leftarrow T & \oplus Δ T^{*} \\ X \leftarrow X & \oplus Δ X^{*} \end{aligned} \end{matrix}$

The definition of the left multiplication $\oplus$ operation is as follows.

$\begin{matrix} (42) & \begin{aligned} R \oplus Δ R^{*} & = Δ R^{*} R \\ & = \exp ([Δ w^{*}]_{\times}) R \quad \dots \text{globally updated (left mult)} \\ t \oplus Δ t^{*} & = t + Δ t^{*} \\ X \oplus Δ X^{*} & = X + Δ X^{*} \end{aligned} \end{matrix}$
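The left-multiplication update (42) can be sketched as follows. The exponential map is evaluated with Rodrigues' formula, a standard closed form that this post does not derive; the helper names (`skew`, `exp_so3`, `oplus`) are hypothetical.

```python
import math


def skew(w):
    """Skew-symmetric matrix [w]x of a 3-vector w."""
    wx, wy, wz = w
    return [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def exp_so3(w):
    """exp([w]x) via Rodrigues' formula: I + a [w]x + b [w]x^2."""
    theta = math.sqrt(sum(c * c for c in w))
    K = skew(w)
    K2 = matmul(K, K)
    if theta < 1e-8:                     # series limits for tiny angles
        a, b = 1.0, 0.5
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


def oplus(R, t, X, dw, dt, dX):
    """Apply (42): R <- exp([dw]x) R, t <- t + dt, X <- X + dX."""
    R_new = matmul(exp_so3(dw), R)
    t_new = [ti + di for ti, di in zip(t, dt)]
    X_new = [xi + di for xi, di in zip(X, dX)]
    return R_new, t_new, X_new


I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
R1, t1, X1 = oplus(I3, [0.0, 0.0, 0.0], [1.0, 2.0, 3.0],
                   dw=[0.0, 0.0, math.pi / 2], dt=[0.1, 0.0, 0.0],
                   dX=[0.0, 0.0, -1.0])
# R1 is a 90 degree rotation about the z axis.
```

Because the increment enters through the exponential map, the updated rotation is again a rotation matrix up to rounding, so no separate constraint check is needed after each step.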

3.1. Jacobian of the reprojection error

3.1.1. Jacobian of camera pose

The Jacobian $J_{c}$ for a pose can be decomposed as:
$\begin{matrix} (43) & \begin{aligned} J_{c} = \frac{\partial e}{\partial T} & = \frac{\partial}{\partial T} (p - \hat{p}) \\ & = \frac{\partial}{\partial T} (p - π_{k} (π_{h} (T_{i} X_{j}))) \\ & = \frac{\partial}{\partial T} (- π_{k} (π_{h} (T_{i} X_{j}))) \end{aligned} \end{matrix}$

Rearranging the above equation with the chain rule gives the following, where $T_{i} X_{j}$ is written as $X^{'}$ for convenience.

$\begin{matrix} (44) & \begin{aligned} J_{c} & = \frac{\partial \hat{p}}{\partial \tilde{p}} \frac{\partial \tilde{p}}{\partial X^{'}} \frac{\partial X^{'}}{\partial [w, t]} \\ & = R^{2 \times 3} \cdot R^{3 \times 4} \cdot R^{4 \times 6} = R^{2 \times 6} \end{aligned} \end{matrix}$

The reason for using the Jacobian $\frac{\partial X^{'}}{\partial w}$ with respect to the angular velocity $w$ instead of the Jacobian $\frac{\partial X^{'}}{\partial R}$ with respect to the rotation matrix $R$ is explained in the next section. In addition, the sign of $J_{c}$ flips depending on whether the error is defined as $p - \hat{p}$ or $\hat{p} - p$ , so care is needed when implementing the actual code. In this post the Jacobian is written with a $+$ sign, i.e. as the derivative of $\hat{p}$ .

Assuming that undistortion has already been performed during the image input process, $\frac{\partial \hat{p}}{\partial \tilde{p}}$ is as follows.
$\begin{matrix} (45) & \begin{aligned} \frac{\partial \hat{p}}{\partial \tilde{p}} & = \frac{\partial}{\partial \tilde{p}} \tilde{K} \tilde{p} \\ & = \tilde{K} \\ & = [\begin{matrix} f & 0 & c_{x} \\ 0 & f & c_{y} \end{matrix}] \in R^{2 \times 3} \end{aligned} \end{matrix}$

Next, $\frac{\partial \tilde{p}}{\partial X^{'}}$ is:
$\begin{matrix} (46) & \begin{aligned} \frac{\partial \tilde{p}}{\partial X^{'}} & = \frac{\partial [\tilde{u}, \tilde{v}, 1]}{\partial [X^{'}, Y^{'}, Z^{'}, 1]} \\ & = [\begin{matrix} \frac{1}{Z^{'}} & 0 & \frac{- X^{'}}{Z^{' 2}} & 0 \\ 0 & \frac{1}{Z^{'}} & \frac{- Y^{'}}{Z^{' 2}} & 0 \\ 0 & 0 & 0 & 0 \end{matrix}] \in R^{3 \times 4} \end{aligned} \end{matrix}$

Next, we need to find $\frac{\partial X^{'}}{\partial t}$ . This can be obtained relatively simply as follows.
$\begin{matrix} (47) & \begin{aligned} \frac{\partial X^{'}}{\partial t} & = \frac{\partial}{\partial [t_{x}, t_{y}, t_{z}]} [\begin{matrix} R X + t \\ 1 \end{matrix}] \\ & = \frac{\partial}{\partial [t_{x}, t_{y}, t_{z}]} [\begin{matrix} t \\ 0 \end{matrix}] \\ & = \frac{\partial}{\partial [t_{x}, t_{y}, t_{z}]} [\begin{matrix} t_{x} \\ t_{y} \\ t_{z} \\ 0 \end{matrix}] \\ & = [\begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{matrix}] \in R^{4 \times 3} \end{aligned} \end{matrix}$

3.1.2. Lie theory-based SO(3) optimization

Finally, we need to find $\frac{\partial X^{'}}{\partial w}$ . At this time, rotation-related parameters are expressed as angular velocity $w$ rather than rotation matrix $R$ . The rotation matrix $R$ is over-parameterized because the number of parameters is 9, but the actual rotation is limited to 3 degrees of freedom. Disadvantages of over-parameterized representation are as follows.

Since redundant parameters must be calculated, the amount of computation increases during optimization.
The additional degrees of freedom can lead to numerical instability problems.
Whenever a parameter is updated, it must always be checked whether the constraint is satisfied.

Lie theory allows optimization to be free from constraints. By using the Lie algebra so(3) element $w$ instead of the Lie group SO(3) element $R$ , parameters can be updated without having to enforce constraints.

$\begin{matrix} (48) & \begin{array}{r} J_{c} = \frac{\partial e}{\partial [R, t]} \to \frac{\partial e}{\partial [w, t]} \end{array} \end{matrix}$

However, since $w$ is not immediately visible in $X^{'}$ , $X^{'}$ must be expressed as lie algebra.

To find the Jacobian for the rotation-related term $w$ , let $X_{t}$ be the 3D point $X$ translated by $t$ , and let $X^{'}$ be the point obtained by rotating $X_{t}$ by $R$ .

$\begin{matrix} (49) & \begin{aligned} X_{t} & = X + t \\ X^{'} & = R X_{t} \\ & = \exp ([w]_{\times}) X_{t} \end{aligned} \end{matrix}$

$\exp ([w]_{\times}) \in S O (3)$ denotes the exponential mapping, the operation that transforms the angular velocity $w$ into a 3D rotation matrix $R$ . For more information on exponential mapping, refer to this link.

$\begin{matrix} (50) & \begin{array}{r} \exp ([w]_{\times}) = R \end{array} \end{matrix}$

There are two ways to apply a small Lie algebra increment $Δ w$ to the existing $\exp ([w]_{\times})$ : $[1]$ the basic Lie algebra update, and $[2]$ the update using the perturbation model.

$\begin{matrix} (51) & \begin{aligned} \exp ([w]_{\times}) & \leftarrow \exp ([w + Δ w]_{\times}) \dots [1] \\ \exp ([w]_{\times}) & \leftarrow \exp ([Δ w]_{\times}) \exp ([w]_{\times}) \dots [2] \end{aligned} \end{matrix}$

The relationship between the above two methods is as follows. For details, refer to chapter 4.3.3 of this link.

$\begin{matrix} (52) & \begin{aligned} \exp ([Δ w]_{\times}) \exp ([w]_{\times}) & \approx \exp ([w + J_{l}^{- 1} Δ w]_{\times}) \\ \exp ([w + Δ w]_{\times}) & \approx \exp ([J_{l} Δ w]_{\times}) \exp ([w]_{\times}) \end{aligned} \end{matrix}$

$[1]$ Lie algebra-based update: Directly calculating $\frac{\partial R X_{t}}{\partial w}$ with the $[1]$ method leads to the following rather complicated expression.

$\begin{matrix} (53) & \begin{aligned} \frac{\partial R X_{t}}{\partial w} & = \lim_{Δ w \to 0} \frac{\exp ([w + Δ w]_{\times}) X_{t} - \exp ([w]_{\times}) X_{t}}{Δ w} \\ & \approx \lim_{Δ w \to 0} \frac{\exp ([J_{l} Δ w]_{\times}) \exp ([w]_{\times}) X_{t} - \exp ([w]_{\times}) X_{t}}{Δ w} \\ & \approx \lim_{Δ w \to 0} \frac{(I + [J_{l} Δ w]_{\times}) \exp ([w]_{\times}) X_{t} - \exp ([w]_{\times}) X_{t}}{Δ w} \\ & = \lim_{Δ w \to 0} \frac{[J_{l} Δ w]_{\times} R X_{t}}{Δ w} \quad (∵ \exp ([w]_{\times}) X_{t} = R X_{t}) \\ & = \lim_{Δ w \to 0} \frac{- [R X_{t}]_{\times} J_{l} Δ w}{Δ w} \\ & = - [R X_{t}]_{\times} J_{l} \\ & = - [X^{'}]_{\times} J_{l} \end{aligned} \end{matrix}$

In the above equation, the second row uses the BCH approximation, which introduces the left Jacobian $J_{l}$ , and the third row applies the first-order Taylor approximation to the small rotation $\exp ([J_{l} Δ w]_{\times})$ . The fifth row uses the identity $[a]_{\times} b = - [b]_{\times} a$ . For more information about $J_{l}$ , refer to Chapter 4 of Introduction of Visual SLAM.

To understand the approximation in the third row, note that for an arbitrary rotation vector $w = [w_{x}, w_{y}, w_{z}]^{⊺}$ the rotation matrix can be expanded in exponential mapping form as follows.

$\begin{matrix} (54) & \begin{array}{r} R = \exp ([w]_{\times}) = I + [w]_{\times} + \frac{1}{2} [w]_{\times}^{2} + \frac{1}{3!} [w]_{\times}^{3} + \frac{1}{4!} [w]_{\times}^{4} + \dots \end{array} \end{matrix}$

For a small rotation matrix $Δ R$ , the terms of second order and higher can be ignored, which gives the following approximation.

$\begin{matrix} (55) & \begin{array}{r} Δ R \approx I + [Δ w]_{\times} \end{array} \end{matrix}$
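To get a feel for the first-order approximation (55), the sketch below sums the power series (54) directly and compares it with $I + [Δ w]_{\times}$ . Truncating the series after 20 terms and the helper names are assumptions made for the example.

```python
def skew(w):
    """Skew-symmetric matrix [w]x of a 3-vector w."""
    wx, wy, wz = w
    return [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def exp_series(w, terms=20):
    """Truncated power series (54): sum over n of [w]x^n / n!."""
    K = skew(w)
    result = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]      # current term, starts at I
    for n in range(1, terms):
        # term_n = term_(n-1) * [w]x / n, i.e. [w]x^n / n!
        term = [[v / n for v in row] for row in matmul(term, K)]
        result = [[result[i][j] + term[i][j] for j in range(3)]
                  for i in range(3)]
    return result


def first_order_error(w):
    """Largest entry of |exp([w]x) - (I + [w]x)|."""
    R = exp_series(w)
    K = skew(w)
    return max(abs(R[i][j] - ((1.0 if i == j else 0.0) + K[i][j]))
               for i in range(3) for j in range(3))


err_small = first_order_error([1e-3, 0.0, 0.0])   # about 5e-7
err_large = first_order_error([0.5, 0.0, 0.0])    # about 0.12
```

The error grows roughly with the square of the rotation angle, which is why the approximation is only used for the small increment $Δ w$ and never for the full rotation $w$ .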

$[2]$ Perturbation model-based update: To obtain a simpler Jacobian that does not involve $J_{l}$ , the $[2]$ perturbation model of the Lie algebra so(3) is generally used. The Jacobian $\frac{\partial R X_{t}}{\partial Δ w}$ is obtained using the perturbation model as follows:

$\begin{matrix} (56) & \begin{aligned} \frac{\partial R X_{t}}{\partial Δ w} & = \lim_{Δ w \to 0} \frac{\exp ([Δ w]_{\times}) \exp ([w]_{\times}) X_{t} - \exp ([w]_{\times}) X_{t}}{Δ w} \\ & \approx \lim_{Δ w \to 0} \frac{(I + [Δ w]_{\times}) \exp ([w]_{\times}) X_{t} - \exp ([w]_{\times}) X_{t}}{Δ w} \\ & = \lim_{Δ w \to 0} \frac{[Δ w]_{\times} R X_{t}}{Δ w} \quad (∵ \exp ([w]_{\times}) X_{t} = R X_{t}) \\ & = \lim_{Δ w \to 0} \frac{- [R X_{t}]_{\times} Δ w}{Δ w} \\ & = - [R X_{t}]_{\times} \\ & = - [X^{'}]_{\times} \end{aligned} \end{matrix}$

The above equation also uses the small-rotation approximation $\exp ([Δ w]_{\times}) \approx I + [Δ w]_{\times}$ in the second row. The $[2]$ perturbation model therefore has the advantage that the Jacobian is simply the negated skew-symmetric matrix of the point $X^{'}$ in 3D space. In reprojection error optimization, errors are mostly computed for feature points in sequentially incoming images, so the change in camera pose, and hence $Δ w$ , is small, and the Jacobian above is generally used. Because the $[2]$ method is used, a small increment $Δ w$ is applied to the existing rotation matrix $R$ as in (42).

$\begin{matrix} (57) & \begin{array}{r} R \leftarrow Δ R^{*} R \quad \text{where} \quad Δ R^{*} = \exp ([Δ w^{*}]_{\times}) \end{array} \end{matrix}$
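The result (56) can be checked numerically. The sketch below perturbs an already rotated point $X^{'}$ on the left with $\exp ([Δ w]_{\times})$ and compares central finite differences with $- [X^{'}]_{\times}$ . The step size, the test point and the use of Rodrigues' formula for the exponential map are assumptions made for the example.

```python
import math


def skew(w):
    """Skew-symmetric matrix [w]x of a 3-vector w."""
    wx, wy, wz = w
    return [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]


def exp_so3(w):
    """exp([w]x) via Rodrigues' formula (standard closed form)."""
    K = skew(w)
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    theta = math.sqrt(sum(c * c for c in w))
    a = math.sin(theta) / theta if theta > 1e-8 else 1.0
    b = (1.0 - math.cos(theta)) / theta ** 2 if theta > 1e-8 else 0.5
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


def matvec(A, x):
    """Matrix-vector product for nested-list matrices."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def numeric_jacobian(Xp, h=1e-6):
    """Central differences of f(dw) = exp([dw]x) X' around dw = 0 (3x3)."""
    cols = []
    for k in range(3):
        dw = [0.0, 0.0, 0.0]
        dw[k] = h
        plus = matvec(exp_so3(dw), Xp)
        dw[k] = -h
        minus = matvec(exp_so3(dw), Xp)
        cols.append([(p - m) / (2.0 * h) for p, m in zip(plus, minus)])
    return [[cols[k][i] for k in range(3)] for i in range(3)]  # columns -> rows


Xp = [0.5, -0.3, 2.0]                                  # hypothetical point X'
J_num = numeric_jacobian(Xp)
J_ana = [[-v for v in row] for row in skew(Xp)]        # -[X']x from (56)
max_diff = max(abs(J_num[i][j] - J_ana[i][j])
               for i in range(3) for j in range(3))
```

`max_diff` should be tiny compared with the entries of the Jacobian, confirming that no $J_{l}$ term is needed when the increment is applied by left multiplication.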

The original Jacobian $\frac{\partial X^{'}}{\partial [w, t]}$ is therefore replaced by $\frac{\partial X^{'}}{\partial [Δ w, t]}$ , which looks like this:

$\begin{matrix} (58) & \begin{aligned} \frac{\partial}{\partial [Δ w, t]} [\begin{matrix} RX + t \\ 1 \end{matrix}] & = [\begin{matrix} 0 & Z^{'} & - Y^{'} & 1 & 0 & 0 \\ - Z^{'} & 0 & X^{'} & 0 & 1 & 0 \\ Y^{'} & - X^{'} & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{matrix}] \in R^{4 \times 6} \end{aligned} \end{matrix}$

The final Jacobian $J_{c}$ for the pose is
$\begin{matrix} (59) & \begin{aligned} J_{c} & = \frac{\partial \hat{p}}{\partial \tilde{p}} \frac{\partial \tilde{p}}{\partial X^{'}} \frac{\partial X^{'}}{\partial [Δ w, t]} \\ & = [\begin{matrix} f & 0 & c_{x} \\ 0 & f & c_{y} \end{matrix}] [\begin{matrix} \frac{1}{Z^{'}} & 0 & \frac{- X^{'}}{Z^{' 2}} & 0 \\ 0 & \frac{1}{Z^{'}} & \frac{- Y^{'}}{Z^{' 2}} & 0 \\ 0 & 0 & 0 & 0 \end{matrix}] [\begin{matrix} 0 & Z^{'} & - Y^{'} & 1 & 0 & 0 \\ - Z^{'} & 0 & X^{'} & 0 & 1 & 0 \\ Y^{'} & - X^{'} & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{matrix}] \\ & = [\begin{matrix} \frac{f}{Z^{'}} & 0 & - \frac{f X^{'}}{Z^{' 2}} & 0 \\ 0 & \frac{f}{Z^{'}} & - \frac{f Y^{'}}{Z^{' 2}} & 0 \end{matrix}] [\begin{matrix} 0 & Z^{'} & - Y^{'} & 1 & 0 & 0 \\ - Z^{'} & 0 & X^{'} & 0 & 1 & 0 \\ Y^{'} & - X^{'} & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{matrix}] \\ & = [\begin{matrix} - \frac{f X^{'} Y^{'}}{Z^{' 2}} & f + \frac{f X^{' 2}}{Z^{' 2}} & - \frac{f Y^{'}}{Z^{'}} & \frac{f}{Z^{'}} & 0 & - \frac{f X^{'}}{Z^{' 2}} \\ - f - \frac{f Y^{' 2}}{Z^{' 2}} & \frac{f X^{'} Y^{'}}{Z^{' 2}} & \frac{f X^{'}}{Z^{'}} & 0 & \frac{f}{Z^{'}} & - \frac{f Y^{'}}{Z^{' 2}} \end{matrix}] \in R^{2 \times 6} \end{aligned} \end{matrix}$
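As a sanity check of (59), the sketch below compares the closed-form $2 \times 6$ Jacobian of $\hat{p}$ , obtained by multiplying out the three factors, with central finite differences. It assumes the increment acts on the transformed point as $X^{'} \leftarrow \exp ([Δ w]_{\times}) X^{'} + Δ t$ , which is the convention under which the $- [X^{'}]_{\times}$ and $I$ blocks of (58) hold; the function names and numbers are hypothetical. Negate the result when the error is defined as $p - \hat{p}$ .

```python
import math


def exp_so3(w):
    """exp([w]x) via Rodrigues' formula (standard closed form)."""
    wx, wy, wz = w
    K = [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    theta = math.sqrt(wx * wx + wy * wy + wz * wz)
    a = math.sin(theta) / theta if theta > 1e-8 else 1.0
    b = (1.0 - math.cos(theta)) / theta ** 2 if theta > 1e-8 else 0.5
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


def project(Xp, f, cx, cy):
    """Pinhole projection of a camera-frame point X' (no distortion)."""
    X, Y, Z = Xp
    return [f * X / Z + cx, f * Y / Z + cy]


def perturbed_pixel(Xp, delta, f, cx, cy):
    """Pixel of exp([dw]x) X' + dt for delta = [dw, dt]."""
    R = exp_so3(delta[:3])
    Xn = [sum(R[i][k] * Xp[k] for k in range(3)) + delta[3 + i]
          for i in range(3)]
    return project(Xn, f, cx, cy)


def analytic_jacobian(Xp, f):
    """Closed-form d p_hat / d [dw, t], columns ordered [wx wy wz tx ty tz]."""
    X, Y, Z = Xp
    return [
        [-f * X * Y / Z**2, f + f * X * X / Z**2, -f * Y / Z,
         f / Z, 0.0, -f * X / Z**2],
        [-f - f * Y * Y / Z**2, f * X * Y / Z**2, f * X / Z,
         0.0, f / Z, -f * Y / Z**2],
    ]


def numeric_jacobian(Xp, f, cx, cy, h=1e-6):
    """Central finite differences over the six increment components."""
    J = [[0.0] * 6 for _ in range(2)]
    for k in range(6):
        d = [0.0] * 6
        d[k] = h
        plus = perturbed_pixel(Xp, d, f, cx, cy)
        d[k] = -h
        minus = perturbed_pixel(Xp, d, f, cx, cy)
        for r in range(2):
            J[r][k] = (plus[r] - minus[r]) / (2.0 * h)
    return J


Xp = [0.5, -0.3, 2.0]                      # hypothetical camera-frame point
J_ana = analytic_jacobian(Xp, 500.0)
J_num = numeric_jacobian(Xp, 500.0, 320.0, 240.0)
max_diff = max(abs(J_ana[r][k] - J_num[r][k])
               for r in range(2) for k in range(6))
```

A check of this kind is a cheap way to catch sign and ordering mistakes (for example $[w, t]$ versus $[t, w]$ column order) before plugging a hand-derived Jacobian into an optimizer.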

3.1.3. Code implementations

g2o code: edge_project_xyz.cpp#L82

3.2. Jacobian of Map Point

The Jacobian $J_{p}$ for a 3D point $X$ can be obtained as follows.

$\begin{matrix} (60) & \begin{aligned} J_{p} = \frac{\partial e}{\partial X} & = \frac{\partial}{\partial X} (p - \hat{p}) \\ & = \frac{\partial}{\partial X} (p - π_{k} (π_{h} (T_{i} X_{j}))) \\ & = \frac{\partial}{\partial X} (- π_{k} (π_{h} (T_{i} X_{j}))) \end{aligned} \end{matrix}$

By rearranging the above equation using the chain rule, we get:
$\begin{matrix} (61) & \begin{aligned} J_{p} & = \frac{\partial \hat{p}}{\partial \tilde{p}} \frac{\partial \tilde{p}}{\partial X^{'}} \frac{\partial X^{'}}{\partial X} \\ & = R^{2 \times 3} \cdot R^{3 \times 4} \cdot R^{4 \times 3} = R^{2 \times 3} \end{aligned} \end{matrix}$

Of these, $\frac{\partial \hat{p}}{\partial \tilde{p}} \frac{\partial \tilde{p}}{\partial X^{'}}$ is the same as the Jacobian obtained earlier. So we only need to calculate $\frac{\partial X^{'}}{\partial X}$ , taken with respect to the three coordinates $[X, Y, Z]$ of the point.

$\begin{matrix} (62) & \begin{aligned} \frac{\partial X^{'}}{\partial X} & = \frac{\partial}{\partial X} [\begin{matrix} R X + t \\ 1 \end{matrix}] \\ & = [\begin{matrix} R \\ 0^{⊺} \end{matrix}] \in R^{4 \times 3} \end{aligned} \end{matrix}$

Therefore, $J_{p}$ is
$\begin{matrix} (63) & \begin{aligned} J_{p} & = [\begin{matrix} f & 0 & c_{x} \\ 0 & f & c_{y} \end{matrix}] [\begin{matrix} \frac{1}{Z^{'}} & 0 & \frac{- X^{'}}{Z^{' 2}} & 0 \\ 0 & \frac{1}{Z^{'}} & \frac{- Y^{'}}{Z^{' 2}} & 0 \\ 0 & 0 & 0 & 0 \end{matrix}] [\begin{matrix} R \\ 0^{⊺} \end{matrix}] \\ & = [\begin{matrix} \frac{f}{Z^{'}} & 0 & - \frac{f X^{'}}{Z^{' 2}} & 0 \\ 0 & \frac{f}{Z^{'}} & - \frac{f Y^{'}}{Z^{' 2}} & 0 \end{matrix}] [\begin{matrix} R \\ 0^{⊺} \end{matrix}] \in R^{2 \times 3} \end{aligned} \end{matrix}$

Since the last column of the first factor and the last row of the second factor are always 0, they are usually omitted to give the non-homogeneous form.

$\begin{matrix} (64) & \begin{aligned} J_{p} & = [\begin{matrix} \frac{f}{Z^{'}} & 0 & - \frac{f X^{'}}{Z^{' 2}} \\ 0 & \frac{f}{Z^{'}} & - \frac{f Y^{'}}{Z^{' 2}} \end{matrix}] R \in R^{2 \times 3} \end{aligned} \end{matrix}$
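The map-point Jacobian (64) can be checked against central finite differences. The sketch below uses a rotation about the z axis and made-up values for $t$ , $X$ and $f$ ; all names are hypothetical, and the Jacobian is that of $\hat{p}$ (negate it for $e = p - \hat{p}$ ).

```python
import math


def transform(R, t, X):
    """X' = R X + t."""
    return [sum(R[i][k] * X[k] for k in range(3)) + t[i] for i in range(3)]


def project(Xp, f, cx, cy):
    """Pinhole projection of a camera-frame point X' (no distortion)."""
    return [f * Xp[0] / Xp[2] + cx, f * Xp[1] / Xp[2] + cy]


def analytic_jp(R, t, X, f):
    """(64): [[f/Z', 0, -f X'/Z'^2], [0, f/Z', -f Y'/Z'^2]] times R."""
    Xc, Yc, Zc = transform(R, t, X)
    A = [[f / Zc, 0.0, -f * Xc / Zc**2],
         [0.0, f / Zc, -f * Yc / Zc**2]]
    return [[sum(A[r][k] * R[k][c] for k in range(3)) for c in range(3)]
            for r in range(2)]


def numeric_jp(R, t, X, f, cx, cy, h=1e-6):
    """Central finite differences of p_hat with respect to X, Y, Z."""
    J = [[0.0] * 3 for _ in range(2)]
    for c in range(3):
        X_plus, X_minus = list(X), list(X)
        X_plus[c] += h
        X_minus[c] -= h
        pp = project(transform(R, t, X_plus), f, cx, cy)
        pm = project(transform(R, t, X_minus), f, cx, cy)
        for r in range(2):
            J[r][c] = (pp[r] - pm[r]) / (2.0 * h)
    return J


ang = math.radians(30.0)
R = [[math.cos(ang), -math.sin(ang), 0.0],
     [math.sin(ang), math.cos(ang), 0.0],
     [0.0, 0.0, 1.0]]
t = [0.1, -0.2, 0.5]
X = [0.4, 0.2, 2.0]
J_ana = analytic_jp(R, t, X, 500.0)
J_num = numeric_jp(R, t, X, 500.0, 320.0, 240.0)
max_diff = max(abs(J_ana[r][c] - J_num[r][c])
               for r in range(2) for c in range(3))
```

Because the point enters the projection only through $R X + t$ , the rotation $R$ is the only pose-dependent factor in $J_{p}$ ; the translation affects it just through the value of $X^{'}$ .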

3.2.1. Code implementations

g2o code: edge_project_xyz.cpp#L80

4. Photometric error

Photometric error is mainly used in direct Visual SLAM, typically when performing direct method-based visual odometry (VO) or bundle adjustment (BA). For more information about direct methods, see the post [SLAM] Optical Flow와 Direct Method 개념 및 코드 리뷰 (Optical Flow and Direct Method concept and code review).

NOMENCLATURE of photometric error

${\tilde{p}}_{2} = π_{h} (\cdot)$
- Dehomogenization of the point $X^{'}$ in 3-dimensional space (division by its depth $Z^{'}$ ) for projection onto the image plane.
- ${\tilde{p}}_{2} : [\begin{matrix} X^{'} \\ Y^{'} \\ Z^{'} \\ 1 \end{matrix}] \to \frac{1}{Z^{'}} {\tilde{X}}^{'} = [\begin{matrix} X^{'} / Z^{'} \\ Y^{'} / Z^{'} \\ 1 \end{matrix}] = [\begin{matrix} {\tilde{u}}_{2} \\ {\tilde{v}}_{2} \\ 1 \end{matrix}]$
$p_{2} = π_{k} (\cdot)$
- Projection onto the image plane after correcting for lens distortion. If distortion correction has already been performed at the input stage, then $π_{k} (\cdot) = \tilde{K} (\cdot)$ .
- $p_{2} = \tilde{K} {\tilde{p}}_{2} = [\begin{matrix} f & 0 & c_{x} \\ 0 & f & c_{y} \end{matrix}] [\begin{matrix} {\tilde{u}}_{2} \\ {\tilde{v}}_{2} \\ 1 \end{matrix}] = [\begin{matrix} f {\tilde{u}}_{2} + c_{x} \\ f {\tilde{v}}_{2} + c_{y} \end{matrix}] = [\begin{matrix} u_{2} \\ v_{2} \end{matrix}]$
$K = [\begin{matrix} f & 0 & c_{x} \\ 0 & f & c_{y} \\ 0 & 0 & 1 \end{matrix}]$ : Camera intrinsic matrix
$\tilde{K} = [\begin{matrix} f & 0 & c_{x} \\ 0 & f & c_{y} \end{matrix}]$ : Intrinsic matrix with its last row omitted, so that it projects $P^{2} \to R^{2}$ .
$P$ : The set of all feature points in the image
$e (T) \to e$ : The argument $T$ is sometimes omitted for brevity.
$p_{1}^{i}, p_{2}^{i}$ : Pixel coordinates of the $i$ -th feature point in the first image and the second image
$\oplus$ : Operator for the composition of two SE(3) group elements
$J = \frac{\partial e}{\partial T} = \frac{\partial e}{\partial [R, t]}$
$X^{'} = [X^{'}, Y^{'}, Z^{'}, 1]^{⊺} = [{\tilde{X}}^{'}, 1]^{⊺} = TX$
$T X$ : Transform 3D point $X$ to camera coordinate system, $(T X = [\begin{matrix} R X + t \\ 1 \end{matrix}] \in R^{4 \times 1})$
$ξ = [w, v]^{⊺} = [w_{x}, w_{y}, w_{z}, v_{x}, v_{y}, v_{z}]^{⊺}$ : A vector of three-dimensional angular velocity and velocity. It's called a twist.
$[ξ]_{\times} = [\begin{matrix} [w]_{\times} & v \\ 0^{⊺} & 0 \end{matrix}] \in se (3)$ : The twist with the hat operator applied, an element of the Lie algebra se(3) (4x4 matrix)
$J_{l}$ : Jacobian for left multiplication. Since it is not used in actual calculations, it will not be introduced in detail.

In the figure above, the world coordinates of the 3D point $X$ are $[X, Y, Z, 1]^{⊺} \in P^{3}$ , and the corresponding pixel coordinates on the image planes of the two cameras are $p_{1}, p_{2} \in P^{2}$ . The internal parameters $K$ of the two cameras ${C_{1}}$ and ${C_{2}}$ are assumed to be the same. When camera ${C_{1}}$ is at the origin ( $R = I, t = 0$ ), the pixel coordinates $p_{1}, p_{2}$ are obtained by projecting the 3D point $X$ as follows.

$\begin{matrix} (65) & \begin{array}{r} p = π (T, X) \end{array} \end{matrix}$

$\begin{matrix} (66) & \begin{aligned} p_{1} = (\begin{array}{c} u_{1} \\ v_{1} \end{array}) & = π (I, X) = π_{k} (π_{h} (X)) \\ p_{2} = (\begin{array}{c} u_{2} \\ v_{2} \end{array}) & = π (T, X) = π_{k} (π_{h} (TX)) \end{aligned} \end{matrix}$
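The projection chain in ( $66$ ) can be sketched in a few lines of NumPy. This is only an illustration under assumed values: the function names `pi_h`, `pi_k`, `project` and all numbers are hypothetical and do not come from any of the referenced code bases.

```python
import numpy as np

def pi_h(X_h):
    """P^3 -> P^2: divide by depth to get normalized coordinates [u~, v~, 1]."""
    return np.array([X_h[0] / X_h[2], X_h[1] / X_h[2], 1.0])

def pi_k(p_tilde, f, cx, cy):
    """P^2 -> R^2: apply the intrinsic matrix with its last row omitted."""
    K_tilde = np.array([[f, 0.0, cx],
                        [0.0, f, cy]])
    return K_tilde @ p_tilde

def project(T, X_h, f, cx, cy):
    """p = pi(T, X) = pi_k(pi_h(T X)) for a homogeneous point X = [X, Y, Z, 1]."""
    return pi_k(pi_h(T @ X_h), f, cx, cy)

# Hypothetical setup: a point at depth 4 seen from C1, and a second camera
# whose transform T shifts the point to depth 5.
X = np.array([1.0, 2.0, 4.0, 1.0])
T = np.eye(4)
T[2, 3] = 1.0
p1 = project(np.eye(4), X, f=100.0, cx=50.0, cy=40.0)
p2 = project(T, X, f=100.0, cx=50.0, cy=40.0)
```

With these numbers, `p1` is the pixel in the first image and `p2` the pixel predicted in the second image from the current pose estimate.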

One characteristic of the direct method is that, unlike feature-based methods, there is no way to know which $p_{2}$ corresponds to $p_{1}$ . The position of $p_{2}$ is therefore computed from the current pose estimate, and the camera pose is optimized so that $p_{1}$ and $p_{2}$ become similar in brightness. This is done by minimizing the photometric error, which is defined as:

$\begin{matrix} (67) & \begin{aligned} e (T) & = I_{1} (p_{1}) - I_{2} (p_{2}) \\ = I_{1} (π_{k} (π_{h} (X))) - I_{2} (π_{k} (π_{h} (TX))) \end{aligned} \end{matrix}$

The photometric error is based on the assumption of grayscale invariance and is scalar. To solve non-linear least squares with photometric error, we can define the following error function $E (T)$ .

$\begin{matrix} (68) & \begin{aligned} E (T) & = \arg min_{T^{*}} \sum_{i \in P} {‖ e_{i} ‖}^{2} \\ = \arg min_{T^{*}} \sum_{i \in P} e_{i}^{⊺} e_{i} \\ = \arg min_{T^{*}} \sum_{i \in P} (I_{1} (p_{1}^{i}) - I_{2} (p_{2}^{i}))^{⊺} (I_{1} (p_{1}^{i}) - I_{2} (p_{2}^{i})) \end{aligned} \end{matrix}$

The optimal pose $T^{*}$ that minimizes $E (T)$ can be computed iteratively through non-linear least squares: the optimum is found by repeatedly updating $T$ with small increments $Δ T$ .
$\begin{matrix} (69) & \begin{aligned} E (T + Δ T) & = \arg min_{T^{*}} \sum_{i \in P} {‖ e_{i} (T + Δ T) ‖}^{2} \end{aligned} \end{matrix}$

Strictly speaking, since the state increment $Δ T$ is an SE(3) transformation matrix, it should be applied to the existing state $T$ through the $\oplus$ operator, but the $+$ operator is used for convenience of expression.

$\begin{matrix} (70) & \begin{array}{r} T \oplus Δ T \to T + Δ T \end{array} \end{matrix}$

This can be expressed as follows through the first-order Taylor approximation.
$\begin{matrix} (71) & \begin{aligned} e_{i} (T + Δ T) & \approx e_{i} (T) + J Δ T \\ & = e_{i} (T) + \frac{\partial e_{i}}{\partial T} Δ T \end{aligned} \end{matrix}$

$\begin{matrix} (72) & \begin{aligned} E (T + Δ T) & = \arg min_{T^{*}} \sum_{i \in P} {‖ e_{i} (T) + J Δ T ‖}^{2} \end{aligned} \end{matrix}$

Differentiating this and setting the result to zero gives the optimal increment $Δ T^{*}$ as follows. The detailed derivation is omitted here; it can be found in the previous section.
$\begin{matrix} (73) & \begin{aligned} J^{⊺} J Δ T^{*} = - J^{⊺} e \\ H Δ T^{*} = - b \end{aligned} \end{matrix}$

Since the above equation has the form of a linear system $Ax = b$ , linear algebra techniques such as the Schur complement and Cholesky decomposition can be used to find $Δ T^{*}$ . The optimal increment obtained in this way is added to the current state. Whether the pose is updated in the local coordinate system (right multiplication) or in the global coordinate system (left multiplication) depends on whether the increment is multiplied on the right or on the left of the existing $T$ . Since the photometric error updates the transformation matrix in the global coordinate system, the left multiplication method is generally used.
$\begin{matrix} (74) & \begin{array}{r} T \leftarrow T \oplus Δ T^{*} \end{array} \end{matrix}$

The definition of the left multiplication $\oplus$ operation is as follows.

$\begin{matrix} (75) & \begin{aligned} T \oplus Δ T^{*} & = Δ T^{*} T \\ & = \exp ([Δ ξ^{*}]_{\times}) T \quad \dots \text{globally updated (left mult)} \end{aligned} \end{matrix}$
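As a rough sketch of ( $73$ ) to ( $75$ ), the snippet below solves the normal equations with a Cholesky factorization and applies the left-multiplicative update. It assumes the twist ordering $[w, v]$ of this post, evaluates the exponential map with a truncated power series instead of a closed form, and uses random stand-ins for the stacked Jacobian and residuals; `hat6`, `exp_se3` and `gauss_newton_step` are hypothetical helper names.

```python
import numpy as np

def hat6(xi):
    """4x4 se(3) matrix [xi]_x of a twist xi = [w, v] (rotation part first)."""
    wx, wy, wz = xi[:3]
    M = np.zeros((4, 4))
    M[:3, :3] = [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]
    M[:3, 3] = xi[3:]
    return M

def exp_se3(xi, terms=30):
    """exp([xi]_x) by a truncated power series (adequate for moderate twists)."""
    A = hat6(xi)
    T, term = np.eye(4), np.eye(4)
    for k in range(1, terms):
        term = term @ A / k
        T = T + term
    return T

def gauss_newton_step(T, J, e):
    """Solve H dxi = -b via Cholesky, then update T <- exp([dxi]_x) T."""
    H = J.T @ J                      # 6x6 approximate Hessian
    b = J.T @ e                      # 6-vector
    L = np.linalg.cholesky(H)        # H = L L^T, requires H positive definite
    y = np.linalg.solve(L, -b)       # solve L y = -b
    dxi = np.linalg.solve(L.T, y)    # solve L^T dxi = y
    return exp_se3(dxi) @ T, dxi

rng = np.random.default_rng(0)
J = rng.normal(size=(50, 6))         # stacked 1x6 Jacobians (random stand-ins)
e = 0.01 * rng.normal(size=50)       # stacked scalar residuals (random stand-ins)
T_new, dxi = gauss_newton_step(np.eye(4), J, e)
```

In a real direct method, `J` and `e` would be rebuilt from the images at every iteration; here they only exercise the linear algebra.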

4.1. Jacobian of the photometric error

To perform ( $73$ ), we need to find the Jacobian $J$ for the photometric error. This can be expressed as:

$\begin{matrix} (76) & \begin{aligned} J & = \frac{\partial e}{\partial T} \\ = \frac{\partial e}{\partial [R, t]} \end{aligned} \end{matrix}$

Here's a closer look at this:

$\begin{matrix} (77) & \begin{aligned} J = \frac{\partial e}{\partial T} & = \frac{\partial}{\partial T} (I_{1} (p_{1}) - I_{2} (p_{2})) \\ = \frac{\partial}{\partial T} (I_{1} (π_{k} (π_{h} (X))) - I_{2} (π_{k} (π_{h} (TX)))) \\ = \frac{\partial}{\partial T} (- I_{2} (π_{k} (π_{h} (TX)))) \\ = \frac{\partial}{\partial T} (- I_{2} (π_{k} (π_{h} (X^{'})))) \end{aligned} \end{matrix}$

Re-expressing the above expression by applying the chain rule gives:

$\begin{matrix} (78) & \begin{aligned} \frac{\partial e}{\partial ξ} & = \frac{\partial I}{\partial p_{2}} \frac{\partial p_{2}}{\partial {\tilde{p}}_{2}} \frac{\partial {\tilde{p}}_{2}}{\partial X^{'}} \frac{\partial X^{'}}{\partial ξ} \\ = R^{1 \times 2} \cdot R^{2 \times 3} \cdot R^{3 \times 4} \cdot R^{4 \times 6} = R^{1 \times 6} \end{aligned} \end{matrix}$

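Note that the chain rule is written for the twist $ξ$ , using $\frac{\partial X^{'}}{\partial ξ}$ , rather than for the transformation matrix $T$ with $\frac{\partial X^{'}}{\partial T}$ . The reason is explained in the next section. Strictly, the minus sign in front of $I_{2}$ in ( $77$ ) carries through the chain rule; it is left out of the notation below and must be applied when $J$ is implemented for $e = I_{1} (p_{1}) - I_{2} (p_{2})$ . First, $\frac{\partial I}{\partial p_{2}}$ is the gradient of the image.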

$\begin{matrix} (79) & \begin{aligned} \frac{\partial I}{\partial p_{2}} & = [\begin{array}{c} \frac{\partial I}{\partial u} & \frac{\partial I}{\partial v} \end{array}] \\ = [\begin{array}{c} \nabla I_{u} & \nabla I_{v} \end{array}] \end{aligned} \end{matrix}$

Assuming that undistortion has already been performed during the image input process, $\frac{\partial p_{2}}{\partial {\tilde{p}}_{2}}$ is as follows.
$\begin{matrix} (80) & \begin{aligned} \frac{\partial p_{2}}{\partial {\tilde{p}}_{2}} & = \frac{\partial}{\partial {\tilde{p}}_{2}} \tilde{K} {\tilde{p}}_{2} \\ = \tilde{K} \\ = [\begin{array}{c} f & 0 & c_{x} \\ 0 & f & c_{y} \end{array}] \in R^{2 \times 3} \end{aligned} \end{matrix}$

Next, $\frac{\partial {\tilde{p}}_{2}}{\partial X^{'}}$ is:
$\begin{matrix} (81) & \begin{aligned} \frac{\partial {\tilde{p}}_{2}}{\partial X^{'}} & = \frac{\partial [{\tilde{u}}_{2}, {\tilde{v}}_{2}, 1]}{\partial [X^{'}, Y^{'}, Z^{'}, 1]} \\ = [\begin{array}{c} \frac{1}{Z^{'}} & 0 & \frac{- X^{'}}{Z^{' 2}} & 0 \\ 0 & \frac{1}{Z^{'}} & \frac{- Y^{'}}{Z^{' 2}} & 0 \\ 0 & 0 & 0 & 0 \end{array}] \in R^{3 \times 4} \end{aligned} \end{matrix}$

4.1.1. Lie theory-based SE(3) optimization

The last factor is $\frac{\partial X^{'}}{\partial T} = \frac{\partial X^{'}}{\partial [R, t]}$ . The position term $t$ is a 3D vector, and its size equals the 3 degrees of freedom that are the minimum needed to express a 3D position, so no separate constraint is required when performing the optimization update. The rotation matrix $R$ , on the other hand, has 9 parameters, more than the 3 degrees of freedom needed to express a 3D rotation, so it is subject to constraints ( $R^{⊺} R = I$ , $\det R = 1$ ). Such a representation is said to be over-parameterized. The disadvantages of an over-parameterized representation are as follows.

Since redundant parameters must be calculated, the amount of computation increases during optimization.
The additional degrees of freedom can lead to numerical instability problems.
Whenever a parameter is updated, it must always be checked whether the constraint is satisfied.

Therefore, a Lie theory-based optimization method, which uses a minimal, constraint-free parameterization, is generally used. Instead of computing $Δ T^{*}$ , which contains a nonlinear rotation matrix, the Lie group SE(3)-based method replaces the rotation term $R \to w$ and the position term $t \to v$ , finds the optimal Lie algebra se(3) increment $Δ ξ^{*}$ , and then updates the SE(3) pose through the exponential map.

$\begin{matrix} (82) & \begin{aligned} Δ T^{*} & \to Δ ξ^{*} \end{aligned} \end{matrix}$

The Jacobian for $ξ$ is

$\begin{matrix} (83) & \begin{aligned} J & = \frac{\partial e}{\partial [R, t]} & \to \frac{\partial e}{\partial [w, v]} \\ \to \frac{\partial e}{\partial ξ} \end{aligned} \end{matrix}$

Through this, the existing expression is changed as follows.

$\begin{matrix} (84) & \begin{aligned} e (T) & \to e (ξ) \\ E (T) & \to E (ξ) \\ e (T) + J^{'} Δ T & \to e (ξ) + J Δ ξ \\ H Δ T^{*} = - b & \to H Δ ξ^{*} = - b \\ T \leftarrow Δ T^{*} T & \to T \leftarrow \exp ([Δ ξ^{*}]_{\times}) T \end{aligned} \end{matrix}$

- $J^{'} = \frac{\partial e}{\partial T}$

- $J = \frac{\partial e}{\partial ξ}$

$\exp ([ξ]_{\times}) \in SE (3)$ refers to an operation that transforms twist $ξ$ into a 3D pose by exponential mapping. For more information on exponential mapping, refer to this link.

$\begin{matrix} (85) & \begin{array}{r} \exp ([Δ ξ]_{\times}) = Δ T \end{array} \end{matrix}$

The Jacobians so far were easy to compute. For $\frac{\partial X^{'}}{\partial ξ}$ , however, the parameter $ξ$ does not appear explicitly in $X^{'}$ , so $X^{'}$ must first be rewritten in terms of the Lie algebra.

$\begin{matrix} (86) & \begin{array}{r} X^{'} \to TX \to \exp ([ξ]_{\times}) X \end{array} \end{matrix}$

There are two ways to apply a small Lie algebra increment $Δ ξ$ to the existing $\exp ([ξ]_{\times})$ : [1] an update that adds the increment directly in the Lie algebra, and [2] an update using a perturbation model.

$\begin{matrix} (87) & \begin{aligned} \exp ([ξ]_{\times}) & \leftarrow \exp ([ξ + Δ ξ]_{\times}) \dots [1] \\ \exp ([ξ]_{\times}) & \leftarrow \exp ([Δ ξ]_{\times}) \exp ([ξ]_{\times}) \dots [2] \end{aligned} \end{matrix}$

Method $[1]$ adds the small increment $Δ ξ$ to the existing $ξ$ and then differentiates through the exponential map to obtain the Jacobian. Method $[2]$ updates the existing state by multiplying the perturbation $\exp ([Δ ξ]_{\times})$ on the left of $\exp ([ξ]_{\times})$ and takes the Jacobian with respect to $Δ ξ$ . The two methods are related by the following conversion, known as the BCH approximation. For details, see Chapter 4 of Introduction to Visual SLAM.

$\begin{matrix} (88) & \begin{aligned} \exp ([Δ ξ]_{\times}) \exp ([ξ]_{\times}) & = \exp ([ξ + J_{l}^{- 1} Δ ξ]_{\times}) \\ \exp ([ξ + Δ ξ]_{\times}) & = \exp ([J_{l} Δ ξ]_{\times}) \exp ([ξ]_{\times}) \end{aligned} \end{matrix}$

Since using the $[1]$ method leads to very complex equations, this method is not often used, and the perturbation model of $[2]$ is mainly used. Therefore, $\frac{\partial X^{'}}{\partial ξ}$ is transformed as follows.

$\begin{matrix} (89) & \begin{aligned} \frac{\partial X^{'}}{\partial ξ} & \to \frac{\partial X^{'}}{\partial Δ ξ} \end{aligned} \end{matrix}$

The Jacobian for $\frac{\partial X^{'}}{\partial Δ ξ}$ can be calculated as

$\begin{matrix} (90) & \begin{aligned} \frac{\partial X^{'}}{\partial Δ ξ} & = lim_{Δ ξ \to 0} \frac{\exp ([Δ ξ]_{\times}) X^{'} - X^{'}}{Δ ξ} \\ \approx lim_{Δ ξ \to 0} \frac{(I + [Δ ξ]_{\times}) X^{'} - X^{'}}{Δ ξ} \\ = lim_{Δ ξ \to 0} \frac{[Δ ξ]_{\times} X^{'}}{Δ ξ} \\ = lim_{Δ ξ \to 0} \frac{[\begin{array}{c} [Δ w]_{\times} & Δ v \\ 0^{⊺} & 0 \end{array}] [\begin{array}{c} {\tilde{X}}^{'} \\ 1 \end{array}]}{Δ ξ} \\ = lim_{Δ ξ \to 0} \frac{[\begin{array}{c} [Δ w]_{\times} {\tilde{X}}^{'} + Δ v \\ 0^{⊺} \end{array}]}{[Δ w, Δ v]^{⊺}} = [\begin{array}{c} - [{\tilde{X}}^{'}]_{\times} & I \\ 0^{⊺} & 0^{⊺} \end{array}] \in R^{4 \times 6} \end{aligned} \end{matrix}$

Therefore, the $[2]$ perturbation model has the advantage that the Jacobian can be obtained simply from the skew-symmetric matrix of the 3D point $X^{'}$ . In photometric error optimization, the error is usually computed between sequentially incoming images, so the change in camera pose, and hence $Δ ξ$ , is small, and the above Jacobian is generally used. Since the $[2]$ perturbation model is used, the small increment $Δ ξ^{*}$ is applied as in ( $75$ ).

$\begin{matrix} (91) & \begin{array}{r} T \leftarrow Δ T^{*} T = \exp ([Δ ξ^{*}]_{\times}) T \end{array} \end{matrix}$

In ( $90$ ), the second row applies the first-order Taylor approximation to the small twist increment $\exp ([Δ ξ]_{\times})$ . To understand this approximation, expand the transformation matrix $T$ for an arbitrary twist $ξ = [w, v]^{⊺}$ in exponential-map form:

$\begin{matrix} (92) & \begin{aligned} T = \exp ([ξ]_{\times}) & = I + [\begin{array}{c} [w]_{\times} & v \\ 0^{⊺} & 0 \end{array}] + \frac{1}{2!} [\begin{array}{c} [w]_{\times}^{2} & [w]_{\times} v \\ 0^{⊺} & 0 \end{array}] + \frac{1}{3!} [\begin{array}{c} [w]_{\times}^{3} & [w]_{\times}^{2} v \\ 0^{⊺} & 0 \end{array}] + \dots \\ = I + [ξ]_{\times} + \frac{1}{2!} [ξ]_{\times}^{2} + \frac{1}{3!} [ξ]_{\times}^{3} + \dots \end{aligned} \end{matrix}$

For a small twist increment $Δ ξ$ , the second- and higher-order terms can be ignored, which gives the following approximation.
$\begin{matrix} (93) & \begin{array}{r} \exp ([Δ ξ]_{\times}) \approx I + [Δ ξ]_{\times} \end{array} \end{matrix}$
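A quick numerical check of ( $93$ ) is to confirm that the error of the first-order approximation shrinks quadratically with the size of the twist. The sketch below assumes the $[w, v]$ ordering of this post and a power-series exponential; the helper names and the test twist are hypothetical.

```python
import numpy as np

def hat6(xi):
    """4x4 se(3) matrix [xi]_x of a twist xi = [w, v] (rotation part first)."""
    wx, wy, wz = xi[:3]
    M = np.zeros((4, 4))
    M[:3, :3] = [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]
    M[:3, 3] = xi[3:]
    return M

def exp_se3(xi, terms=30):
    """exp([xi]_x) by a truncated power series."""
    A = hat6(xi)
    T, term = np.eye(4), np.eye(4)
    for k in range(1, terms):
        term = term @ A / k
        T = T + term
    return T

def first_order_error(xi):
    """Frobenius norm of exp([xi]_x) - (I + [xi]_x)."""
    return np.linalg.norm(exp_se3(xi) - (np.eye(4) + hat6(xi)))

xi = np.array([0.3, -0.2, 0.5, 1.0, 0.4, -0.7])   # arbitrary twist direction
err_a = first_order_error(1e-2 * xi)
err_b = first_order_error(1e-3 * xi)
ratio = err_a / err_b   # close to 100 when the neglected terms are second order
```

Shrinking the twist by a factor of 10 reduces the error by roughly a factor of 100, which is why the approximation is reasonable for the small increments of sequential frames.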

Finally, the Jacobian $J$ for the pose is:

$\begin{matrix} (94) & \begin{aligned} J = \frac{\partial e}{\partial Δ ξ} & = \frac{\partial I}{\partial p_{2}} \frac{\partial p_{2}}{\partial {\tilde{p}}_{2}} \frac{\partial {\tilde{p}}_{2}}{\partial X^{'}} \frac{\partial X^{'}}{\partial Δ ξ} \\ & = [\begin{array}{c} \nabla I_{u} & \nabla I_{v} \end{array}] [\begin{array}{c} f & 0 & c_{x} \\ 0 & f & c_{y} \end{array}] [\begin{array}{c} \frac{1}{Z^{'}} & 0 & \frac{- X^{'}}{Z^{' 2}} & 0 \\ 0 & \frac{1}{Z^{'}} & \frac{- Y^{'}}{Z^{' 2}} & 0 \\ 0 & 0 & 0 & 0 \end{array}] [\begin{array}{c} - [{\tilde{X}}^{'}]_{\times} & I \\ 0^{⊺} & 0^{⊺} \end{array}] \\ & = [\begin{array}{c} \nabla I_{u} & \nabla I_{v} \end{array}] [\begin{array}{c} \frac{f}{Z^{'}} & 0 & - \frac{f X^{'}}{Z^{' 2}} & 0 \\ 0 & \frac{f}{Z^{'}} & - \frac{f Y^{'}}{Z^{' 2}} & 0 \end{array}] [\begin{array}{c} 0 & Z^{'} & - Y^{'} & 1 & 0 & 0 \\ - Z^{'} & 0 & X^{'} & 0 & 1 & 0 \\ Y^{'} & - X^{'} & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}] \\ & = [\begin{array}{c} \nabla I_{u} & \nabla I_{v} \end{array}] [\begin{array}{c} - \frac{f X^{'} Y^{'}}{Z^{' 2}} & f + \frac{f X^{' 2}}{Z^{' 2}} & - \frac{f Y^{'}}{Z^{'}} & \frac{f}{Z^{'}} & 0 & - \frac{f X^{'}}{Z^{' 2}} \\ - f - \frac{f Y^{' 2}}{Z^{' 2}} & \frac{f X^{'} Y^{'}}{Z^{' 2}} & \frac{f X^{'}}{Z^{'}} & 0 & \frac{f}{Z^{'}} & - \frac{f Y^{'}}{Z^{' 2}} \end{array}] \in R^{1 \times 6} \end{aligned} \end{matrix}$

Since the last row of $\frac{\partial X^{'}}{\partial Δ ξ}$ is always 0, it is sometimes omitted in the calculation.
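The pixel part of the Jacobian, i.e. the $2 \times 6$ product of the last three factors, can be checked against central finite differences of the projection under a left perturbation $\exp ([Δ ξ]_{\times}) T$ . This is a minimal sketch assuming the $[w, v]$ twist ordering of this post; the pose, point, intrinsics and image gradient are made-up values, and the helper names are hypothetical.

```python
import numpy as np

def exp_se3(xi, terms=30):
    """exp([xi]_x) for xi = [w, v], by a truncated power series."""
    wx, wy, wz = xi[:3]
    A = np.zeros((4, 4))
    A[:3, :3] = [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]
    A[:3, 3] = xi[3:]
    T, term = np.eye(4), np.eye(4)
    for k in range(1, terms):
        term = term @ A / k
        T = T + term
    return T

def project(T, X_h, f, cx, cy):
    """Pixel coordinates of the homogeneous point X_h seen through pose T."""
    Xc = T @ X_h
    return np.array([f * Xc[0] / Xc[2] + cx, f * Xc[1] / Xc[2] + cy])

def pixel_jacobian(Xc, f):
    """Analytic d p2 / d(dxi) for a camera-frame point Xc = [X', Y', Z', 1]."""
    X, Y, Z = Xc[:3]
    return np.array([
        [-f * X * Y / Z**2, f + f * X**2 / Z**2, -f * Y / Z, f / Z, 0.0, -f * X / Z**2],
        [-f - f * Y**2 / Z**2, f * X * Y / Z**2, f * X / Z, 0.0, f / Z, -f * Y / Z**2],
    ])

def numeric_pixel_jacobian(T, X_h, f, cx, cy, h=1e-6):
    """Central differences with respect to a left perturbation of T."""
    J = np.zeros((2, 6))
    for k in range(6):
        d = np.zeros(6)
        d[k] = h
        p_plus = project(exp_se3(d) @ T, X_h, f, cx, cy)
        p_minus = project(exp_se3(-d) @ T, X_h, f, cx, cy)
        J[:, k] = (p_plus - p_minus) / (2.0 * h)
    return J

T = exp_se3(np.array([0.1, -0.2, 0.05, 0.3, -0.1, 0.2]))   # hypothetical pose
X_h = np.array([0.5, -0.4, 3.0, 1.0])                      # hypothetical 3D point
J_analytic = pixel_jacobian(T @ X_h, f=500.0)
J_numeric = numeric_pixel_jacobian(T, X_h, 500.0, 320.0, 240.0)

grad = np.array([2.0, -1.0])    # hypothetical image gradient [dI/du, dI/dv]
J_photo = grad @ J_analytic     # 1x6 row; its overall sign follows the definition of e
```

The principal point does not appear in `pixel_jacobian` because the third row of $\frac{\partial {\tilde{p}}_{2}}{\partial X^{'}}$ is zero, so $c_{x}, c_{y}$ drop out of the product.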

4.1.2. Code implementations

Code of Chapter 8 of Introduction to Visual SLAM: direct_sparse.cpp#L111
DSO Code: CoarseInitializer.cpp#L430
DSO Code2: CoarseTracker.cpp#L320

5. Relative pose error (PGO)

Relative pose error is an error mainly used in pose graph optimization (PGO). For more information on PGO, refer to [SLAM] Pose Graph Optimization 개념 설명 및 예제 코드 분석 post.

NOMENCLATURE of relative pose error

$(Node) x_{i} = [\begin{matrix} R_{i} & t_{i} \\ 0^{⊺} & 1 \end{matrix}] \in R^{4 \times 4}$
$(Edge) z_{i j} = [\begin{matrix} R_{i j} & t_{i j} \\ 0^{⊺} & 1 \end{matrix}] \in R^{4 \times 4}$
${\hat{z}}_{i j} = x_{i}^{- 1} x_{j}$ : predicted value
$z_{i j}$ : virtual measurement
$x = [x_{1}, \dots, x_{n}]$ : All pose nodes in pose graph
$e_{i j} (x_{i}, x_{j}) \leftrightarrow e_{i j}$ : For brevity, the arguments are sometimes omitted.
$J = \frac{\partial e}{\partial x}$
$\oplus$ : Operator for composing two SE(3) elements
$Log (\cdot)$ : Operator that converts SE(3) to twist $ξ \in R^{6}$ . For more information on logarithm mapping, refer to this post.

Given two nodes $x_{i}$ and $x_{j}$ on the pose graph, the relative pose error is defined as the difference between the relative pose $z_{i j}$ (observed value) newly computed from sensor data and the previously known relative pose ${\hat{z}}_{i j}$ (predicted value). (figure: Freiburg univ. Robot Mapping Course).

$\begin{matrix} (95) & \begin{array}{r} e_{i j} (x_{i}, x_{j}) = z_{i j}^{- 1} {\hat{z}}_{i j} = z_{i j}^{- 1} x_{i}^{- 1} x_{j} \end{array} \end{matrix}$
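The sketch below evaluates the relative pose error in its twist form $Log (z_{i j}^{- 1} x_{i}^{- 1} x_{j})$ , the form listed as (3) in the introduction. It assumes the $[w, v]$ ordering of this post and a closed-form logarithm that is valid for rotation angles below $π$ ; all helper names and poses are hypothetical.

```python
import numpy as np

def hat3(w):
    return np.array([[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]])

def exp_se3(xi, terms=30):
    """exp([xi]_x) for xi = [w, v], by a truncated power series."""
    A = np.zeros((4, 4))
    A[:3, :3], A[:3, 3] = hat3(xi[:3]), xi[3:]
    T, term = np.eye(4), np.eye(4)
    for k in range(1, terms):
        term = term @ A / k
        T = T + term
    return T

def log_se3(T):
    """Log: SE(3) -> twist [w, v], for rotation angles below pi."""
    R, t = T[:3, :3], T[:3, 3]
    vee = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    th = np.arctan2(0.5 * np.linalg.norm(vee), 0.5 * (np.trace(R) - 1.0))
    w = 0.5 * vee if th < 1e-10 else th / (2.0 * np.sin(th)) * vee
    if th < 1e-2:
        c = 1.0 / 12.0 + th**2 / 720.0      # series expansion near zero
    else:
        c = (1.0 - th * np.sin(th) / (2.0 * (1.0 - np.cos(th)))) / th**2
    Wx = hat3(w)
    V_inv = np.eye(3) - 0.5 * Wx + c * Wx @ Wx
    return np.concatenate([w, V_inv @ t])

def rel_pose_error(z, x_i, x_j):
    """e_ij = Log(z_ij^{-1} x_i^{-1} x_j) as a 6-vector."""
    return log_se3(np.linalg.inv(z) @ np.linalg.inv(x_i) @ x_j)

x_i = exp_se3(np.array([0.2, -0.1, 0.3, 1.0, 0.0, 0.5]))
x_j = exp_se3(np.array([-0.1, 0.4, 0.2, 2.0, 1.0, 0.0]))
z_hat = np.linalg.inv(x_i) @ x_j                    # predicted relative pose
eta = np.array([0.01, -0.02, 0.03, 0.1, 0.0, -0.05])
z = z_hat @ np.linalg.inv(exp_se3(eta))             # measurement offset by a known twist

e_consistent = rel_pose_error(z_hat, x_i, x_j)      # zero when z equals the prediction
e_offset = rel_pose_error(z, x_i, x_j)              # recovers eta
```

When the measurement equals the prediction the error vanishes, which is the sequential-odometry case; a loop-closure measurement that disagrees with the prediction produces a non-zero twist.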

The process of optimizing the relative pose error is called pose graph optimization (PGO) and is also known as the back-end algorithm of graph-based SLAM. For nodes $x_{i}, x_{i + 1}, \dots$ computed sequentially by front-end visual odometry (VO) or lidar odometry (LO), the observed and predicted values are identical, so PGO is not performed. When loop closing occurs and an edge is connected between two non-sequential nodes $x_{i}, x_{j}$ , a difference arises between the observed and predicted values, and PGO is performed.

That is, PGO is generally performed when a special situation such as loop closing occurs. When the robot revisits the same place while moving, the loop detection algorithm checks for a loop. If a loop is detected, the existing node $x_{i}$ and the node $x_{j}$ created by the revisit are connected by a loop edge, and a matching algorithm (GICP, NDT, etc.) is used to create the observation. These observations are called virtual measurements because they are produced by a matching algorithm rather than measured directly.

The relative pose error for all nodes on the pose graph can be defined as follows.

$\begin{matrix} (96) & \begin{aligned} E (x) & = \arg min_{x^{*}} \sum_{i} \sum_{j} {‖ e_{i j} ‖}^{2} \\ = \arg min_{x^{*}} \sum_{i} \sum_{j} e_{i j}^{⊺} e_{i j} \end{aligned} \end{matrix}$

The optimal state $x^{*}$ that minimizes $E (x)$ can be computed iteratively through non-linear least squares: the optimum is found by repeatedly updating $x$ with small increments $Δ x$ .
$\begin{matrix} (97) & \begin{aligned} E (x + Δ x) & = \arg min_{x^{*}} \sum_{i} \sum_{j} {‖ e_{i j} (x_{i} + Δ x_{i}, x_{j} + Δ x_{j}) ‖}^{2} \end{aligned} \end{matrix}$

Strictly speaking, since the state increment $Δ x$ is an SE(3) transformation matrix, it should be applied to the existing state $x$ through the $\oplus$ operator, but the $+$ operator is used for convenience of expression.

$\begin{matrix} (98) & \begin{array}{r} e_{i j} (x_{i} \oplus Δ x_{i}, x_{j} \oplus Δ x_{j}) \to e_{i j} (x_{i} + Δ x_{i}, x_{j} + Δ x_{j}) \end{array} \end{matrix}$

The above equation can be expressed as follows through Taylor's first-order approximation.
$\begin{matrix} (99) & \begin{aligned} e_{i j} (x_{i} + Δ x_{i}, x_{j} + Δ x_{j}) & \approx e_{i j} (x_{i}, x_{j}) + J_{i j} [\begin{array}{c} Δ x_{i} \\ Δ x_{j} \end{array}] \\ & = e_{i j} (x_{i}, x_{j}) + J_{i} Δ x_{i} + J_{j} Δ x_{j} \\ & = e_{i j} (x_{i}, x_{j}) + \frac{\partial e_{i j}}{\partial x_{i}} Δ x_{i} + \frac{\partial e_{i j}}{\partial x_{j}} Δ x_{j} \end{aligned} \end{matrix}$

$\begin{matrix} (100) & \begin{aligned} E (x + Δ x) & \approx \arg min_{x^{*}} \sum_{i} \sum_{j} {‖ e_{i j} (x_{i}, x_{j}) + J_{i j} [\begin{array}{c} Δ x_{i} \\ Δ x_{j} \end{array}] ‖}^{2} \end{aligned} \end{matrix}$

Differentiating this and setting the result to zero gives the optimal increment $Δ x^{*}$ for all nodes as follows. The detailed derivation is omitted here; it can be found in the previous section.
$\begin{matrix} (101) & \begin{aligned} J^{⊺} J Δ x^{*} = - J^{⊺} e \\ H Δ x^{*} = - b \end{aligned} \end{matrix}$

Since the above equation has the form of a linear system $Ax = b$ , linear algebra techniques such as the Schur complement and Cholesky decomposition can be used to find $Δ x^{*}$ . The optimal increment obtained in this way is added to the current state. Whether the pose is updated in the local coordinate system (right multiplication) or in the global coordinate system (left multiplication) depends on whether the increment is multiplied on the right or on the left of the existing $x$ . Since the relative pose error concerns the relative pose of two nodes, a right multiplication that updates in the local coordinate system is applied.
$\begin{matrix} (102) & \begin{array}{r} x \leftarrow x \oplus Δ x^{*} \end{array} \end{matrix}$

The definition of the right multiplication $\oplus$ operation is as follows.

$\begin{matrix} (103) & \begin{aligned} x \oplus Δ x^{*} & = x Δ x^{*} \\ & = x \exp ([Δ ξ^{*}]_{\times}) \quad \dots \text{locally updated (right mult)} \end{aligned} \end{matrix}$

5.1. Jacobian of relative pose error

To perform ( $101$ ), the Jacobian $J$ of the relative pose error must be obtained. Given two non-sequential nodes $x_{i}, x_{j}$ , the Jacobian $J_{i j}$ can be expressed as follows.

$\begin{matrix} (104) & \begin{aligned} J_{i j} & = \frac{\partial e_{i j}}{\partial x_{i j}} \\ = \frac{\partial e_{i j}}{\partial [x_{i}, x_{j}]} \\ = [J_{i}, J_{j}] \end{aligned} \end{matrix}$

Here's a closer look at this:

$\begin{matrix} (105) & \begin{aligned} J_{i j} = \frac{\partial e_{i j}}{\partial [x_{i}, x_{j}]} & = \frac{\partial}{\partial [x_{i}, x_{j}]} (z_{i j}^{- 1} {\hat{z}}_{i j}) \\ = \frac{\partial}{\partial [R_{i}, t_{i}, R_{j}, t_{j}]} (z_{i j}^{- 1} {\hat{z}}_{i j}) \end{aligned} \end{matrix}$

5.1.1. Lie theory-based SE(3) optimization

When obtaining the above Jacobian, the position term $t$ is a 3D vector, and its size equals the 3 degrees of freedom that are the minimum needed to express a 3D position, so no separate constraint exists. The rotation matrix $R$ , on the other hand, has 9 parameters, more than the 3 degrees of freedom needed to express a 3D rotation, so it is subject to constraints. Such a representation is said to be over-parameterized. The disadvantages of an over-parameterized representation are as follows.

Since redundant parameters must be calculated, the amount of computation increases during optimization.
The additional degrees of freedom can lead to numerical instability problems.
Whenever a parameter is updated, it must always be checked whether the constraint is satisfied.

Therefore, as with the photometric error, a Lie theory-based optimization with a minimal parameterization is used: the increments are optimized as twists and mapped back to SE(3) through the exponential map.

$\begin{matrix} (106) & \begin{array}{r} [\begin{array}{c} Δ x_{i}^{*}, Δ x_{j}^{*} \end{array}] \to [Δ ξ_{i}^{*}, Δ ξ_{j}^{*}] \end{array} \end{matrix}$

The Jacobian for $ξ$ is

$\begin{matrix} (107) & \begin{aligned} J_{i j} & = \frac{\partial e_{i j}}{\partial [x_{i}, x_{j}]} & \to \frac{\partial e_{i j}}{\partial [ξ_{i}, ξ_{j}]} \end{aligned} \end{matrix}$

Through this, the existing expression is changed as follows.

$\begin{matrix} (108) & \begin{aligned} e_{i j} (x_{i}, x_{j}) & \to e_{i j} (ξ_{i}, ξ_{j}) \\ E (x) & \to E (ξ) \\ e_{i j} (x_{i}, x_{j}) + J_{i}^{'} Δ x_{i} + J_{j}^{'} Δ x_{j} & \to e_{i j} (ξ_{i}, ξ_{j}) + J_{i} Δ ξ_{i} + J_{j} Δ ξ_{j} \\ H Δ x^{*} = - b & \to H Δ ξ^{*} = - b \\ x \leftarrow Δ x^{*} x & \to x \leftarrow \exp ([Δ ξ^{*}]_{\times}) x \end{aligned} \end{matrix}$

- $J_{i j}^{'} = \frac{\partial e}{\partial [x_{i}, x_{j}]}$

- $J_{i j} = \frac{\partial e}{\partial [ξ_{i}, ξ_{j}]}$

$\exp ([ξ]_{\times}) \in SE (3)$ refers to an operation that transforms twist $ξ$ into a 3D pose by exponential mapping. For more information on exponential mapping, refer to this link.

$\begin{matrix} (109) & \begin{array}{r} \exp ([Δ ξ]_{\times}) = Δ x \end{array} \end{matrix}$

Since the parameter $ξ$ does not appear explicitly in $z_{i j}^{- 1} {\hat{z}}_{i j}$ , the expression must be rewritten in terms of the Lie algebra before $\frac{\partial}{\partial ξ} (z_{i j}^{- 1} {\hat{z}}_{i j})$ can be computed.

$\begin{matrix} (110) & \begin{aligned} z_{i j}^{- 1} {\hat{z}}_{i j} & \to Log (z_{i j}^{- 1} {\hat{z}}_{i j}) \end{aligned} \end{matrix}$

Here, $Log (\cdot)$ is the logarithm map from SE(3) to a twist $ξ \in R^{6}$ . For more information on logarithm mapping, refer to this post. The SE(3) version of the relative pose error $e_{i j} (ξ_{i}, ξ_{j})$ therefore becomes:

$\begin{matrix} (111) & \begin{aligned} e_{i j} (ξ_{i}, ξ_{j}) & = Log (z_{i j}^{- 1} {\hat{z}}_{i j}) \end{aligned} \end{matrix}$

Here's how to break it down in detail:

$\begin{matrix} (112) & \begin{aligned} e_{i j} (ξ_{i}, ξ_{j}) & = Log (z_{i j}^{- 1} {\hat{z}}_{i j}) \\ = Log (z_{i j}^{- 1} x_{i}^{- 1} x_{j}) \\ = Log (\exp ([- ξ_{i j}]_{\times}) \exp ([- ξ_{i}]_{\times}) \exp ([ξ_{j}]_{\times})) \end{aligned} \end{matrix}$

In the above expression, the parameters $ξ_{i}$ and $ξ_{j}$ inside ${\hat{z}}_{i j}$ enter through exponential maps. Applying the left perturbation model to the second line of the above equation, i.e. perturbing each node as $\exp ([Δ ξ]_{\times}) x$ , gives the following expression for the perturbed error.

$\begin{matrix} (113) & \begin{aligned} e_{i j} (ξ_{i} + Δ ξ_{i}, ξ_{j} + Δ ξ_{j}) & = Log (z_{i j}^{- 1} x_{i}^{- 1} \exp (- [Δ ξ_{i}]_{\times}) \exp ([Δ ξ_{j}]_{\times}) x_{j}) \end{aligned} \end{matrix}$

In the above equation, the increment terms must be moved to the left or right so that the expression takes the form $e + J Δ ξ$ . To do this, the following property of the adjoint matrix of SE(3) is used. For more information on the adjoint matrix, refer to this post.

$\begin{matrix} (114) & \begin{array}{r} \exp ([{Ad}_{T} \cdot ξ]_{\times}) = T \cdot \exp ([ξ]_{\times}) \cdot T^{- 1} \end{array} \end{matrix}$

Substituting $T \to T^{- 1}$ in the expression above gives:

$\begin{matrix} (115) & \begin{array}{r} \exp ([{Ad}_{T^{- 1}} \cdot ξ]_{\times}) = T^{- 1} \cdot \exp ([ξ]_{\times}) \cdot T \end{array} \end{matrix}$

And rearranging, we get the following formula:

$\begin{matrix} (116) & \begin{array}{r} \exp ([ξ]_{\times}) \cdot T = T \exp ([{Ad}_{T^{- 1}} \cdot ξ]_{\times}) \end{array} \end{matrix}$
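The identity ( $116$ ) is easy to verify numerically. The sketch below assumes the $[w, v]$ twist ordering of this post, for which the adjoint takes the block form $[\begin{matrix} R & 0 \\ [t]_{\times} R & R \end{matrix}]$ ; this block layout is derived from that ordering and is an assumption of the example, as are the helper names and test values.

```python
import numpy as np

def hat3(w):
    return np.array([[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]])

def exp_se3(xi, terms=30):
    """exp([xi]_x) for xi = [w, v], by a truncated power series."""
    A = np.zeros((4, 4))
    A[:3, :3], A[:3, 3] = hat3(xi[:3]), xi[3:]
    T, term = np.eye(4), np.eye(4)
    for k in range(1, terms):
        term = term @ A / k
        T = T + term
    return T

def adjoint(T):
    """6x6 adjoint Ad_T acting on twists ordered as [w, v]."""
    R, t = T[:3, :3], T[:3, 3]
    Ad = np.zeros((6, 6))
    Ad[:3, :3] = R
    Ad[3:, :3] = hat3(t) @ R
    Ad[3:, 3:] = R
    return Ad

T = exp_se3(np.array([0.3, -0.2, 0.4, 1.0, -0.5, 0.7]))   # arbitrary pose
xi = np.array([0.05, 0.1, -0.08, 0.2, -0.1, 0.3])         # arbitrary twist

lhs = exp_se3(xi) @ T                                     # exp([xi]_x) T
rhs = T @ exp_se3(adjoint(np.linalg.inv(T)) @ xi)         # T exp([Ad_{T^-1} xi]_x)
```

Both sides agree to numerical precision, so an increment on the left of $T$ can be exchanged for an adjoint-mapped increment on the right.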

( $116$ ) is used to move the $\exp (\cdot) \exp (\cdot)$ terms in the middle of ( $113$ ) to the right or to the left. This post moves them to the right. Expanding separately for $Δ ξ_{i}$ and $Δ ξ_{j}$ gives the following.

$\begin{matrix} (117) & \begin{aligned} e_{i j} (ξ_{i} + Δ ξ_{i}, ξ_{j}) & = Log (z_{i j}^{- 1} x_{i}^{- 1} \exp (- [Δ ξ_{i}]_{\times}) x_{j}) \\ & = Log (z_{i j}^{- 1} x_{i}^{- 1} x_{j} \exp ([- {Ad}_{x_{j}^{- 1}} Δ ξ_{i}]_{\times})) \dots [1] \\ e_{i j} (ξ_{i}, ξ_{j} + Δ ξ_{j}) & = Log (z_{i j}^{- 1} x_{i}^{- 1} \exp ([Δ ξ_{j}]_{\times}) x_{j}) \\ & = Log (z_{i j}^{- 1} x_{i}^{- 1} x_{j} \exp ([{Ad}_{x_{j}^{- 1}} Δ ξ_{j}]_{\times})) \dots [2] \end{aligned} \end{matrix}$

In order to simplify this expression, $[1]$ and $[2]$ are respectively expressed as follows.

$\begin{matrix} (118) & \begin{array}{r} Log (\exp ([a]_{\times}) \exp ([b]_{\times})) \dots [1] \\ Log (\exp ([a]_{\times}) \exp ([c]_{\times})) \dots [2] \end{array} \end{matrix}$

$\exp ([a]_{\times}) = z_{i j}^{- 1} x_{i}^{- 1} x_{j}$ : Transformation matrix expressed as an exponential.
- According to the definition of ( $111$ ) above, $a = e_{i j} (ξ_{i}, ξ_{j})$ .
$b = - {Ad}_{x_{j}^{- 1}} Δ ξ_{i}$
$c = {Ad}_{x_{j}^{- 1}} Δ ξ_{j}$

The above equation can be rearranged using the right-hand BCH approximation. The right BCH approximation is

$\begin{matrix} (119) & \begin{aligned} \exp ([ξ]_{\times}) \exp ([Δ ξ]_{\times}) & = \exp ([ξ + J_{r}^{- 1} Δ ξ]_{\times}) \\ \exp ([ξ + Δ ξ]_{\times}) & = \exp ([ξ]_{\times}) \exp ([J_{r} Δ ξ]_{\times}) \end{aligned} \end{matrix}$

For more information, please refer to Chapter 4 of Introduction to Visual SLAM. By using the BCH approximation, ( $118$ ) is summarized as follows.

$\begin{matrix} (120) & \begin{aligned} Log (\exp ([a]_{\times}) \exp ([b]_{\times})) & = Log (\exp ([a + J_{r}^{- 1} b]_{\times})) \\ = a + J_{r}^{- 1} b \dots [1] \\ Log (\exp ([a]_{\times}) \exp ([c]_{\times})) & = Log (\exp ([a + J_{r}^{- 1} c]_{\times})) \\ = a + J_{r}^{- 1} c \dots [2] \end{aligned} \end{matrix}$

Finally, substituting back and rewriting in terms of $Δ ξ_{i}$ and $Δ ξ_{j}$ gives the SE(3) version of ( $99$ ).

$\begin{matrix} (121) & \begin{aligned} e_{i j} (ξ_{i} + Δ ξ_{i}, ξ_{j} + Δ ξ_{j}) & = a + J_{r}^{- 1} b + J_{r}^{- 1} c \\ = e_{i j} (ξ_{i}, ξ_{j}) - J_{r}^{- 1} {Ad}_{x_{j}^{- 1}} Δ ξ_{i} + J_{r}^{- 1} {Ad}_{x_{j}^{- 1}} Δ ξ_{j} \\ = e_{i j} (ξ_{i}, ξ_{j}) + \frac{\partial e_{i j}}{\partial Δ ξ_{i}} Δ ξ_{i} + \frac{\partial e_{i j}}{\partial Δ ξ_{j}} Δ ξ_{j} \end{aligned} \end{matrix}$

Therefore, the SE(3) version Jacobian of the final relative pose error is as follows.

$\begin{matrix} (122) & \begin{aligned} \frac{\partial e_{i j}}{\partial Δ ξ_{i}} = - J_{r}^{- 1} {Ad}_{x_{j}^{- 1}} \in R^{6 \times 6} \\ \frac{\partial e_{i j}}{\partial Δ ξ_{j}} = J_{r}^{- 1} {Ad}_{x_{j}^{- 1}} \in R^{6 \times 6} \end{aligned} \end{matrix}$
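The Jacobians in ( $122$ ) can be sanity-checked with finite differences. The sketch below uses a measurement that agrees exactly with the prediction, so that $e_{i j} = 0$ and $J_{r}^{- 1} = I_{6}$ , and compares central differences under left perturbations with $\mp {Ad}_{x_{j}^{- 1}}$ . It assumes the $[w, v]$ twist ordering of this post; helper names and poses are hypothetical.

```python
import numpy as np

def hat3(w):
    return np.array([[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]])

def exp_se3(xi, terms=30):
    A = np.zeros((4, 4))
    A[:3, :3], A[:3, 3] = hat3(xi[:3]), xi[3:]
    T, term = np.eye(4), np.eye(4)
    for k in range(1, terms):
        term = term @ A / k
        T = T + term
    return T

def log_se3(T):
    """Log: SE(3) -> twist [w, v], for rotation angles below pi."""
    R, t = T[:3, :3], T[:3, 3]
    vee = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    th = np.arctan2(0.5 * np.linalg.norm(vee), 0.5 * (np.trace(R) - 1.0))
    w = 0.5 * vee if th < 1e-10 else th / (2.0 * np.sin(th)) * vee
    if th < 1e-2:
        c = 1.0 / 12.0 + th**2 / 720.0
    else:
        c = (1.0 - th * np.sin(th) / (2.0 * (1.0 - np.cos(th)))) / th**2
    Wx = hat3(w)
    return np.concatenate([w, (np.eye(3) - 0.5 * Wx + c * Wx @ Wx) @ t])

def adjoint(T):
    """6x6 adjoint for twists ordered as [w, v]."""
    R, t = T[:3, :3], T[:3, 3]
    Ad = np.zeros((6, 6))
    Ad[:3, :3], Ad[3:, :3], Ad[3:, 3:] = R, hat3(t) @ R, R
    return Ad

def rel_pose_error(z, x_i, x_j):
    return log_se3(np.linalg.inv(z) @ np.linalg.inv(x_i) @ x_j)

def numeric_jacobians(z, x_i, x_j, h=1e-6):
    """Central differences w.r.t. left perturbations exp([d]_x) x of each node."""
    J_i, J_j = np.zeros((6, 6)), np.zeros((6, 6))
    for k in range(6):
        d = np.zeros(6)
        d[k] = h
        P, M = exp_se3(d), exp_se3(-d)
        J_i[:, k] = (rel_pose_error(z, P @ x_i, x_j) - rel_pose_error(z, M @ x_i, x_j)) / (2 * h)
        J_j[:, k] = (rel_pose_error(z, x_i, P @ x_j) - rel_pose_error(z, x_i, M @ x_j)) / (2 * h)
    return J_i, J_j

x_i = exp_se3(np.array([0.2, -0.1, 0.3, 1.0, 0.0, 0.5]))
x_j = exp_se3(np.array([-0.1, 0.4, 0.2, 2.0, 1.0, 0.0]))
z = np.linalg.inv(x_i) @ x_j           # consistent measurement, so e_ij = 0
J_i_num, J_j_num = numeric_jacobians(z, x_i, x_j)
Ad = adjoint(np.linalg.inv(x_j))       # Ad_{x_j^{-1}}
```

With a non-zero error the factor $J_{r}^{- 1}$ no longer equals the identity, and the comparison would have to include it.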

Since $J_{r}^{- 1}$ is a complicated expression, it is generally approximated as follows, or simply set to $I_{6}$ . Here $w$ and $v$ are the rotation and translation parts of the error twist $e_{i j}$ , and the block layout is written for the $[w, v]$ ordering used in this post.

$\begin{matrix} (123) & \begin{array}{r} J_{r}^{- 1} \approx I_{6} + \frac{1}{2} [\begin{array}{c} [w]_{\times} & 0 \\ [v]_{\times} & [w]_{\times} \end{array}] \in R^{6 \times 6} \end{array} \end{matrix}$

Setting $J_{r}^{- 1} = I_{6}$ reduces the amount of computation, but the optimization performs slightly better when the approximated Jacobian above is used. For more information, please refer to Chapter 11 of Introduction to Visual SLAM.

5.1.2. Code implementations

g2o code: edge_se3_expmap.cpp#L55
- In the g2o code above, the error is $e_{i j} = x_{j}^{- 1} z_{i j} x_{i}$ , which makes the Jacobian slightly different from the description above.
- $\frac{\partial e_{i j}}{\partial Δ ξ_{i}} = J_{l}^{- 1} {Ad}_{x_{j}^{- 1} z_{i j}}$
- $\frac{\partial e_{i j}}{\partial Δ ξ_{j}} = - J_{r}^{- 1} {Ad}_{x_{i}^{- 1} z_{i j}^{- 1}}$
- This corresponds to rearranging ( $117$ ) by moving the $Δ ξ_{i}$ terms to the left and the $Δ ξ_{j}$ terms to the right, and then merging them.
- The code also appears to use the approximations $J_{l}^{- 1} \approx I_{6}, J_{r}^{- 1} \approx I_{6}$ . The Jacobians actually implemented are therefore as follows.
  - $\frac{\partial e_{i j}}{\partial Δ ξ_{i}} \approx {Ad}_{x_{j}^{- 1} z_{i j}}$
  - $\frac{\partial e_{i j}}{\partial Δ ξ_{j}} \approx - {Ad}_{x_{i}^{- 1} z_{i j}^{- 1}}$

6. Line reprojection error

Line reprojection error is the error used when optimizing a line in 3D space expressed in Plücker coordinates. For more information on Plücker coordinates, see the Plücker Coordinate 개념 정리 post.

NOMENCLATURE of line reprojection error

$T_{c w} \in R^{6 \times 6}$ : Transformation matrix of Plücker lines
$K_{L}$ : line intrinsic matrix
$U \in S O (3)$ : Rotation matrix of a 3D line
$W \in S O (2)$ : A matrix containing information on the distance of the 3D line from the origin
$\boldsymbol{θ} \in R^{3}$ : Parameter corresponding to rotation matrix SO(3)
$θ \in R$ : Parameter corresponding to rotation matrix SO(2)
$u_{i}$ : $i$-th column vector of $U$
$X = [δ_{θ}, δ_{ξ}]$ : State variable
$δ_{θ} = [\boldsymbol{θ}^{⊺}, θ] \in R^{4}$ : State variables in orthonormal representation
$δ_{ξ} = [δ ξ] \in s e (3)$ : For the update method through Lie theory, refer to this link
$\oplus$ : An operator that can update the state variable $δ_{θ}, δ_{ξ}$ in one step.
$J = \frac{\partial e_{l}}{\partial X} = \frac{\partial e_{l}}{\partial [δ_{θ}, δ_{ξ}]}$

A line in 3D space can be expressed as a 6D column vector using Plücker coordinates.

$\begin{matrix} (124) & \begin{aligned} L & = [m^{⊺} : d^{⊺}]^{⊺} = [m_{x} : m_{y} : m_{z} : d_{x} : d_{y} : d_{z}]^{⊺} \end{aligned} \end{matrix}$

Unlike the $[d : m]$ order described above, most papers using Plücker coordinates use the $[m : d]$ order, so this section also expresses lines in that order. Because this line representation has scale ambiguity, it has 5 degrees of freedom: even if $m$ and $d$ are not unit vectors, a line is uniquely determined by the ratio of the values of the two vectors.

6.1. Line Transformation and projection

If $L_{w}$ is a line viewed from the world coordinate system, it can be converted as follows when viewed from the camera coordinate system.

$\begin{matrix} (125) & \begin{aligned} L_{c} & = [\begin{array}{c} m_{c} \\ d_{c} \end{array}] = T_{c w} L_{w} = [\begin{array}{c} R_{c w} & t^{\land} R_{c w} \\ 0 & R_{c w} \end{array}] [\begin{array}{c} m_{w} \\ d_{w} \end{array}] \end{aligned} \end{matrix}$

- $T_{c w} \in R^{6 \times 6}$ :Transformation matrix for the Plücker line
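The transformation can be checked numerically. The sketch below builds the $6 \times 6$ matrix from an example $R_{cw}, t$ and compares $T_{c w} L_{w}$ with the Plücker line rebuilt from camera-frame points. The pose values, the moment convention $m = p \times d$, and the helper names are assumptions made for this illustration.

```python
import numpy as np

def skew(v):
    """Return v^ such that skew(v) @ a == np.cross(v, a)."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def plucker_transform(R, t):
    """6x6 matrix [[R, t^ R], [0, R]] acting on L = [m : d]."""
    T = np.zeros((6, 6))
    T[:3, :3] = R
    T[:3, 3:] = skew(t) @ R
    T[3:, 3:] = R
    return T

# Example world-to-camera pose (arbitrary values)
R_cw = rot_z(0.4) @ rot_x(-0.3)
t_cw = np.array([0.2, -0.1, 0.5])

# World line through p1_w and p2_w
p1_w, p2_w = np.array([1.0, 0.0, 2.0]), np.array([1.0, 3.0, 2.5])
L_w = np.hstack([np.cross(p1_w, p2_w - p1_w), p2_w - p1_w])

# Route 1: transform the Plücker vector directly
L_c_via_T = plucker_transform(R_cw, t_cw) @ L_w

# Route 2: transform the points, then rebuild the line in the camera frame
p1_c, p2_c = R_cw @ p1_w + t_cw, R_cw @ p2_w + t_cw
L_c_direct = np.hstack([np.cross(p1_c, p2_c - p1_c), p2_c - p1_c])

print(np.max(np.abs(L_c_via_T - L_c_direct)))
```

Both routes agree up to floating-point error, which is a convenient sanity test when implementing $T_{c w}$.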

Projecting the line onto the image plane gives:

$\begin{matrix} (126) & \begin{aligned} l_{c} & = [\begin{array}{c} l_{1} \\ l_{2} \\ l_{3} \end{array}] = K_{L} m_{c} = [\begin{array}{ccc} f_{y} & 0 & 0 \\ 0 & f_{x} & 0 \\ - f_{y} c_{x} & - f_{x} c_{y} & f_{x} f_{y} \end{array}] [\begin{array}{c} m_{x} \\ m_{y} \\ m_{z} \end{array}] \end{aligned} \end{matrix}$

- $K_{L}$ : line intrinsic matrix

$K_{L}$ is obtained by specializing the general line projection matrix $[det (N) N^{- ⊺} | n^{\land} N]$ of a camera $P = [N | n]$ to the camera $P = K [I | 0]$ . With $N = K$ and $n = 0$ , the line projection matrix becomes $[det (K) K^{- ⊺} | 0]$ , so the $d$ term of $L$ drops out and only $m$ is projected. Therefore, when $K = [\begin{array}{ccc} f_{x} & 0 & c_{x} \\ 0 & f_{y} & c_{y} \\ 0 & 0 & 1 \end{array}]$ , the following expression is derived.

$\begin{matrix} (127) & \begin{array}{r} K_{L} = det (K) K^{- ⊺} = [\begin{array}{ccc} f_{y} & 0 & 0 \\ 0 & f_{x} & 0 \\ - f_{y} c_{x} & - f_{x} c_{y} & f_{x} f_{y} \end{array}] \in R^{3 \times 3} \end{array} \end{matrix}$
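The closed form of $K_{L}$ can be verified numerically. The sketch below uses made-up intrinsics, compares $det (K) K^{- ⊺}$ with the closed-form matrix, and checks that the projections of two points on the 3D line lie on the projected line $l_{c} = K_{L} m_{c}$ . All numeric values are assumptions for the example.

```python
import numpy as np

# Example pinhole intrinsics (arbitrary values)
fx, fy, cx, cy = 500.0, 480.0, 320.0, 240.0
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

# Line intrinsic matrix, computed two ways
K_L = np.linalg.det(K) * np.linalg.inv(K).T
K_L_closed = np.array([[fy, 0.0, 0.0],
                       [0.0, fx, 0.0],
                       [-fy * cx, -fx * cy, fx * fy]])

# Two camera-frame points on a 3D line, and the moment m_c = p x d
p1 = np.array([0.5, -0.2, 4.0])
p2 = np.array([-1.0, 0.3, 6.0])
m_c = np.cross(p1, p2 - p1)

# Projected line and projected points (homogeneous pixels with last entry 1)
l_c = K_L @ m_c
x1 = K @ p1 / p1[2]
x2 = K @ p2 / p2[2]

# Both residuals x^T l_c should vanish up to rounding
res1, res2 = x1 @ l_c, x2 @ l_c
print(np.allclose(K_L, K_L_closed), res1, res2)
```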

6.2. Line reprojection error

The reprojection error $e_{l}$ of a line can be expressed as follows.

$\begin{matrix} (128) & \begin{array}{r} e_{l} = [\begin{array}{c} d_{s}, d_{e} \end{array}] = [\begin{array}{c} \frac{x_{s}^{⊺} l_{c}}{\sqrt{l_{1}^{2} + l_{2}^{2}}}, & \frac{x_{e}^{⊺} l_{c}}{\sqrt{l_{1}^{2} + l_{2}^{2}}} \end{array}] \in R^{2} \end{array} \end{matrix}$

This is the formula for the distance between a point and a line. Here, $x_{s}$ and $x_{e}$ are the start and end points of the line segment extracted by a line feature extractor (e.g., LSD). In other words, $l_{c}$ is the predicted value obtained through modeling, and the line segment connecting $x_{s}$ and $x_{e}$ is the observation measured from sensor data.
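A minimal sketch of this error, assuming the endpoints are given in pixel coordinates and lifted to homogeneous form $[x, y, 1]^{⊺}$ (the function name is hypothetical):

```python
import numpy as np

def line_reprojection_error(l_c, x_s, x_e):
    """Signed distances of the observed endpoints from the projected line l_c."""
    l_c = np.asarray(l_c, float)
    norm = np.hypot(l_c[0], l_c[1])            # sqrt(l1^2 + l2^2)
    xs_h = np.array([x_s[0], x_s[1], 1.0])     # homogeneous start point
    xe_h = np.array([x_e[0], x_e[1], 1.0])     # homogeneous end point
    return np.array([xs_h @ l_c, xe_h @ l_c]) / norm

# Predicted line y = 2, i.e. 0*x + 1*y - 2 = 0
e = line_reprojection_error([0.0, 1.0, -2.0], (0.0, 5.0), (10.0, 1.0))
print(e)  # [ 3. -1.]
```

The distances are signed, so endpoints on opposite sides of the predicted line give residuals of opposite sign.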

6.3. Orthonormal representation

A problem arises if the Plücker coordinates are used as they are when performing BA optimization with the $e_{l}$ obtained above. Plücker coordinates form a homogeneous 6-vector (5 degrees of freedom up to scale) that must additionally satisfy the Klein quadric constraint $m^{⊺} d = 0$ , whereas the minimum number of parameters needed to express a line is 4. The representation is therefore over-parameterized. The disadvantages of an over-parameterized representation are as follows.

- Since redundant parameters must be calculated, the amount of computation increases during optimization.
- The additional degrees of freedom can lead to numerical instability problems.
- Whenever a parameter is updated, it must always be checked whether the constraint is satisfied.

Therefore, when optimizing a line, the orthonormal representation, which uses the minimal four parameters, is generally used. In other words, a line is stored in Plücker coordinates, but during optimization it is converted to the orthonormal representation, updated there, and then converted back to Plücker coordinates.

Orthonormal expression is as follows. A line in 3D space can always be expressed as:

$\begin{matrix} (129) & \begin{array}{r} (U, W) \in S O (3) \times S O (2) \end{array} \end{matrix}$

Any Plücker line $L = [m^{⊺} : d^{⊺}]^{⊺}$ corresponds one-to-one to a pair $(U, W)$ , and this representation is called the orthonormal representation. Given a line $L_{w} = [m_{w}^{⊺} : d_{w}^{⊺}]^{⊺}$ in the world frame, $(U, W)$ can be obtained from the QR decomposition of the $3 \times 2$ matrix $[m_{w} | d_{w}]$ .

$\begin{matrix} (130) & \begin{array}{r} [\begin{array}{c} m_{w} | d_{w} \end{array}] = U [\begin{array}{cc} w_{1} & 0 \\ 0 & w_{2} \\ 0 & 0 \end{array}], \quad \text{with } W = [\begin{array}{cc} w_{1} & - w_{2} \\ w_{2} & w_{1} \end{array}] \end{array} \end{matrix}$

Here, the $(1, 2)$ element of the upper triangular matrix $R$ of the QR decomposition is always 0 because of the Plücker constraint (Klein quadric). Since $U$ and $W$ are 3D and 2D rotation matrices, respectively, they can be written as $U = R (\boldsymbol{θ})$ with $\boldsymbol{θ} \in R^{3}$ and $W = R (θ)$ with $θ \in R$ .

$\begin{matrix} (131) & \begin{aligned} R (\boldsymbol{θ}) & = U = [\begin{array}{ccc} u_{1} & u_{2} & u_{3} \end{array}] = [\begin{array}{ccc} \frac{m_{w}}{‖ m_{w} ‖} & \frac{d_{w}}{‖ d_{w} ‖} & \frac{m_{w} \times d_{w}}{‖ m_{w} \times d_{w} ‖} \end{array}] \\ R (θ) & = W = [\begin{array}{cc} w_{1} & - w_{2} \\ w_{2} & w_{1} \end{array}] = [\begin{array}{cc} \cos θ & - \sin θ \\ \sin θ & \cos θ \end{array}] \\ & = \frac{1}{\sqrt{{‖ m_{w} ‖}^{2} + {‖ d_{w} ‖}^{2}}} [\begin{array}{cc} ‖ m_{w} ‖ & - ‖ d_{w} ‖ \\ ‖ d_{w} ‖ & ‖ m_{w} ‖ \end{array}] \end{aligned} \end{matrix}$

During the actual optimization, they are updated as $U \leftarrow U R (\boldsymbol{θ}), W \leftarrow W R (θ)$ . The orthonormal representation can therefore express a line in 3D space with the four degrees of freedom $δ_{θ} = [\boldsymbol{θ}^{⊺}, θ] \in R^{4}$ . After $[\boldsymbol{θ}^{⊺}, θ]$ has been updated through optimization, it is converted back to $L_{w}$ as follows.

$\begin{matrix} (132) & \begin{array}{r} L_{w} = [\begin{array}{cc} w_{1} u_{1}^{⊺} & w_{2} u_{2}^{⊺} \end{array}]^{⊺} \end{array} \end{matrix}$
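The round trip between Plücker coordinates and the orthonormal representation can be sketched as follows. This is only an illustration: it builds $U$ and $W$ directly from the normalized $m$ , $d$ and $m \times d$ instead of calling a QR routine, it assumes $m \neq 0$ , $d \neq 0$ and $m^{⊺} d = 0$ , and the function names are made up.

```python
import numpy as np

def plucker_to_orthonormal(L):
    """Return (U, W) for a Plücker line L = [m : d]."""
    m, d = L[:3], L[3:]
    nm, nd = np.linalg.norm(m), np.linalg.norm(d)
    c = np.cross(m, d)
    # Columns of U: normalized m, normalized d, normalized m x d
    U = np.column_stack([m / nm, d / nd, c / np.linalg.norm(c)])
    s = np.hypot(nm, nd)
    w1, w2 = nm / s, nd / s          # cos and sin of the SO(2) angle
    W = np.array([[w1, -w2], [w2, w1]])
    return U, W

def orthonormal_to_plucker(U, W):
    """Return L = [w1 * u1 ; w2 * u2], equal to the input line up to scale."""
    w1, w2 = W[0, 0], W[1, 0]
    return np.hstack([w1 * U[:, 0], w2 * U[:, 1]])

L = np.array([-6.0, 0.0, 3.0, 0.0, 3.0, 0.0])   # satisfies m^T d = 0
U, W = plucker_to_orthonormal(L)
L_back = orthonormal_to_plucker(U, W)

# The recovered vector differs from L only by the positive scale factor s
scale = np.hypot(np.linalg.norm(L[:3]), np.linalg.norm(L[3:]))
print(np.allclose(L_back * scale, L))
```

Because a Plücker line is defined only up to scale, recovering $L / \sqrt{‖m‖^{2} + ‖d‖^{2}}$ instead of $L$ still describes the same line.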

6.4. Error function formulation

To optimize the reprojection error $e_{l}$ of a line, the optimal variable must be found by updating it iteratively with a nonlinear least squares method such as Gauss-Newton (GN) or Levenberg-Marquardt (LM). The error function built from the reprojection error is as follows.

$\begin{matrix} (133) & \begin{aligned} E_{l} (X) & = \arg min_{X^{*}} \sum_{i} \sum_{j} {‖ e_{l, i j} ‖}^{2} \\ & = \arg min_{X^{*}} \sum_{i} \sum_{j} e_{l, i j}^{⊺} e_{l, i j} \end{aligned} \end{matrix}$

The optimal state $X^{*}$ that minimizes ${‖ e_{l} (X) ‖}^{2}$ in $E_{l}$ can be computed iteratively through non-linear least squares. It is found by repeatedly updating $X$ with small increments $Δ X$ .
$\begin{matrix} (134) & \begin{aligned} E_{l} (X + Δ X) & = \arg min_{X^{*}} \sum_{i} \sum_{j} {‖ e_{l} (X + Δ X) ‖}^{2} \end{aligned} \end{matrix}$

Strictly speaking, since the state increment $Δ X$ includes an SE(3) transformation, it should be added to the existing state $X$ through the $\oplus$ operator, but the $+$ operator is used here for convenience of notation.

$\begin{matrix} (135) & \begin{array}{r} e_{l} (X \oplus Δ X) \to e_{l} (X + Δ X) \end{array} \end{matrix}$

The above equation can be expressed as follows through Taylor's first-order approximation.
$\begin{matrix} (136) & \begin{aligned} e_{l} (X + Δ X) & \approx e_{l} (X) + J Δ X \\ & = e_{l} (X) + J_{θ} Δ δ_{θ} + J_{ξ} Δ δ_{ξ} \\ & = e_{l} (X) + \frac{\partial e_{l}}{\partial δ_{θ}} Δ δ_{θ} + \frac{\partial e_{l}}{\partial δ_{ξ}} Δ δ_{ξ} \end{aligned} \end{matrix}$

$\begin{matrix} (137) & \begin{aligned} E_{l} (X + Δ X) & \approx \arg min_{X^{*}} \sum_{i} \sum_{j} {‖ e_{l} (X) + J Δ X ‖}^{2} \end{aligned} \end{matrix}$

$\begin{matrix} (138) & \begin{aligned} J^{⊺} J Δ X^{*} = - J^{⊺} e \\ H Δ X^{*} = - b \end{aligned} \end{matrix}$
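The normal equations $H Δ X^{*} = - b$ with $H = J^{⊺} J$ and $b = J^{⊺} e$ are solved once per iteration. The sketch below shows that loop on a deliberately small toy problem (two residuals in a plain vector state, unrelated to the line error) so that the update rule is easy to follow. The residual, the starting point, and the function names are assumptions chosen for illustration; a real SLAM back end would apply the $\oplus$ update instead of `x + dx`.

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, max_iters=20, tol=1e-12):
    """Plain Gauss-Newton: solve (J^T J) dx = -J^T e and update x <- x + dx."""
    x = np.asarray(x0, float).copy()
    for _ in range(max_iters):
        e = residual(x)
        J = jacobian(x)
        H = J.T @ J                  # Gauss-Newton approximation of the Hessian
        b = J.T @ e
        dx = np.linalg.solve(H, -b)  # normal equations H dx = -b
        x = x + dx                   # vector-space update (toy problem only)
        if np.linalg.norm(dx) < tol:
            break
    return x

# Toy residual: find x with x0 + x1 = 3 and x0 * x1 = 2
def residual(x):
    return np.array([x[0] + x[1] - 3.0, x[0] * x[1] - 2.0])

def jacobian(x):
    return np.array([[1.0, 1.0],
                     [x[1], x[0]]])

x_opt = gauss_newton(residual, jacobian, x0=[0.5, 3.0])
print(x_opt)  # converges to the nearby solution (1, 2)
```

Levenberg-Marquardt differs only in adding a damping term to $H$ before solving.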

6.4.1. The analytical jacobian of 3d line

As explained in the previous section, to perform nonlinear optimization we need to compute $J$ . $J$ consists of:

$\begin{matrix} (139) & \begin{aligned} J = [J_{θ}, J_{ξ}] \end{aligned} \end{matrix}$

$[J_{θ}, J_{ξ}]$ can be expanded as follows:

$\begin{matrix} (140) & \begin{aligned} J_{θ} = \frac{\partial e_{l}}{\partial δ_{θ}} = \frac{\partial e_{l}}{\partial l} \frac{\partial l}{\partial L_{c}} \frac{\partial L_{c}}{\partial L_{w}} \frac{\partial L_{w}}{\partial δ_{θ}} \\ J_{ξ} = \frac{\partial e_{l}}{\partial δ_{ξ}} = \frac{\partial e_{l}}{\partial l} \frac{\partial l}{\partial L_{c}} \frac{\partial L_{c}}{\partial δ_{ξ}} \end{aligned} \end{matrix}$

$\frac{\partial e_{l}}{\partial l}$ can be obtained as follows. Note that $l$ is a vector and $l_{i}$ is a scalar. Likewise, $\mathbf{x}_{s} = [x_{s}, y_{s}, 1]^{⊺}$ and $\mathbf{x}_{e} = [x_{e}, y_{e}, 1]^{⊺}$ denote the homogeneous endpoints, while $x_{s}, y_{s}, x_{e}, y_{e}$ are their scalar pixel coordinates.

$\begin{matrix} (141) & \begin{array}{r} \frac{\partial e_{l}}{\partial l} = \frac{1}{\sqrt{l_{1}^{2} + l_{2}^{2}}} [\begin{array}{ccc} x_{s} - \frac{l_{1} \mathbf{x}_{s}^{⊺} l}{l_{1}^{2} + l_{2}^{2}} & y_{s} - \frac{l_{2} \mathbf{x}_{s}^{⊺} l}{l_{1}^{2} + l_{2}^{2}} & 1 \\ x_{e} - \frac{l_{1} \mathbf{x}_{e}^{⊺} l}{l_{1}^{2} + l_{2}^{2}} & y_{e} - \frac{l_{2} \mathbf{x}_{e}^{⊺} l}{l_{1}^{2} + l_{2}^{2}} & 1 \end{array}] \in R^{2 \times 3} \end{array} \end{matrix}$
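A finite-difference comparison is a cheap way to validate this Jacobian. The sketch below differentiates $d = \mathbf{x}^{⊺} l / \sqrt{l_{1}^{2} + l_{2}^{2}}$ with the quotient rule, which gives entries of the form $(x - l_{1} \mathbf{x}^{⊺} l / (l_{1}^{2} + l_{2}^{2})) / \sqrt{l_{1}^{2} + l_{2}^{2}}$ , and compares the result with central differences. The numeric values and function names are assumptions for the example.

```python
import numpy as np

def line_error(l, xs_h, xe_h):
    """e_l = [d_s, d_e] for homogeneous endpoints xs_h, xe_h = [x, y, 1]."""
    return np.array([xs_h @ l, xe_h @ l]) / np.hypot(l[0], l[1])

def line_error_jacobian(l, xs_h, xe_h):
    """Analytical 2x3 Jacobian of e_l with respect to l = [l1, l2, l3]."""
    n2 = l[0] ** 2 + l[1] ** 2
    rows = []
    for x in (xs_h, xe_h):
        r = x @ l                                   # x^T l
        rows.append([x[0] - l[0] * r / n2,
                     x[1] - l[1] * r / n2,
                     1.0])
    return np.array(rows) / np.sqrt(n2)

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x."""
    x = np.asarray(x, float)
    f0 = f(x)
    J = np.zeros((f0.size, x.size))
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        J[:, k] = (f(x + step) - f(x - step)) / (2.0 * eps)
    return J

l = np.array([3.0, -4.0, 10.0])
xs_h = np.array([1.0, 2.0, 1.0])
xe_h = np.array([5.0, -1.0, 1.0])

J_analytic = line_error_jacobian(l, xs_h, xe_h)
J_numeric = numerical_jacobian(lambda v: line_error(v, xs_h, xe_h), l)
print(np.max(np.abs(J_analytic - J_numeric)))
```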

$\frac{\partial l}{\partial L_{c}}$ can be obtained as follows.

$\begin{matrix} (142) & \begin{array}{r} \frac{\partial l}{\partial L_{c}} = \frac{\partial K_{L} m_{c}}{\partial L_{c}} = [\begin{array}{cc} K_{L} & 0_{3 \times 3} \end{array}] = [\begin{array}{cccccc} f_{y} & 0 & 0 & 0 & 0 & 0 \\ 0 & f_{x} & 0 & 0 & 0 & 0 \\ - f_{y} c_{x} & - f_{x} c_{y} & f_{x} f_{y} & 0 & 0 & 0 \end{array}] \in R^{3 \times 6} \end{array} \end{matrix}$

$\frac{\partial L_{c}}{\partial L_{w}}$ can be obtained as follows.

$\begin{matrix} (143) & \begin{array}{r} \frac{\partial L_{c}}{\partial L_{w}} = \frac{\partial T_{c w} L_{w}}{\partial L_{w}} = T_{c w} = [\begin{array}{c} R_{c w} & t^{\land} R_{c w} \\ 0 & R_{c w} \end{array}] \in R^{6 \times 6} \end{array} \end{matrix}$

The Jacobian $\frac{\partial L_{w}}{\partial δ_{θ}}$ for the orthonormal representation can be obtained as follows:

$\begin{matrix} (144) & \begin{array}{r} \frac{\partial L_{w}}{\partial δ_{θ}} = [\begin{array}{c} 0_{3 \times 1} & - w_{1} u_{3} & w_{1} u_{2} & - w_{2} u_{1} \\ w_{2} u_{3} & 0_{3 \times 1} & - w_{2} u_{1} & w_{1} u_{2} \end{array}] \in R^{6 \times 4} \end{array} \end{matrix}$
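This Jacobian can be cross-checked numerically. The sketch below applies the right-multiplicative updates $U \leftarrow U R (\boldsymbol{θ})$ and $W \leftarrow W R (θ)$ , rebuilds $L_{w} = [w_{1} u_{1} ; w_{2} u_{2}]$ , and compares central differences at zero increment with the analytical $6 \times 4$ matrix. The test values of $U$ and $W$ and the helper names are assumptions made for this example.

```python
import numpy as np

def skew(v):
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def so3_exp(theta):
    """Rodrigues formula for a rotation vector theta."""
    angle = np.linalg.norm(theta)
    if angle < 1e-12:
        return np.eye(3) + skew(theta)
    A = skew(theta / angle)
    return np.eye(3) + np.sin(angle) * A + (1.0 - np.cos(angle)) * A @ A

def so2_exp(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s], [s, c]])

def line_from_orthonormal(U, W):
    return np.hstack([W[0, 0] * U[:, 0], W[1, 0] * U[:, 1]])

def updated_line(U, W, delta):
    """Line after the update with delta = [theta (3), phi (1)]."""
    return line_from_orthonormal(U @ so3_exp(delta[:3]), W @ so2_exp(delta[3]))

def analytic_jacobian(U, W):
    u1, u2, u3 = U[:, 0], U[:, 1], U[:, 2]
    w1, w2 = W[0, 0], W[1, 0]
    zero = np.zeros(3)
    top = np.column_stack([zero, -w1 * u3, w1 * u2, -w2 * u1])
    bottom = np.column_stack([w2 * u3, zero, -w2 * u1, w1 * u2])
    return np.vstack([top, bottom])

def numeric_jacobian(U, W, eps=1e-6):
    J = np.zeros((6, 4))
    for k in range(4):
        step = np.zeros(4)
        step[k] = eps
        J[:, k] = (updated_line(U, W, step) - updated_line(U, W, -step)) / (2.0 * eps)
    return J

U = so3_exp(np.array([0.3, -0.2, 0.5]))   # arbitrary element of SO(3)
W = so2_exp(0.7)                          # arbitrary element of SO(2)

J_analytic = analytic_jacobian(U, W)
J_numeric = numeric_jacobian(U, W)
print(np.max(np.abs(J_analytic - J_numeric)))
```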

The Jacobian $\frac{\partial L_{c}}{\partial δ_{ξ}}$ for the camera pose can be obtained as follows:

$\begin{matrix} (145) & \begin{array}{r} \frac{\partial L_{c}}{\partial δ_{ξ}} = [\begin{array}{c} - (R m)^{\land} - (t^{\land} Rd)^{\land} & - (Rd)^{\land} \\ - (Rd)^{\land} & 0_{3 \times 3} \end{array}] \in R^{6 \times 6} \end{array} \end{matrix}$
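The pose Jacobian depends on the perturbation convention, which this section does not spell out. The sketch below assumes a left perturbation with the increment ordered as [rotation, translation], applied to first order as $R \leftarrow Exp (δ φ) R$ , $t \leftarrow Exp (δ φ) t + δ ρ$ ; under that assumption the central differences reproduce the block structure above. Other conventions (right perturbation, translation first) would permute or change the blocks. All names and numeric values are hypothetical.

```python
import numpy as np

def skew(v):
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def so3_exp(phi):
    """Rodrigues formula for a rotation vector phi."""
    angle = np.linalg.norm(phi)
    if angle < 1e-12:
        return np.eye(3) + skew(phi)
    A = skew(phi / angle)
    return np.eye(3) + np.sin(angle) * A + (1.0 - np.cos(angle)) * A @ A

def transform_line(R, t, L_w):
    """L_c = [R m + t^ R d ; R d]."""
    m, d = L_w[:3], L_w[3:]
    return np.hstack([R @ m + skew(t) @ R @ d, R @ d])

def perturbed_line(R, t, L_w, delta):
    # ASSUMPTION: left perturbation, delta = [d_phi (rotation), d_rho (translation)]
    dR = so3_exp(delta[:3])
    return transform_line(dR @ R, dR @ t + delta[3:], L_w)

def analytic_pose_jacobian(R, t, L_w):
    m, d = L_w[:3], L_w[3:]
    Rd = R @ d
    J = np.zeros((6, 6))
    J[:3, :3] = -skew(R @ m) - skew(skew(t) @ Rd)
    J[:3, 3:] = -skew(Rd)
    J[3:, :3] = -skew(Rd)
    return J                      # lower-right block stays 0_{3x3}

def numeric_pose_jacobian(R, t, L_w, eps=1e-6):
    J = np.zeros((6, 6))
    for k in range(6):
        step = np.zeros(6)
        step[k] = eps
        diff = perturbed_line(R, t, L_w, step) - perturbed_line(R, t, L_w, -step)
        J[:, k] = diff / (2.0 * eps)
    return J

R = so3_exp(np.array([0.2, -0.4, 0.3]))
t = np.array([0.5, -1.0, 2.0])
p, d = np.array([1.0, 2.0, 3.0]), np.array([0.0, 1.0, -1.0])
L_w = np.hstack([np.cross(p, d), d])

J_analytic = analytic_pose_jacobian(R, t, L_w)
J_numeric = numeric_pose_jacobian(R, t, L_w)
print(np.max(np.abs(J_analytic - J_numeric)))
```

Multiplying this $6 \times 6$ block by $\frac{\partial e_{l}}{\partial l} \frac{\partial l}{\partial L_{c}}$ on the left gives the $2 \times 6$ Jacobian $J_{ξ}$ .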

6.4.2. Code implementations

Structure PLP SLAM code: g2o/se3/pose_opt_edge_line3d_orthonormal.h#L62
Structure PLP SLAM code2: g2o/se3/pose_opt_edge_line3d_orthonormal.h#L81

References

[1] [Blog] [SLAM] Bundle Adjustment concept review: Reprojection error

[2] [Blog] [SLAM] Optical Flow and Direct Method concept and code review: Photometric error

[3] [Blog] [SLAM] Pose Graph Optimization concept explanation and example code analysis: Relative pose error

[4] [Blog] Plücker Coordinate concept summary: Line projection error
