[SLAM][En] Errors and Jacobian Derivations for SLAM Part 1
alida2023. 1. 17. 10:17
1. Introduction
This post explains the definition of various errors used in SLAM and the Jacobian derivations for nonlinear optimization.
The errors covered in this post are:
Reprojection error
Photometric error
Relative pose error (PGO)
Line reprojection error
IMU measurement error: TBA
Different Jacobians are derived depending on whether the camera pose is expressed as a rotation matrix or a transformation matrix . To obtain the two Jacobians, we derive the Jacobian for SO(3) in case of reprojection error and the Jacobian for SE(3) in case of photometric error. Points in 3D space also have different Jacobians depending on how is expressed and inverse depth is expressed. . The Jacobian derivation process for both cases is also described.
The Jacobians covered in this post are:
Camera pose (SO(3)-based)
Camera pose (SE(3)-based)
Map point
Relative pose (SE(3)-based)
3Dplücker line
Quaternion representation
Camera intrinsics
Inverse depth
IMU error-state system kinematics : TBA
IMU measurement : TBA
2. Optimization formulation
2.1. Error derivation
In SLAM, error is defined as the difference between an observed value (measurement) based on sensor data and an estimated value based on mathematical modeling.
- : state variable for the model
As above, the difference between the observed value and the predicted value is set as the error, and the optimal state variable that minimizes the error is calculated as an optimization problem in SLAM. At this time, since the state variables of SLAM in general include non-linear terms related to rotation, the non-linear least squares method is mainly used.
2.2. Error function derivation
In general, when a large amount of sensor data comes in, dozens to hundreds of errors are calculated in the form of vectors. At this time, it is assumed that the error follows a normal distribution, and the work of transforming it into an error function is performed.
The multivariate normal distribution for modeling the error function is
- : information matrix (inverse of covariance matrix)
We can model the error as a multivariate normal distribution with mean and variance . obtained by applying log-likelihood to the expression is as follows.
If you find the where the log-likelihood is maximized, the probability of the multivariate normal distribution is maximized. This is called Maximum Liklihood Estimation (MLE). Since has a negative number (-) in front of it, if you find that minimizes the negative log-likelihood, you get:
When all errors are added together, instead of a single error, it is expressed as follows, and this is called the error function . In an actual optimization problem, we find that minimizes the error function rather than a single error .
2.3. Non-linear least squares
The final optimization equation to be solved is:
In the above formula, we need to find the optimal parameter that minimizes the error. However, the above formula does not have a closed-form solution because it usually includes non-linear terms for rotation in SLAM. Therefore, it is necessary to solve the problem using nonlinear optimization methods (Gauss-Newton (GN), Levenberg-Marquardt (LM)). Among the actual implemented SLAM codes, information matrix is often set to to find the optimal value for .
For example, suppose we solve the problem using the GN method. The order of solving the problem is as follows.
define the error function
Approximate linearization with Taylor expansion
Set to 0 after the first derivative.
At this time, find the value and substitute it into the error function.
Repeat until the values converge.
The detailed expression of the error function is equivalent to , which means that the value of the error function varies depending on the robot's pose vector . do. The GN method iteratively updates the incremental amount in the direction of decreasing error in .
At this time, if is used for the first-order Taylor expansion around , the above expression is approximated as follows.
In this case, . Applying this to the entire error function gives:
Expanding the above expression and substituting it gives:
Applying this to the total error gives:
is the quadratic form of and is a positive definite matrix, the first derivative of is reduced to zero. The set value becomes the minimum of .
Summarizing this, the following formula is derived:
obtained in this way is updated to .
The algorithm that performs the process iteratively so far is called the Gauss-Newton method. Compared to the GN method, the LM method has the same overall process, but a damping factor term is added in the formula for calculating the increment.
3. Reprojection error
Reprojection error is an error mainly used in feature-based Visual SLAM. It is mainly used when performing visual odometry (VO) or bundle adjustment (BA) based on feature-based methods. For more information about BA, see [SLAM] Bundle Adjustment 개념 리뷰 post
NOMENCLATURE of reprojection error
A non-homogeneous transformation of the point in 3-dimensional space for projection onto the image plane.
A point projected onto the image plane after correcting for lens distortion. If distortion correction has already been performed at the input stage, .
: Camera intrinsic matrix
: I omitted the last line of internal parameters to project to .
: State variable of model
: number of camera poses
: number of 3D points
: is sometimes omitted for brevity.
: The pixel coordinates of the observed load
: Pixel Coordinates of Estimated Feature Points
: Transform the 3D point into the camera coordinate system ,
: Operator that can update the rotation matrix , three-dimensional vectors at once.
: Angular velocity
: Skew-symmetric matrix of angular velocity
is projected onto the image plane through the following transformation.
A model that utilizes intrinsic/extrinsic parameters of the camera as above is called a projection model. The reprojection error through this is defined as follows.
The error function for all camera poses and 3D points is defined as
that satisfies can be computed iteratively through non-linear least squares. The optimal state is found by iteratively updating in small increments .
Strictly speaking, since the state increment includes the SO(3) rotation matrix, it is correct to add it to the existing state through the operator, but for convenience of expression The operator is used.
The above equation can be expressed as follows through Taylor's first-order approximation.
By differentiating this, the optimal value of increment is obtained as follows. The detailed derivation process is omitted in this section. If you want to know more about the induction process, you can refer to the previous section
Since the above equation is in the form of a linear system , various linear algebra techniques such as schur complement and cholesky decomposition can be used to find . At this time, and out of the existing states exist in the linear vector space, so there is no difference whether adding from the right side or from the left side. However, since the rotation matrix belongs to the nonlinear SO(3) family, depending on whether it is multiplied by the right or left, the pose seen in the local coordinate system (right) or the pose seen in the global coordinate system is updated. (left) will change. Reprojection errors update the global coordinate system's transformation matrix, so we usually use the left multiplication method.
Since is composed of , it can be written as follows.
The definition of the left multiplication operation is as follows.
3.1. Jacobian of the reprojection error
3.1.1. Jacobian of camera pose
The Jacobian for a pose can be decomposed as:
By rearranging the above equation using the chain rule, we get: At this time, for convenience, it is expressed as .
At this time, the reason why the Jacobian for the angular velocity instead of the Jacobian for the rotation matrix will be explained in the next section. In addition, the sign of is also changed depending on whether the error is defined as or , so this should be applied with care when implementing the actual code.In the data, the sign was considered and marked as .
Assuming that undistortion has already been performed during the image input process, is as follows.
Next, is:
Next, we need to find . This can be obtained relatively simply as follows.
3.1.2. Lie theory-based SO(3) optmization
Finally, we need to find . At this time, rotation-related parameters are expressed as angular velocity rather than rotation matrix . The rotation matrix is over-parameterized because the number of parameters is 9, but the actual rotation is limited to 3 degrees of freedom. Disadvantages of over-parameterized representation are as follows.
Since redundant parameters must be calculated, the amount of computation increases during optimization.
The additional degrees of freedom can lead to numerical instability problems.
Whenever a parameter is updated, it must always be checked whether the constraint is satisfied.
Lie theory allows optimization to be free from constraints. Therefore, by using lie algebra so(3) instead of lie group SO(3) , parameters can be updated freely from constraints.
However, since is not immediately visible in , must be expressed as lie algebra.
At this time, since we need to find the Jacobian for the term related to rotation, we translate the 3D point by and Let be the point where the at the same position is rotated by .
transforms the angular velocity into a 3D rotation matrix by exponential mapping refers to an operation that For more information on exponential mapping, refer to this link.
At this time, there are two ways to update the small lie algebra increment to the existing . First of all, there is an update method using basic lie algebra. Next, there is an update method using the perturbation model.
The relationship between the above two methods is as follows. For details, refer to chapter 4.3.3 of this link.
Lie algebra-based update: First of all, if you directly calculate using the method, you will get the following A complex diet is induced.
In the above equation, the second row is a form in which the left Jacobian is derived using the BCH approximation, and the third row is a form in which the first-order Taylor approximation is applied for a small rotation amount . For more information about , refer toChapter 4 of Introduction of Visual SLAM
To understand the approximation of the third row, given an arbitrary rotation vector , the rotation matrix is exponential When developed in mapping form, it can be expressed as follows.
For a small-sized rotation matrix , it can be approximated as follows by ignoring higher-order terms of order 2 or higher.
Perturbation model-based update: In order to obtain a simpler Jacobian without using , the perturbation model of lie algebra so(3) is generally used. The Jacobian is obtained using the perturbation model as follows:
The above equation also uses approximate for a small rotation matrix in the second row. Therefore, when using the perturbation model, there is an advantage in that the Jacobian can be obtained simply by using an skew-symmetric matrix of the points in the 3D space. In the case of reprojection error optimization, since most of the errors for the feature points of sequentially incoming images are optimized, the camera pose change is not large and the size of is not large, so the Jacobian above is generally used. Since the method is used, when updating the small increment to the existing rotation matrix , it is updated as ().
So the original Jacobian is to , which looks like this:
Photometric error is an error mainly used in direct Visual SLAM. It is mainly used when performing direct method-based visual odometry (VO) or bundle adjustment (BA). For more information about direct methods, see [SLAM] Optical Flow와 Direct Method 개념 및 코드 리뷰 post.
NOMENCLATURE of photometric error
A non-homogeneous transformation of the point in 3-dimensional space for projection onto the image plane.
Projection onto the image plane after correcting for lens distortion. If distortion correction has already been performed at the input stage, then .
: Camera intrinsic matrix
: I omitted the last line of internal parameters to project to
: The set of all feature points in the image
: In general, it is sometimes indicated simply by omitting the notation.
: Pixel coordinates of the ith feature point in the first image and the second image
: operator for composition of two SE(3) group
: Transform 3D point to camera coordinate system,
: A vector of three-dimensional angular velocity and velocity. It's called a twist.
: Twist's lie algebra with hat operator applied (4x4 matrix)
: Jacobian for left multiplication. Since it is not used in actual calculations, it will not be introduced in detail.
In the figure above, the world coordinates of the 3D point are and the corresponding two cameras The pixel coordinates on the image plane are . At this time, it is assumed that the internal parameters of the two cameras and are the same. When camera is the origin(), if pixel coordinates are expressed through the 3D point , they are projected in the following order.
One of the characteristics of direct method is that unlike feature-based, there is no way to know which matches . So based on the current pose estimate, we find the position of . That is, and are made similar by optimizing the pose of the camera. At this time, the problem is solved by minimizing the photometric error. The photometric error is:
The photometric error is based on the assumption of grayscale invariance and is scalar. To solve non-linear least squares with photometric error, we can define the following error function .
that satisfies can be computed iteratively through non-linear least squares. The optimal state is found by iteratively updating in small increments .
Strictly speaking, since the state increment is an SE(3) conversion matrix, it is correct to add it to the existing state through the operator, but for convenience of expression, The operator is used.
This can be expressed as follows through the first-order Taylor approximation.
By differentiating this, the optimal value of increment is obtained as follows. The detailed derivation process is omitted in this section. If you want to know more about the induction process, you can refer to the previous section.
Since the above equation is in the form of a linear system , various linear algebra techniques such as schur complement and cholesky decomposition can be used to find . The optimal increment obtained in this way is added to the current state. At this time, whether to update the pose seen in the local coordinate system (right) or the pose seen in the global coordinate system (left) depends on whether the existing is multiplied to the right or to the left. will lose Photometric errors update the transformation matrix in the global coordinate system, so the left multiplication method is generally used.
The definition of the left multiplication operation is as follows.
4.1. Jacobian of the photometric error
To perform (), we need to find the Jacobian for the photometric error. This can be expressed as:
Here's a closer look at this:
Re-expressing the above expression by applying the chain rule gives:
At this time, the Jacobian for the transformation matrix is not obtained, but for the twist The reason for finding the Jacobian is explained in the next section. First of all, means the gradient of the image.
Assuming that undistortion has already been performed during the image input process, is as follows.
Next, is:
4.1.1. Lie theory-based SE(3) optimization
Finally . At this time, since the term related to the position is a 3D vector and the size of the vector is equal to 3 degrees of freedom, which is the minimum degree of freedom for expressing the 3D position, a separate constraint is required when performing the optimization update. does not exist. On the other hand, the rotation matrix has 9 parameters, which is more than 3 degrees of freedom, which is the minimum degree of freedom for expressing 3D rotation, so various constraints exist. This is said to be over-parameterized. Disadvantages of over-parameterized representation are as follows.
Since redundant parameters must be calculated, the amount of computation increases during optimization.
The additional degrees of freedom can lead to numerical instability problems.
Whenever a parameter is updated, it must always be checked whether the constraint is satisfied.
Therefore, a lie theory-based optimization method, which is a minimal parameter expression free from constraints, is generally used. Instead of calculating , which includes a nonlinear rotation matrix, the lie group SE(3)-based optimization method includes , and position-related terms are changed to to find the optimal lie algebra se(3) , and then exponential mapping Indicates how to update to SE(3).
The Jacobian for is
Through this, the existing expression is changed as follows.
-
-
refers to an operation that transforms twist into a 3D pose by exponential mapping. For more information on exponential mapping, refer to this link.
So far the Jacobians have been easy to compute, whereas is because the parameter is not immediately visible in should be changed to terms related to lie algebra.
At this time, there are two ways to update the small lie algebra increment to the existing . First of all, [1] There is an update method using basic lie algebra. [2] Next, there is an update method using a perturbation model.
Among the two methods above, the method adds the fine increment to the existing and then performs exponential mapping to obtain the Jacobian. The method calculates the Jacobian. This method updates the existing state by multiplying the left side of by the perturbation model . The following conversion exists between the two methods, which is called the BCH approximation. For details, see Chapter 4 of Introduction to Visual SLAM.
Since using the method leads to very complex equations, this method is not often used, and the perturbation model of is mainly used. Therefore, is transformed as follows.
The Jacobian for can be calculated as
Therefore, when using the perturbation model, there is an advantage in that the Jacobian can be obtained simply by using the skew-symmetric matrix of the points in the 3D space. In the case of photometric error optimization, since most of the errors for the brightness change of sequentially incoming images are optimized, the camera pose change is not large and the size of is not large, so the above Jacobian is generally used. Since we use the perturbation model, the small increment is updated as ().
In the above equation, the second row is the first-order Taylor approximation applied to the small twist increment . To understand the approximation of the second row, the transformation matrix is exponential given an arbitrary twist When developed in mapping form, it can be expressed as follows.
For a small twist increment , it can be approximated as follows by ignoring higher-order terms of the second or higher order.
Finally, The Jacobian for the pose is:
At this time, since the last line of is always 0, it is sometimes omitted and calculated.
:For convenience of expression, it is sometimes omitted.
: operator for composition of two SE(3) families
: Operator that converts SE(3) to twist . For more information on logarithm mapping, refer tothis post.
Given two nodes on the Pose graph, the relative pose (=observed value) newly calculated by the sensor data and the existing known The difference in relative pose (=predicted value) is defined as the relative pose error. (figure: Freiburg univ. Robot Mapping Course).
The process of optimizing the relative pose error is called pose graph optimization (PGO) and is also called the back-end algorithm of graph-based SLAM. Nodes , which are sequentially calculated by front-end visual odometry (VO) or lidar odometry (LO), are observed Observed when an edge is connected between two non-sequential nodes because PGO is not performed because the values and predicted values are identical, but loop closing occurs PGO is performed because the difference between the value and the predicted value occurs.
That is, PGO is generally performed when a special situation such as loop closing occurs. When the robot revisits the same place while moving, the loop detection algorithm operates to determine the loop. At this time, if a loop is detected, the existing node and the node created by revisiting are connected as loop edges, and various matching algorithms (GICP, NDT , etc...) to create observations. These observations are called virtual measurements because they are virtual observations created by a matching algorithm, not actual observations.
The relative pose error for all nodes on the pose graph can be defined as follows.
that satisfies can be computed iteratively through non-linear least squares. The optimal state is found by iteratively updating in small increments .
Strictly speaking, since the state increment is an SE(3) transformation matrix, it is correct to add it to the existing state through the operator, but for convenience of expression operator is used.
The above equation can be expressed as follows through Taylor's first-order approximation.
By differentiating this, the optimal value of increment for all nodes is obtained as follows. The detailed derivation process is omitted in this section. If you want to know more about the induction process, you can refer to the previous section.
Since the above equation is in the form of a linear system , various linear algebra techniques such as schur complement and cholesky decomposition can be used to find . The optimal increment obtained in this way is added to the current state. At this time, whether to update the pose seen in the local coordinate system (right) or the pose seen in the global coordinate system (left) depends on whether the existing is multiplied to the right or to the left. will lose Since the relative pose error is related to the relative pose of the two nodes, a right multiplication that updates in the local coordinate system is applied.
The definition of the right multiplication operation is as follows.
5.1. Jacobian of relative pose error
To perform (), the Jacobian for the relative pose error must be obtained. Given two non-sequential nodes , the Jacobian can be expressed as there is.
Here's a closer look at this:
5.1.1. Lie theory-based SE(3) optimization
When obtaining the above Jacobian, the position-related term is a 3-dimensional vector and the magnitude of the vector is equal to 3 degrees of freedom, which is the minimum degree of freedom for expressing a 3-dimensional position. constraints do not exist. On the other hand, the rotation matrix has 9 parameters, which is more than 3 degrees of freedom, which is the minimum degree of freedom for expressing 3D rotation, so various constraints exist. This is said to be over-parameterized. Disadvantages of over-parameterized representation are as follows.
Since redundant parameters must be calculated, the amount of computation increases during optimization.
The additional degrees of freedom can lead to numerical instability problems.
Whenever a parameter is updated, it must always be checked whether the constraint is satisfied.
Therefore, a lie theory-based optimization method, which is a minimal parameter expression free from constraints, is generally used. Instead of calculating , which includes a nonlinear rotation matrix, the lie group SE(3)-based optimization method includes , and position-related terms are changed to to find the optimal lie algebra se(3) , and then exponential mapping Indicates how to update to SE(3).
The Jacobian for is
Through this, the existing expression is changed as follows.
-
-
refers to an operation that transforms twist into a 3D pose by exponential mapping. For more information on exponential mapping, refer to this link.
is , you need to change it to a term related to lie algebra.
At this time, means logarithm mapping that changes from SE(3) to twist . For more information on logarithm mapping, refer to this post. Therefore, the SE(3) version relative pose error is changed as follows.
Here's how to break it down in detail:
Looking at the above expression, you can see that and parameters in are connected by exponential mapping. Applying the left perturbation model to the formula in the second line of the above equation to express the incremental amount is as follows.
In the above equation, the terms are arranged in the form of by moving the increment term to the left or right. To do this, we need to use the following property of the adjoint matrix of SE(3). For more information on Adjoint martix, refer to the this post.
The expression above is transformed into an expression for as follows.
And rearranging, we get the following formula:
() moves the term in the middle of () to the right or left. This post explains the process of moving to the right. If this is expanded by and respectively, it is as follows.
In order to simplify this expression, and are respectively expressed as follows.
:Transformation matrix expressed in exponential terms.
According to the definition of () above, .
The above equation can be rearranged using the right-hand BCH approximation. The right BCH approximation is
Finally, after solving the substitution and rewriting the and expressions, the SE(3) version formula of () is obtained.
Therefore, the SE(3) version Jacobian of the final relative pose error is as follows.
At this time, since is a complicated expression, it is generally used by approximating it as follows, or it is used by setting it as .
If and optimization is performed, there is a reduction in the amount of computation, but the optimization performance is similar to the above Jacobian. The method used has a slight predominance. For more information, please refer to Chapter 11 of Introduction to Visual SLAM.
In the g2o code above, the error is , which makes the Jacobian slightly different from the description above.
This is the same format as in the () expression where is arranged by passing terms to the left, and is arranged by passing terms to the right and merged.
It also seems to be an approximate value for . Therefore, the actual implemented code is as follows.
6. Line reprojection error
Line reprojection error is an error used when optimizing a line in 3D space expressed in plücker coordinates. For more information on Plücker coordinates, see the Plücker Coordinate 개념 정리 post.
NOMENCLATURE of line reprojection error
: Transformation matrix of Plücker lines
: line intrinsic matrix
: Rotation matrix of a 3D line
: A matrix containing information on the distance a 3D line is from the origin
: Parameter corresponding to rotation matrix SO(3)
: Parameter corresponding to rotation matrixSO(2)
: th column vector
: State variable
: State variables in orthonormal representation
: For the update method through Lie theory, refer to this link
: An operator that can update the state variable in one step.
A line in 3D space can be expressed as a 6D column vector using Plücker coordinates.
Unlike the order described above, most papers using Plücker Coordinate use the order, so this section Also, lines are expressed in the corresponding order. Since the linear expression method has scale ambiguity, it has 5 degrees of freedom, and even if and are not unit vectors, a line can be uniquely expressed by the ratio of the values of the two vectors.
6.1. Line Transformation and projection
If is a line viewed from the world coordinate system, it can be converted as follows when viewed from the camera coordinate system.
- :Transformation matrix for the Plücker line
Projecting the line onto the image plane gives:
- : 직선의 내부 파라미터 행렬(line intrinsic matrix)
means to . So , so the term of is cleared to 0. Therefore, when , the following expression is derived.
6.2. Line reprojection error
The reprojection error of a line can be expressed as follows.
This can be expressed through the formula for the distance between a point and a line. At this time, and mean the start and end points of the line extracted using line feature extracture (e.g., LSD), respectively. . In other words, is the predicted value obtained through modeling, and the line connecting and is measured through sensor data. becomes an observation.
6.3. Orthonormal representation
A problem arises if the Plücker Coordinate expression is used as it is when BA optimization is performed using obtained above. Since the Plücker Coordinate always has 5 degrees of freedom because it must satisfy the Klein quadric constraint of , the minimum number of parameters that can express a line is 4 It is over-parameterized. The disadvantages of the over-parameterized expression method are as follows.
Since redundant parameters must be calculated, the amount of computation increases during optimization.
The additional degrees of freedom can lead to numerical instability problems.
Whenever a parameter is updated, it must always be checked whether the constraint is satisfied.
Therefore, when optimizing a line, an orthonormal expression is generally used to change the minimum parameter to four degrees of freedom. In other words, when expressing a line, Plücker Coordinate is used, but when performing optimization, it transforms into orthonormal expression, updates the optimal value, and returns to Plücker Coordinate.
Orthonormal expression is as follows. A line in 3D space can always be expressed as:
Any Plücker line always has a one-to-one correspondence , and this representation method is called orthonormal representation. Given a line in the world, can be obtained by performing QR decomposition of .
At this time, the element of the upper triangle matrix always becomes 0 due to the Plücker constraint (Klein quadric). Since and mean 3D and 2D rotation matrices, respectively, .
When performing real optimizations, they are updated like . So the orthonormal representation is a line in 3D space can be expressed with four degrees of freedom. updated through optimization is converted to as follows.
6.4. Error function formulation
In order to optimize the reprojection error for a line, iteratively using a nonlinear least squares method such as Gauss-Newton (GN) or Levenberg-Marquardt (LM) to find the optimal variable. need to update The expression of the error function using the reprojection error is as follows.
The satisfying can be calculated iteratively through non-linear least squares. The optimal state is found by iteratively updating in small increments .
Strictly speaking, since the state increment includes the SE(3) transformation matrix, it is correct to add it to the existing state through the operator, but for convenience of expression The operator is used.
The above equation can be expressed as follows through Taylor's first-order approximation.
By differentiating this, the optimal value of increment is obtained as follows. The detailed derivation process is omitted in this section. If you want to know more about the induction process, you can refer to the previous section.
6.4.1. The analytical jacobian of 3d line
As explained in the previous section, to perform nonlinear optimization we need to compute . consists of:
can be expanded as follows:
can be obtained as follows. At this time, note that is a vector and is a scalar.
can be obtained as follows.
can be obtained as follows.
The Jacobian for the orthonormal representation can be obtained as follows:
The Jacobian for the camera pose can be obtained as follows: