Introduction

What got me interested was the following tweet.

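This paper proposes a new optimization method for vSLAM. Instead of the 3D map that the conventional optimization method (BA) relies on as a cue, it optimizes using only the camera orientations, which removes the complexity inherent in BA and yields a SLAM that is more robust to camera motion. https://t.co/EIjImeKi1L pic.twitter.com/TqbkMWnTAl
by まさや (@syinari0123), February 12, 2019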

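Even within SLAM, vSLAM is an area I know next to nothing about, but the fact that this method does not use bundle adjustment (apparently abbreviated to BA in certain circles) is intriguing, so I decided to read the paper.
~~I pushed my way through the whole thing, but ended up regretting that I had started reading it so casually.~~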

[1902.03747] Visual SLAM: Why Bundle Adjust?

Overview

Problems with bundle-adjustment-based SLAM

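Bundle adjustment has to be initialized carefully, which makes the system complex.
It handles slow motion and pure rotational motion poorly (because it needs a sufficient baseline).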

Proposed method

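Instead of bundle adjustment, rotation averaging is used to incrementally optimize only the camera orientations.
With the proposed method, the positions and the 3D map no longer need to be estimated and maintained at keyframe rate (which makes for a simpler SLAM system), and slow motion and pure rotational motion can also be handled.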

Main text

Introduction

Conventional BA (Bundle Adjustment) SLAM

f:id:ssk0109:20190213102201p:plain:w300

Here, u_{i,j} is the 2D coordinates of the i-th scene point as seen in the j-th image Z_j.
Structure-from-motion (SfM) aims to estimate the 3D coordinates X = {X_i} of the scene points and the 6DOF poses {(R_j, t_j)} of the images {Z_j}.

bundle adjustment (BA) formulation

f:id:ssk0109:20190213105305p:plain:w300

Proposed method: L-INFINITY SLAM

f:id:ssk0109:20190213101952p:plain:w300

rotation averaging formulation

Given a set of relative rotations {R_{j,k}} between pairs of overlapping images {Z_j, Z_k}, the goal of rotation averaging is to estimate the absolute rotations {R_j}.

f:id:ssk0109:20190213102554p:plain:w300

f:id:ssk0109:20190213102729p:plain:w300
Unlike (1), which minimises the sum of squared reprojection errors, (4) minimises the maximum reprojection error.
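To make the difference between the two objectives concrete, here is a minimal sketch that evaluates both on the same residuals. The function names and the toy observations are my own (not from the paper), and the predicted points are simply given rather than computed from poses.

```python
import numpy as np

def reprojection_residuals(observed, predicted):
    """Euclidean distance between each observed and predicted 2D point."""
    diff = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return np.linalg.norm(diff, axis=1)

def sum_of_squares_cost(residuals):
    """BA-style objective as in (1): sum of squared reprojection errors."""
    return float(np.sum(residuals ** 2))

def max_cost(residuals):
    """L-infinity objective as in (4): the single worst reprojection error."""
    return float(np.max(residuals))

# Toy data (made up): two perfect predictions and one that is 3 pixels off.
observed = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
predicted = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 5.0]])
res = reprojection_residuals(observed, predicted)

print(sum_of_squares_cost(res), max_cost(res))
```

The sum-of-squares cost aggregates every observation, while the max cost is determined entirely by the worst one, which is why the L-infinity formulation is sensitive to outliers and needs clean matches.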

Algorithm details

A. Estimating relative motions

R_{j,k} is estimated by rotationally aligning backprojected feature rays, using a rotation-only variant of Trimmed ICP.
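As a rough illustration of what "rotation-only" and "trimmed" mean here, the sketch below aligns two sets of unit rays with a closed-form SVD fit and, in each round, refits on only the best-matching fraction. This is a simplification under my own assumptions: the ray correspondences are taken as given (a real ICP would re-match them every iteration), and the function names, the `keep` ratio, and the test data are all hypothetical rather than taken from the paper.

```python
import numpy as np

def fit_rotation(a, b):
    """Least-squares rotation R such that b_i ~ R a_i (SVD / Kabsch solution)."""
    h = a.T @ b                              # 3x3 correlation, sum of a_i b_i^T
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))   # guard against a reflection
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T

def trimmed_rotation_alignment(a, b, keep=0.8, iters=10):
    """Refit the rotation repeatedly on the `keep` fraction of best-fitting pairs."""
    r = np.eye(3)
    n_keep = max(3, int(keep * len(a)))
    for _ in range(iters):
        err = np.linalg.norm(b - a @ r.T, axis=1)   # residual of each ray pair
        inliers = np.argsort(err)[:n_keep]          # trimming step
        r = fit_rotation(a[inliers], b[inliers])
    return r

# Synthetic rays: a 10 degree rotation about z, with 10% gross mismatches.
rng = np.random.default_rng(0)
rays_j = rng.normal(size=(100, 3))
rays_j /= np.linalg.norm(rays_j, axis=1, keepdims=True)
theta = np.deg2rad(10.0)
r_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
rays_k = rays_j @ r_true.T
rays_k[:10] = -rays_j[:10]                          # corrupted correspondences
r_est = trimmed_rotation_alignment(rays_j, rays_k)
```

Because only rays (directions) are aligned, no translation or depth is involved, which fits the paper's point that this step does not need a baseline.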

B. Rotation averaging

To solve Eq. (3), an iteratively reweighted least-squares approach in SO(3) is used.
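The sketch below shows the general shape of an iteratively reweighted least-squares loop on SO(3). It is not the paper's solver: I assume the convention R_{j,k} ≈ R_k R_j^T, use L1-style weights (1 / residual norm), linearize with a first-order update in the tangent space, and fix camera 0 to remove the gauge freedom. All function names are hypothetical.

```python
import numpy as np

def skew(v):
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def so3_exp(v):
    """Rotation vector -> rotation matrix (Rodrigues formula)."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return np.eye(3) + skew(v)
    k = skew(v / theta)
    return np.eye(3) + np.sin(theta) * k + (1.0 - np.cos(theta)) * (k @ k)

def so3_log(r):
    """Rotation matrix -> rotation vector (not valid for angles near pi)."""
    w = 0.5 * np.array([r[2, 1] - r[1, 2], r[0, 2] - r[2, 0], r[1, 0] - r[0, 1]])
    s = np.linalg.norm(w)                 # sin(angle)
    c = 0.5 * (np.trace(r) - 1.0)         # cos(angle)
    return w if s < 1e-12 else w * (np.arctan2(s, c) / s)

def irls_rotation_averaging(n, edges, iters=20, eps=1e-6):
    """edges: list of (j, k, R_jk) with R_jk ~ R_k R_j^T. Camera 0 stays fixed."""
    rots = [np.eye(3) for _ in range(n)]
    for _ in range(iters):
        a = np.zeros((3 * len(edges), 3 * n))
        b = np.zeros(3 * len(edges))
        for row, (j, k, r_jk) in enumerate(edges):
            e = so3_log(rots[k].T @ r_jk @ rots[j])          # edge residual
            w = 1.0 / np.sqrt(max(np.linalg.norm(e), eps))   # sqrt of L1 weight
            sl = slice(3 * row, 3 * row + 3)
            a[sl, 3 * k:3 * k + 3] = w * np.eye(3)           # + delta_k
            a[sl, 3 * j:3 * j + 3] = -w * np.eye(3)          # - delta_j
            b[sl] = w * e
        # Drop the columns of camera 0 so that its rotation is never updated.
        delta = np.linalg.lstsq(a[:, 3:], b, rcond=None)[0]
        for j in range(1, n):
            rots[j] = rots[j] @ so3_exp(delta[3 * (j - 1):3 * j])
    return rots

# Consistent synthetic view graph with four cameras (values made up).
truth = [so3_exp(np.array(v)) for v in
         ([0.0, 0.0, 0.0], [0.1, 0.0, 0.2], [0.0, -0.3, 0.1], [0.2, 0.1, -0.1])]
pairs = [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]
edges = [(j, k, truth[k] @ truth[j].T) for j, k in pairs]
estimate = irls_rotation_averaging(len(truth), edges)
```

The reweighting is what makes the scheme robust: edges with large residuals get small weights in the next least-squares solve, so a few bad relative rotations pull the result less than they would in plain least squares.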

C. Known rotation problem

By referring to R_j^(1:2) as the first two rows of R_j, and to R_j^(3) as the third row of R_j (similarly for t^(1:2) and t^(3)), the projection of X_i onto the j-th image is given by

f:id:ssk0109:20190214011930p:plain:w300
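Written as code, the row-splitting described above amounts to dividing the first two rows of R X + t by the third. This is a minimal sketch assuming normalized (calibrated) image coordinates; the function name `project` and the sample values are mine.

```python
import numpy as np

def project(x, r, t):
    """Project the 3D point x into a camera with rotation r (3x3) and translation t (3,)."""
    num = r[:2] @ x + t[:2]   # R^(1:2) X + t^(1:2): first two rows
    den = r[2] @ x + t[2]     # R^(3) X + t^(3): depth along the optical axis
    return num / den

# A point at (1, 2, 4) seen by an unrotated camera shifted by 1 along z.
u = project(np.array([1.0, 2.0, 4.0]), np.eye(3), np.array([0.0, 0.0, 1.0]))
print(u)
```

The denominator is the depth of the point in the camera frame, so it has to be positive for the point to lie in front of the camera.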
Applying Eq. (6) to Eq. (4) and rearranging gives the following optimization problem.
f:id:ssk0109:20190214012152p:plain:w300
f:id:ssk0109:20190214012716p:plain:w300

This is solved with a method called Res-Int. Res-Int solves the problem in about 3 seconds for moderate-size input (around 15 images and 3000 3D points).
Res-Int corresponds to the known rotation prob routine on Line 5 of Algorithm 2.
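For my own understanding, here is how the step from the projection (6) and the L-infinity objective (4) to a problem in the unknowns X_i and t_j can be written out once the rotations R_j are known. This is my own rearrangement, with an auxiliary bound \gamma that I introduce for illustration, so the notation may differ from the paper's equations.

```latex
% Residual of one observation, using the projection (6) with R_j known:
\mathbf{u}_{i,j}
  - \frac{R_j^{(1:2)}\mathbf{X}_i + \mathbf{t}_j^{(1:2)}}
         {R_j^{(3)}\mathbf{X}_i + t_j^{(3)}}
= \frac{\bigl(\mathbf{u}_{i,j}R_j^{(3)} - R_j^{(1:2)}\bigr)\mathbf{X}_i
        + \mathbf{u}_{i,j}\,t_j^{(3)} - \mathbf{t}_j^{(1:2)}}
       {R_j^{(3)}\mathbf{X}_i + t_j^{(3)}}

% Minimising the maximum residual, with every point in front of its camera:
\min_{\{\mathbf{X}_i\},\,\{\mathbf{t}_j\},\,\gamma}\ \gamma
\quad \text{s.t.} \quad
\bigl\| \bigl(\mathbf{u}_{i,j}R_j^{(3)} - R_j^{(1:2)}\bigr)\mathbf{X}_i
        + \mathbf{u}_{i,j}\,t_j^{(3)} - \mathbf{t}_j^{(1:2)} \bigr\|_2
\le \gamma \bigl(R_j^{(3)}\mathbf{X}_i + t_j^{(3)}\bigr)
\quad \forall\, i, j
```

With R_j fixed, both sides of each constraint are linear in X_i and t_j, so for a fixed \gamma every constraint is a second-order cone. That is what makes the known rotation problem far better behaved than optimizing over the rotations as well.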

D. Known rotation (KRot) problem with translation direction constraints

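When a loop is detected, the input size becomes large, so solving KRot at Step 9 of Algorithm 2 takes a very long time.
The paper therefore proposes to solve loop closure on a sample of the input, using a formulation that incorporates the camera translation directions.
The constraint on the camera positions can be expressed as follows.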
f:id:ssk0109:20190214020429p:plain:w300
In the known rotation problem, Eq. (12) corresponds to the angular constraint of Eq. (13).