This project implements 6-DOF camera pose estimation from a calibrated stereo camera, with local non-linear least-squares optimisation, for various scenarios in the KITTI dataset.
$ conda env create -f setup/environment.yml
$ pip install -e .
To run the visual odometry simulation, use the following command:
$ python main.py --config_path configs/params.yaml
Edit params.yaml to configure the sequence on which the simulation runs.
1. What is Visual Odometry?
2. Problem Formulation
3. Algorithm Implemented
Visual odometry is the process of incrementally estimating the pose and trajectory of a robot or vehicle (that is, the orientation and translation of a camera configuration rigidly attached to it) using the video stream from the camera.
An agent moves through an environment and takes images with a rigidly attached camera system at discrete time instants. Let the images from the left and right cameras of the stereo pair at time instant k be denoted by IL,k and IR,k. We assume that all intrinsic and extrinsic calibration parameters of the stereo rig are known in advance.
We need to estimate the relative rotation R and translation t of the stereo rig between time instants k-1 and k, and then concatenate these transformations to incrementally recover the full camera trajectory Ck.
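The concatenation step can be sketched in a few lines of NumPy. This is an illustration rather than the project's code: the helper names (`make_transform`, `rot_z`, `concatenate`) are hypothetical, and it assumes that each Tk is a 4x4 homogeneous matrix expressing the camera pose at instant k in the camera frame at instant k-1, so that Ck = Ck-1 Tk.

```python
import numpy as np

def make_transform(R, t):
    """Pack a 3x3 rotation and a 3-vector translation into a 4x4 homogeneous matrix."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def rot_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

def concatenate(relative_transforms, C0=None):
    """Return the poses C_0, C_1, ... obtained from C_k = C_{k-1} @ T_k."""
    C = np.eye(4) if C0 is None else C0
    trajectory = [C]
    for T in relative_transforms:
        C = C @ T
        trajectory.append(C)
    return trajectory

# Four identical relative motions: the camera at k sits 1 unit along the
# x-axis of frame k-1 and is rotated by 90 degrees about z.
step = make_transform(rot_z(np.pi / 2), [1.0, 0.0, 0.0])
trajectory = concatenate([step] * 4)

# Camera positions are the translation columns of the accumulated poses.
positions = np.array([C[:3, 3] for C in trajectory])
print(np.round(positions, 3))
```

Because every pose is built from all the previous relative transforms, any error in a single Tk propagates to all later poses, which is why drift accumulates in visual odometry.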
3D-to-2D: Structure to feature correspondences (Source: [1])
- Capture the first stereo image pair IL,k and IR,k
- Extract and match stereo features fL,k and fR,k
- Triangulate the matched features to build the point cloud Xk
- Set the initial camera pose Ck
- Store the information from the first frame as IL,k-1, IR,k-1, fL,k-1, fR,k-1, Xk-1
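The triangulation step is easiest to illustrate for a rectified stereo pair, where the depth of a matched feature follows from its disparity d = uL - uR as Z = f b / d. The sketch below assumes that simplified case. The function name `triangulate_rectified` and the calibration values are made up for illustration and are not taken from the project or from a KITTI calibration file.

```python
import numpy as np

def triangulate_rectified(pts_left, pts_right, f, cx, cy, baseline):
    """Triangulate matched pixels from a rectified stereo pair.

    Returns the 3D points in the left camera frame and a boolean mask
    marking which input matches had a positive disparity.
    """
    pts_left = np.asarray(pts_left, dtype=float)
    pts_right = np.asarray(pts_right, dtype=float)
    disparity = pts_left[:, 0] - pts_right[:, 0]
    valid = disparity > 0  # zero or negative disparity has no finite depth
    z = f * baseline / disparity[valid]
    x = (pts_left[valid, 0] - cx) * z / f
    y = (pts_left[valid, 1] - cy) * z / f
    return np.column_stack([x, y, z]), valid

# Illustrative calibration (assumed values): focal length and principal
# point in pixels, baseline in metres.
f, cx, cy, baseline = 700.0, 600.0, 180.0, 0.5

# Round trip: project known 3D points into both cameras, then triangulate.
X_true = np.array([[1.0, -0.5, 10.0],
                   [-2.0, 0.3, 25.0]])
u_left = f * X_true[:, 0] / X_true[:, 2] + cx
u_right = f * (X_true[:, 0] - baseline) / X_true[:, 2] + cx
v = f * X_true[:, 1] / X_true[:, 2] + cy

X_est, valid = triangulate_rectified(
    np.column_stack([u_left, v]), np.column_stack([u_right, v]),
    f, cx, cy, baseline)
print(X_est)
```

Depth is inversely proportional to disparity, so distant points with small disparities are triangulated with much larger uncertainty than nearby ones.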
While a new image frame exists:
- Capture the new stereo image pair IL,k and IR,k
- Extract and match stereo features fL,k and fR,k
- Triangulate the matched features to build the point cloud Xk
- Track the 2D features fL,k-1 in IL,k-1 to fL,k in IL,k, and thus obtain the tracked features tfk-1,k
- Compute correspondences between the tracked features tfk-1,k and Xk-1
- Estimate the camera pose (P3P), thus obtaining Tk = [R|t]
- Concatenate the transformation: Ck = Ck-1 Tk
- If optimisation is enabled, refine T by non-linear least-squares optimisation
- Store the information from the current frame as IL,k-1, IR,k-1, fL,k-1, fR,k-1, Xk-1 and Ck-1
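The optional refinement step minimises the reprojection error between the 3D points Xk-1 and their tracked 2D observations in the new left image. The sketch below shows one way this could look, using a plain Gauss-Newton iteration with a numerically differentiated Jacobian. It is not the optimiser used in this project. The pose parameterisation (axis-angle plus translation, mapping points from frame k-1 into the camera frame at k), the function names, and the calibration values are all assumptions made for illustration. Whether this transform or its inverse is the Tk to concatenate depends on the pose convention in use.

```python
import numpy as np

def rodrigues(rvec):
    """Convert an axis-angle vector into a 3x3 rotation matrix."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def project(pose, points_3d, f, cx, cy):
    """Project 3D points with pose = (rx, ry, rz, tx, ty, tz) and a pinhole model."""
    R, t = rodrigues(pose[:3]), pose[3:]
    cam = points_3d @ R.T + t
    u = f * cam[:, 0] / cam[:, 2] + cx
    v = f * cam[:, 1] / cam[:, 2] + cy
    return np.column_stack([u, v])

def residuals(pose, points_3d, observed_2d, f, cx, cy):
    """Stacked reprojection errors in pixels, shape (2N,)."""
    return (project(pose, points_3d, f, cx, cy) - observed_2d).ravel()

def refine_pose(pose0, points_3d, observed_2d, f, cx, cy, iterations=20, eps=1e-6):
    """Gauss-Newton refinement of a 6-parameter pose."""
    pose = np.asarray(pose0, dtype=float).copy()
    for _ in range(iterations):
        r = residuals(pose, points_3d, observed_2d, f, cx, cy)
        # Central-difference Jacobian of the residuals w.r.t. the pose.
        J = np.empty((r.size, 6))
        for j in range(6):
            d = np.zeros(6)
            d[j] = eps
            r_plus = residuals(pose + d, points_3d, observed_2d, f, cx, cy)
            r_minus = residuals(pose - d, points_3d, observed_2d, f, cx, cy)
            J[:, j] = (r_plus - r_minus) / (2.0 * eps)
        # Solve the linearised least-squares problem J @ step = -r.
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)
        pose += step
        if np.linalg.norm(step) < 1e-10:
            break
    return pose

# Synthetic, noise-free test data with assumed calibration values.
f, cx, cy = 700.0, 600.0, 180.0
rng = np.random.default_rng(0)
points = rng.uniform([-5.0, -2.0, 8.0], [5.0, 2.0, 30.0], size=(30, 3))
true_pose = np.array([0.02, -0.03, 0.01, 0.1, -0.05, 0.8])
observed = project(true_pose, points, f, cx, cy)

# Start from the identity pose, standing in for a rough initial estimate.
refined = refine_pose(np.zeros(6), points, observed, f, cx, cy)
print(np.round(refined, 6))
```

In practice the initial guess would come from the P3P estimate, and a robust loss or an outlier rejection stage such as RANSAC would usually be added, since mismatched features otherwise dominate a plain least-squares cost.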
Relative Camera Pose and Concatenation of Transformations (Source: E. F. Aguilar Calzadillas [1] )
- Implement Windowed Bundle Adjustment
- Implement Graph Based Optimisation in Visual SLAM
- Visualise 3D point cloud of scene using Point Cloud Library (PCL)
[1] E. F. Aguilar Calzadillas, "Sparse Stereo Visual Odometry with Local Non-Linear Least-Squares Optimization for Navigation of Autonomous Vehicles", M. A. Sc. Thesis, Department of Mechanical and Aerospace Engineering, Carleton University, Ottawa ON, Canada, 2019
[2] D. Scaramuzza, F. Fraundorfer, "Visual Odometry: Part I - The First 30 Years and Fundamentals", IEEE Robotics and Automation Magazine, Volume 18, issue 4, 2011
[3] F. Fraundorfer, D. Scaramuzza, "Visual odometry: Part II - Matching, robustness, optimization, and applications", IEEE Robotics and Automation Magazine, Volume 19, issue 2, 2012
[4] Avi Singh, Visual Odmetry from scratch - A tutorial for beginners, Avi Singh's Blog
Saksham Jindal ([email protected])