Code for robust visual pose estimation pipeline (end-to-end) for Spot with minimal input requirements.

Robust Visual Pose Estimation for In-the-Wild Videos of Spot

Semester Thesis, ETH Zurich, Autumn Semester 2023

Setting up, training, and testing custom pose estimation pipelines is non-trivial; it can be a tedious and time-consuming process. This repository aims to simplify it.

The main contributions can be summarized as follows:

  • A Docker container ready to run an extended version of OnePose++

  • OnePose++ extended with:

    • DeepSingleCameraCalibration for running inference on in-the-wild videos
    • CoTracker2 for pose optimization, improving the pose tracking performance by leveraging temporal cues as well¹

  • A low-barrier demo to help understand the whole pipeline and readily debug/test the code

  • Custom data for Spot & instructions on how to create synthetic data for your own use case

Installation

Hardware

Having a CUDA-enabled GPU is a must. The code was tested on the following GPU:

  • NVIDIA GeForce RTX 2080

with the following OS & driver versions:

DISTRIB_DESCRIPTION="Ubuntu 20.04.6 LTS"
NVIDIA Driver Version: 470.223.02
CUDA Version: 11.4
Docker Version: 24.0.7, build afdd53b

Code

Set up the code by cloning the repository, initializing the submodules and downloading the necessary models and demo data:

git clone git@github.com:mizeller/OnePose_ST.git
cd OnePose_ST
git submodule update --init --recursive
mkdir -p data weight 

The pre-trained models for OnePose++, LoFTR and CoTracker2 as well as the demo data can be found here. Place the model files in weight/ and the demo data in data/.
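For example, assuming the files were downloaded to ~/Downloads (a hypothetical location, adjust the paths to wherever you actually saved them), they can be moved into place like this:

# hypothetical download location; adjust paths as needed
mv ~/Downloads/LoFTR_wsize9.ckpt ~/Downloads/OnePosePlus_model.ckpt ~/Downloads/cotracker2.pth weight/
mv ~/Downloads/spot_demo data/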

At this point, the project directory should look like this:

.
├── assets
...
├── data
│   └── spot_demo
├── submodules
│   ├── CoTracker
│   ├── DeepLM
│   └── LoFTR
└── weight
    ├── LoFTR_wsize9.ckpt 
    ├── OnePosePlus_model.ckpt
    └── cotracker2.pth

Docker

To set up the docker container, either build it locally:

docker build -t="mizeller/spot_pose_estimation:00" .

or pull the pre-built image from Docker Hub:

docker pull mizeller/spot_pose_estimation:00

Next, run the container. There are several ways to do this.

If you're using VS Code's Dev Containers feature, simply press CTRL+SHIFT+P and select Rebuild and Reopen in Container. This will re-open the project inside the docker container.

Alternatively, you can run the docker container directly from the terminal. The following command mounts ${REPO_ROOT} in the container. Note that the shared memory size is set to 32 GB; adjust it to match your hardware if necessary.

REPO_ROOT=$(pwd)
docker run --gpus all --shm-size=32g -w /workspaces/OnePose_ST -v ${REPO_ROOT}:/workspaces/OnePose_ST -it mizeller/spot_pose_estimation:00
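As an optional sanity check (not part of the original instructions), you can confirm that the GPU is visible inside the container; with the NVIDIA container runtime, nvidia-smi is exposed in the container and should report the same driver and CUDA versions as on the host:

# run inside the container
nvidia-smi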

Demo: Training & Inference

To test the setup (training and inference), run the demo script from a terminal inside the docker container: sh demo.sh. This will run the following steps:

  1. Parse the demo data
  2. Train the OnePose++ model for Spot
  3. Run inference on the demo data captured using my phone

The results will be saved in the temp/ directory.

FYI: There are also custom debug entry points for each step of the pipeline. Have a look at the .vscode/launch.json.

Training Data

TODO: add comments about synthetic data pipeline & clean up the other repo as well

Acknowledgement & License

This repository is essentially a fork of the original OnePose++ repository; for more details, have a look at the original source here. Thanks to the original authors for their great work!

This repository uses several submodules, please refer to the respective repositories for their licenses.

Credits

This project was developed as part of my (Michel Zeller) Semester Thesis for the MSc in Mechanical Engineering at ETH Zurich. It was supervised by Dr. Hermann Blum (ETH, Computer Vision and Geometry Group) and Francesco Milano (ETH, Autonomous Systems Lab).

Footnotes

  1. Note: As of this writing, CoTracker2 is still a work in progress. The online tracker can only run on every 4th frame, which does not suffice for optimizing the pose estimation. That's why we currently use CoTracker as a post-processing step to optimize the poses for a given sequence. The 'yet' in this reply by the authors suggests that this feature will be added to CoTracker in the future. A possible initial implementation is on this feature branch. It has not been updated in a while...
