shani1610/object-insertion-video-diffusion

Object Insertion into a Video Using Diffusion model

This repository documents an investigation into inserting objects into human-object interaction videos using tuning methods built on the Stable Diffusion model.

Installation

To reproduce my results or to use Tune-A-Video for other purposes, follow the detailed installation guidelines in tuneavideo_installation.md. The output of conda list, which records every package in the working environment along with its version, is in conda_list.md.

Usage

Details on how to use the project.

[Image: suitcase]

Human Evaluation

The survey can be found at this link.

Tree

.
└── object-insertion-video-diffusion/
    ├── docker
    ├── human_evaluation/
    │   ├── Images/
    │   │   └── ...
    │   ├── data_survey.csv
    │   └── analyzing_survey.ipynb
    ├── tools
    ├── Tune-A-Video/
    │   ├── data (extract here)/
    │   │   └── my_pairs/
    │   │       ├── original/
    │   │       │   ├── object1.mp4
    │   │       │   ├── object2.mp4
    │   │       │   └── ...
    │   │       └── pretending/
    │   │           ├── object1.mp4
    │   │           └── object2.mp4
    │   ├── configs/
    │   │   ├── original/
    │   │   │   ├── object1.yml
    │   │   │   └── ...
    │   │   └── pretending
    │   ├── scripts
    │   └── infer_args.py
    ├── pod.yml
    ├── README.md
    ├── conda_list.md
    └── tuneavideo_installation.md
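
The layout above keeps each clip twice under my_pairs, once in original/ and once in pretending/. As a minimal sketch, and assuming the two folders are meant to be matched by file name (object1.mp4 with object1.mp4), the following helper lists the pairs and reports any clip that has no counterpart. The function name `find_video_pairs` is hypothetical and not part of the repository, and the path in the example assumes that "data (extract here)" means the archive is extracted into Tune-A-Video/data.

```python
from pathlib import Path


def find_video_pairs(pairs_dir):
    """Match videos in original/ and pretending/ by file name.

    Returns (pairs, unmatched): pairs maps a stem such as "object1" to an
    (original_path, pretending_path) tuple; unmatched lists the stems that
    appear in only one of the two folders.
    """
    pairs_dir = Path(pairs_dir)
    original = {p.stem: p for p in (pairs_dir / "original").glob("*.mp4")}
    pretending = {p.stem: p for p in (pairs_dir / "pretending").glob("*.mp4")}

    # Stems present in both folders form a pair.
    pairs = {
        stem: (original[stem], pretending[stem])
        for stem in sorted(original.keys() & pretending.keys())
    }
    # Stems present in exactly one folder are reported separately.
    unmatched = sorted(original.keys() ^ pretending.keys())
    return pairs, unmatched


if __name__ == "__main__":
    # Assumed location of the extracted data; adjust to your checkout.
    pairs, unmatched = find_video_pairs("Tune-A-Video/data/my_pairs")
    for stem, (orig, pretend) in pairs.items():
        print(f"{stem}: {orig} <-> {pretend}")
    if unmatched:
        print("Videos without a counterpart:", ", ".join(unmatched))
```

Running the script before tuning is a quick way to confirm that the data was extracted into the expected place; an empty result usually means the path is wrong rather than that the data is missing.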

Contributing

Guidelines for contributing to the project.

License

If you use the provided dataset or any other part of this work, please cite it using:

@inproceedings{objectinsert2024,
  title={Object Insertion into a Video Using Diffusion model},
  author={Israelov, Shani},
  year={2024}
}

Acknowledgements

This work builds on Tune-A-Video:

@inproceedings{wu2023tune,
  title={Tune-a-video: One-shot tuning of image diffusion models for text-to-video generation},
  author={Wu, Jay Zhangjie and Ge, Yixiao and Wang, Xintao and Lei, Stan Weixian and Gu, Yuchao and Shi, Yufei and Hsu, Wynne and Shan, Ying and Qie, Xiaohu and Shou, Mike Zheng},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={7623--7633},
  year={2023}
}
