
Input video instead of image for inference. #45

Open
loic-combis opened this issue Feb 22, 2024 · 0 comments

Comments

@loic-combis

Hi there,

I'm trying to test StyleAvatar to correct the lip-sync generation from Wav2Lip.

(I currently have an issue with the preprocessing in FaceVerse: LizhenWangT/FaceVerse#36)

But regardless, we aim to do the following:

  • (1) Train StyleAvatar on the original video/face.
  • (2) Run Wav2Lip on the original video with new audio, producing a new video with poor-quality lip sync.
  • Question for this step: can we use the fine-tuned StyleAvatar model from step 1 to correct the lip movement in the video generated in step 2?

Note that we want to avoid inputting a single image and regenerating the whole speech on the avatar; rather, we just want to correct the existing video. A rough sketch of the pipeline we have in mind follows below.
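For concreteness, here is a minimal sketch of what we're trying to do. The Wav2Lip invocation follows its README; the StyleAvatar refinement step (`refine.py` and its flags) is purely a hypothetical placeholder, since whether such a correction pass is possible is exactly what this issue is asking:

```python
# Rough pipeline sketch. Wav2Lip's inference.py flags are taken from its
# README; the StyleAvatar "refine.py" step is a hypothetical placeholder.
import subprocess

ORIGINAL_VIDEO = "original.mp4"   # footage StyleAvatar was fine-tuned on (step 1)
NEW_AUDIO = "new_speech.wav"      # replacement speech track

# Step 2: re-dub the original video with Wav2Lip (flags per its README).
subprocess.run([
    "python", "inference.py",
    "--checkpoint_path", "checkpoints/wav2lip_gan.pth",
    "--face", ORIGINAL_VIDEO,
    "--audio", NEW_AUDIO,
    "--outfile", "wav2lip_result.mp4",
], check=True)

# Step 3 (the open question): pass the Wav2Lip output through the
# fine-tuned StyleAvatar model to refine the mouth region only.
# "refine.py" and its flags do not exist in the StyleAvatar repo;
# they stand in for whatever the correct entry point would be.
subprocess.run([
    "python", "refine.py",
    "--checkpoint", "checkpoints/styleavatar_finetuned.pth",
    "--input", "wav2lip_result.mp4",
    "--output", "refined_result.mp4",
], check=True)
```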

Thanks for your help and your work!!
