Bachelor's Thesis

This repository hosts a PDF version of my bachelor's thesis, which involved generating videos that are in sync with a given audio segment. The work uses a novel architecture trained on top of the latent space of a pretrained StyleGAN (FFHQ). The details are in the thesis PDF and in the set of slides that I presented during my viva.

This framework is presented below.

Given an initial latent, an LSTM model predicts the set of residuals corresponding to the motion for each frame. We use a FaceNet-based identity loss, an audio-visual sync loss, and a novel landmark regression loss to ensure that the new faces have the same identity and the right kind of motion.

Some of our results on HD YouTube videos that do not belong to our dataset are shown below.
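The residual-prediction step described above can be illustrated with a minimal NumPy sketch. This is not the thesis code: the class `TinyLSTM`, the function `predict_latents`, the dimensions, the random weights, and the choice to feed per-frame audio features together with the initial latent into the LSTM are all assumptions made for illustration. The thesis PDF describes the actual architecture.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyLSTM:
    """Hypothetical single-layer LSTM cell with a linear output head.

    Weights are random placeholders; a real model would be trained.
    """

    def __init__(self, audio_dim, latent_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = audio_dim + latent_dim + hidden_dim
        # One stacked matrix for the input, forget, cell, and output gates.
        self.W = rng.normal(0.0, 0.1, size=(4 * hidden_dim, in_dim))
        self.b = np.zeros(4 * hidden_dim)
        # Linear head mapping the hidden state to a latent-space residual.
        self.W_out = rng.normal(0.0, 0.1, size=(latent_dim, hidden_dim))
        self.hidden_dim = hidden_dim

    def step(self, audio_feat, w0, h, c):
        # Assumption: each step sees the frame's audio features and the initial latent.
        z = self.W @ np.concatenate([audio_feat, w0, h]) + self.b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        residual = self.W_out @ h
        return residual, h, c


def predict_latents(model, w0, audio_feats):
    """Return one latent per frame: the initial latent plus a predicted residual."""
    h = np.zeros(model.hidden_dim)
    c = np.zeros(model.hidden_dim)
    latents = []
    for audio_feat in audio_feats:
        residual, h, c = model.step(audio_feat, w0, h, c)
        latents.append(w0 + residual)
    return np.stack(latents)


# Example: 5 frames, 4-dimensional audio features, 8-dimensional latent.
model = TinyLSTM(audio_dim=4, latent_dim=8, hidden_dim=6)
w0 = np.ones(8)
frame_latents = predict_latents(model, w0, np.zeros((5, 4)))
print(frame_latents.shape)
```

In a setup like this, each row of `frame_latents` would be passed to the pretrained StyleGAN generator to render one video frame. Because every frame is expressed as the initial latent plus a residual, the residuals mainly need to encode motion, while the initial latent carries the identity.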

Files in this repository:

images
Bachelors_thesis.pdf
Bachelors_thesis_ppt.pdf
README.md
