Video Synthesis from the StyleGAN Latent Space

Publication Year: 2020

Metrics Details

Usage: 3,256
- Downloads: 2,891
- Abstract Views: 365

Thesis / Dissertation Description

Generative models have produced impressive synthetic images, but video synthesis remains difficult even for these models. The best videos that generative models can currently create are only a few seconds long, distorted, and low resolution. For this project, I propose and implement a model that synthesizes videos of human facial expressions at 1024x1024x32 resolution from static images generated by a Generative Adversarial Network trained on human face images. To the best of my knowledge, this is the first work that generates realistic videos larger than 256x256 resolution from single starting images. The model improves video synthesis both quantitatively and qualitatively compared to two state-of-the-art models, TGAN and MoCoGAN. In the quantitative comparison, this project reaches a best Average Content Distance (ACD) score of 0.167 (lower is better), compared to 0.305 for TGAN and 0.201 for MoCoGAN.
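
For readers unfamiliar with the metric, the following is a minimal sketch of how an ACD-style score is commonly computed for face videos: the average pairwise L2 distance between per-frame feature vectors of one video, so a video whose frames keep the same identity scores closer to zero. The function name `average_content_distance` is hypothetical, the feature extractor (for example, a face-embedding network) is assumed and not shown, and the exact evaluation protocol used in the thesis is not stated in this description.

```python
from itertools import combinations
from math import dist
from statistics import fmean


def average_content_distance(frame_features):
    """Mean pairwise L2 distance between per-frame feature vectors.

    frame_features: a sequence of equal-length numeric vectors, one per
    video frame. How these vectors are produced is an assumption left to
    the caller (e.g. embeddings from a face-recognition network).
    """
    frames = [tuple(f) for f in frame_features]
    if len(frames) < 2:
        raise ValueError("need at least two frames to compare")
    # Each unordered pair of frames contributes one Euclidean distance.
    return fmean(dist(a, b) for a, b in combinations(frames, 2))


if __name__ == "__main__":
    # Toy 2-D "embeddings" for a three-frame clip; real features would be
    # much higher dimensional.
    clip = [(0.10, 0.20), (0.12, 0.19), (0.11, 0.22)]
    print(average_content_distance(clip))
```

In practice the score would be averaged over many generated videos; that outer loop is omitted here.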

Bibliographic Details

DOI: 10.31979/etd.ywry-3qps

REPOSITORY URL: https://scholarworks.sjsu.edu/etd_projects/924

URL ID: https://scholarworks.sjsu.edu/etd_projects/924; https://scholarworks.sjsu.edu/cgi/viewcontent.cgi?article=1922&context=etd_projects&unstamped=1; http://dx.doi.org/10.31979/etd.ywry-3qps; https://scholarworks.sjsu.edu/cgi/viewcontent.cgi?article=1922&context=etd_projects; https://dx.doi.org/10.31979/etd.ywry-3qps; https://scholarworks.sjsu.edu/etd_projects/924/

AUTHOR(S)

Lei Zhang

PUBLISHER(S)

San Jose State University Library
