Publication:

Do All Roads Lead to Rome? Exploring Representational Similarities Between Latent Spaces of Generative Image Models

Date

2025-03-14

Published Version

The Harvard community has made this article openly available. Please share how this access benefits you.

Citation

Badrinath, Charumathi. 2024. Do All Roads Lead to Rome? Exploring Representational Similarities Between Latent Spaces of Generative Image Models. Bachelors Thesis, Harvard University Engineering and Applied Sciences.

Abstract

Generative image models learn to produce images by transforming vectors drawn from a marginally simple latent space. Extensive research has shown that latent spaces develop meaningful structure: images with different values of an attribute (e.g. bright versus dark lighting) correspond to vectors from different parts of the latent space. Less work has gone into comparing the latent spaces of different models. In this thesis, we address this gap, building on preliminary work showing that a vector in the latent space of one model, corresponding to some image, can be linearly mapped to a vector in the latent space of another model that corresponds to a very similar image. Using this methodology in tandem with other metrics, we compare four model architectures trained on two datasets and find that much of the semantic structure of their latent spaces is the same up to a linear transformation. The level of similarity is highest between models with similar architectures as well as between expressive models. The set of similarly represented features is not always intuitive; while gender is represented similarly in all face-generating models, class (e.g. airplane, horse, etc.) is represented differently among multi-class image-generating models. We also use linear maps as tools to investigate how the structure of a face-generating model’s latent space changes during training, concluding that the representation of gender-related concepts becomes more disjoint while that of orthogonal concepts like skin tone remains stable. We also find that intermediate latent spaces in models with hierarchical latent space structures are similar.
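
To make the idea of linearly mapping one latent space to another concrete, the following is a minimal illustrative sketch, not the thesis's actual implementation. It assumes paired latent vectors are already available (row i of each array corresponding to the same or a very similar image), fits the map by ordinary least squares, and includes a bias term; all of these choices, the function names fit_linear_map and apply_linear_map, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def fit_linear_map(z_src, z_tgt):
    """Fit a least-squares map (with bias) from source to target latents.

    z_src: array of shape (n, d_src); z_tgt: array of shape (n, d_tgt).
    Rows are assumed to be paired (hypothetical setup for illustration).
    """
    # Append a constant column so the fitted map includes a bias term.
    ones = np.ones((z_src.shape[0], 1))
    design = np.hstack([z_src, ones])
    weights, *_ = np.linalg.lstsq(design, z_tgt, rcond=None)
    return weights  # shape (d_src + 1, d_tgt)


def apply_linear_map(weights, z_src):
    """Map source latents into the target latent space."""
    ones = np.ones((z_src.shape[0], 1))
    return np.hstack([z_src, ones]) @ weights


# Synthetic stand-in for paired latents from two models (assumption:
# the target latents are a noisy linear function of the source latents).
rng = np.random.default_rng(0)
z_a = rng.normal(size=(500, 8))
true_w = rng.normal(size=(8, 6))
z_b = z_a @ true_w + 0.01 * rng.normal(size=(500, 6))

w = fit_linear_map(z_a, z_b)
pred = apply_linear_map(w, z_a)

# Relative reconstruction error: small values indicate that the two
# spaces agree up to a linear transformation on these pairs.
rel_err = np.linalg.norm(pred - z_b) / np.linalg.norm(z_b)
print(f"relative error: {rel_err:.4f}")
```

In a real comparison the error would be measured on held-out pairs, and the mapped vectors would be decoded by the second model to check that the resulting images look similar.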

Keywords

Computer science, Statistics, Artificial intelligence

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
