Publication: Quantifying Similarity in Machine Learning Models
Date
2020-08-27
Authors
Greene, Andrew Marc
Citation
Greene, Andrew Marc. 2020. Quantifying Similarity in Machine Learning Models. Master's thesis, Harvard Extension School.
Abstract
One challenge in developing Machine Learning models, especially in the context of domain adaptation, is the difficulty of assessing the degree of similarity between the learned representations of two model instances. This is especially challenging when the instances do not share an underlying model architecture. Centered Kernel Alignment (CKA) is a promising technique that has been applied to compare layer-level activations between model instances using data from a particular domain.
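For reference, linear CKA (in the formulation of Kornblith et al., 2019) can be computed directly from two activation matrices. The following Python sketch is illustrative only and is not code from the thesis:

import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two activation matrices.

    X has shape (n_examples, n_features_x) and Y has shape
    (n_examples, n_features_y); rows correspond to the same n examples
    passed through the two layers (or models) being compared.
    """
    # Center each feature so the implicit Gram matrices are centered.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # CKA = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    num = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    den = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return num / den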
We hypothesize that CKA can be used effectively in two additional contexts. First, can we gain insights into redundant filters by applying CKA within layers? Second, can we better understand and guide the process of domain adaptation by comparing CKA results using data from the source and target domains?
We train a family of instances of a denoising autoencoder model, using two datasets: a natural-image domain comprising photographs of house numbers, and a synthetic-image domain simulating text on a page. We then fine-tune each instance on a much smaller subset of data from the opposite domain. We use CKA to compare the resulting model instances and demonstrate how to interpret the results to gain insights into the domain adaptation process.
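To illustrate how such a comparison might be organized (the model objects, layer names, and get_activations helper below are hypothetical placeholders, not the thesis's actual code), corresponding layers of two instances can be scored on a shared batch:

def layerwise_cka(model_a, model_b, batch, layer_names):
    """Score corresponding layers of two model instances with linear CKA.

    get_activations() is a hypothetical helper that runs `batch` through
    a model and returns the named layer's activations with shape
    (n_examples, ...); linear_cka() is defined in the sketch above.
    """
    n = len(batch)
    scores = {}
    for name in layer_names:
        acts_a = get_activations(model_a, name, batch).reshape(n, -1)
        acts_b = get_activations(model_b, name, batch).reshape(n, -1)
        scores[name] = linear_cka(acts_a, acts_b)
    return scores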
With this approach, we establish that the fine-tuned model instances retain more similarity with the checkpoints from which they are derived than with the corresponding models trained from scratch from the same random initializations. This result holds even when the accuracies of the fine-tuned and from-scratch models are the same. We also confirm the theoretical principle that domain adaptation occurs mostly in the later convolutional layers, while the low-level convolutional layers retain largely equivalent representations.
Keywords
machine learning, domain adaptation
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service