Publication:
Quantifying Similarity in Machine Learning Models

Date

2020-08-27

The Harvard community has made this article openly available.

Citation

Greene, Andrew Marc. 2020. Quantifying Similarity in Machine Learning Models. Master's thesis, Harvard Extension School.

Abstract

One challenge in developing machine learning models, especially in the context of domain adaptation, is the difficulty of assessing the degree of similarity between the learned representations of two model instances. This is especially challenging when the instances do not share an underlying model architecture. Centered Kernel Alignment (CKA) is a promising technique that has been applied to compare layer-level activations between model instances using data from a particular domain. We hypothesize that CKA can be used effectively in two additional contexts. First, can we gain insights into redundant filters by applying CKA within layers? Second, can we better understand and guide the process of domain adaptation by comparing CKA results using data from the source and target domains? We train a family of instances of a denoising autoencoder model, using two datasets: a natural-image domain comprising photographs of house numbers, and a synthetic-image domain simulating text on a page. We then fine-tune each instance on a much smaller subset of data from the opposite domain. We use CKA to compare the resulting model instances and demonstrate how to interpret the results to gain insights into the domain adaptation process. With this approach, we establish that the fine-tuned model instances retain more similarity with the checkpoints from which they are derived than with the corresponding models that have been trained from scratch on the same random initializations. This result holds even when the accuracies of the fine-tuned and from-scratch models are the same. We also confirm the theoretical principle that domain adaptation mostly occurs in the later convolutional layers, while the low-level convolutional layers retain mostly equivalent representations.
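For reference, the sketch below shows the linear form of CKA as defined by Kornblith et al. (2019), which compares two activation matrices recorded for the same examples even when their feature dimensions differ. This is an illustrative assumption about the similarity measure, not necessarily the exact variant used in the thesis (which could, for example, use an RBF-kernel CKA instead).

    import numpy as np

    def linear_cka(X, Y):
        # Linear Centered Kernel Alignment (Kornblith et al., 2019).
        # X and Y have shape (n_examples, n_features) and hold the
        # activations of two layers (or two model instances) on the
        # same examples; the feature dimensions may differ.
        # Returns a similarity score in [0, 1].
        X = X - X.mean(axis=0, keepdims=True)  # center each feature
        Y = Y - Y.mean(axis=0, keepdims=True)
        cross = np.linalg.norm(Y.T @ X, ord='fro') ** 2
        return cross / (np.linalg.norm(X.T @ X, ord='fro')
                        * np.linalg.norm(Y.T @ Y, ord='fro'))

A score near 1 indicates that the two sets of activations are equivalent up to an orthogonal transformation and isotropic scaling; a score near 0 indicates little shared structure.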

Keywords

machine learning, domain adaptation

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
