Publication:
Geometric Methods for Quantitative Analysis of Romance Languages

Date

2024-11-26

Citation

McDonald, Patrick William. 2024. Geometric Methods for Quantitative Analysis of Romance Languages. Bachelor's thesis, Harvard University Engineering and Applied Sciences.

Abstract

Previous work has introduced various quantitative methods for investigating the historical and/or phonetic interrelation of languages and their speakers. Separately, environments such as hyperbolic space have been found, both theoretically and empirically, to be well suited to representing hierarchically structured datasets, such as phylogenetic cell data. This thesis tests the suitability of hyperbolic space for representing pronunciation data from several Romance languages, a linguistic family that apparently developed along a hierarchical structure, i.e., one in which modern languages are interrelated via tree-like descent from common ancestors. The thesis involves Python implementations of (a) a pipeline that transforms audio files into workable mathematical objects and (b) baseline methods for aggregating and analyzing this speech data with respect to language-wise covariance structures. We then outline a framework for analyzing the speech data in a hyperbolic setting and compare its performance to that of the baseline methods on two tasks: (a) language space reconstruction and (b) interspeaker interpolation. We find that, with proper hyperparameter tuning, the Poincaré disk model of hyperbolic geometry is indeed capable of representing the language space and speaker interrelations apparent in our Romance language dataset, suggesting that the hyperbolic setting could be a promising quantitative framework for future linguistic analysis.
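
For readers unfamiliar with the Poincaré disk model named in the abstract, the following is a minimal illustrative sketch of its standard geodesic distance, d(u, v) = arcosh(1 + 2|u - v|^2 / ((1 - |u|^2)(1 - |v|^2))). It is not taken from the thesis; the function name `poincare_distance` and the example points are assumptions chosen for illustration only.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit disk.

    Hypothetical helper for illustration; not the thesis implementation.
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit disk")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Standard Poincare disk metric: distances blow up near the boundary.
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


# Two pairs of points with the same Euclidean separation (0.1):
near_center = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_edge = poincare_distance((0.8, 0.0), (0.9, 0.0))
print(near_center, near_edge)
```

The pair nearer the boundary is farther apart in hyperbolic terms than the pair near the center, even though the Euclidean gap is identical. This growth of available space toward the boundary is the property usually cited as making the disk a natural fit for tree-like data.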

Keywords

Applied mathematics, Computer science, Linguistics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
