Publication: Modern Methods for Medieval Marseille: Prototyping Natural Language Processing Workflows for Entity Extraction from Highly Specialized Corpora
Date
2022-06-03
Authors
Humphrey, Emma Louise
The Harvard community has made this article openly available.
Citation
Humphrey, Emma Louise. 2022. Modern Methods for Medieval Marseille: Prototyping Natural Language Processing Workflows for Entity Extraction from Highly Specialized Corpora. Bachelor's thesis, Harvard College.
Abstract
This thesis investigates how current Natural Language Processing (NLP) tools and methods can be applied to historical corpora in a variety of languages from the medieval period. It is most specifically concerned with a set of household inventories from early 15th-century Marseille, written in Latin, which detail the tangible objects associated with a vast set of individual households. For this corpus I have developed a process that extracts objects and their attributes from dependency-parsed renderings of these inventories. This involved retraining a dependency parsing model to account for the inventories' particularities and developing an extraction methodology, in order to create computationally relevant, searchable versions of these material culture records. Along the way, the thesis demonstrates the divergences from typical process necessary to bridge contemporary NLP and historical language.
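To make the idea of extracting objects and their attributes from a dependency parse concrete, the following is a minimal sketch and not the thesis's actual pipeline. It assumes tokens in CoNLL-U-style columns with Universal Dependencies labels, and it treats nouns as objects and their `amod` and `nummod` dependents as attributes. The Latin phrase, the hand-written parse, and the names `TOKENS`, `ATTRIBUTE_RELATIONS`, and `extract_objects` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical, hand-written parse of "unam mensam magnam et duas cathedras"
# (illustrative only; not taken from the Marseille corpus).
# Each token: (id, form, lemma, upos, head, deprel), as in CoNLL-U columns.
TOKENS = [
    (1, "unam", "unus", "NUM", 2, "nummod"),
    (2, "mensam", "mensa", "NOUN", 0, "root"),
    (3, "magnam", "magnus", "ADJ", 2, "amod"),
    (4, "et", "et", "CCONJ", 6, "cc"),
    (5, "duas", "duo", "NUM", 6, "nummod"),
    (6, "cathedras", "cathedra", "NOUN", 2, "conj"),
]

# Assumption: only adjectival and numeric modifiers count as attributes.
ATTRIBUTE_RELATIONS = {"amod", "nummod"}


def extract_objects(tokens):
    """Return one record per noun, listing the lemmas of its modifiers."""
    # Group attribute lemmas under the id of the token they depend on.
    attributes = defaultdict(list)
    for tok_id, form, lemma, upos, head, deprel in tokens:
        if deprel in ATTRIBUTE_RELATIONS:
            attributes[head].append(lemma)

    # Emit every noun as an object, in sentence order.
    return [
        {"object": lemma, "attributes": attributes.get(tok_id, [])}
        for tok_id, form, lemma, upos, head, deprel in tokens
        if upos == "NOUN"
    ]


if __name__ == "__main__":
    for record in extract_objects(TOKENS):
        print(record)
```

Records of this shape could then be indexed or searched by object lemma. A real inventory would likely need additional relations and domain-specific handling beyond what this sketch covers.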
Keywords
Computer science, History
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service