Publication: Modern Methods for Medieval Marseille: Prototyping Natural Language Processing Workflows for Entity Extraction from Highly Specialized Corpora
Abstract
This thesis investigates how current Natural Language Processing (NLP) tools and methods can be applied to historical corpora from the medieval period in a variety of languages. It is most specifically concerned with a set of household inventories from early 15th-century Marseille, written in Latin, which detail the tangible objects associated with a vast set of individual households. For this corpus I have developed a process that extracts objects and their attributes from dependency-parsed renderings of the inventories. This involved retraining a dependency parsing model on the particularities of the inventories and developing an extraction methodology, in order to create computationally usable, searchable versions of these material culture records. Along the way, the thesis demonstrates the departures from typical NLP practice that are necessary to bridge contemporary NLP and historical language.
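To make the idea of extracting objects and their attributes from a dependency parse more concrete, the following is a minimal sketch and not the thesis's actual pipeline. It assumes Universal Dependencies-style labels (`NOUN`, `amod`, `nummod`) and treats every noun as an object and its adjectival and numeral dependents as attributes. The `Token` class, the `extract_objects` function, and the Latin example phrase are all hypothetical and invented for illustration; the phrase is not drawn from the Marseille inventories.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int     # 1-based position in the sentence
    form: str    # surface form of the word
    upos: str    # part-of-speech tag (UD-style, assumed)
    head: int    # index of the head token (0 means root)
    deprel: str  # dependency relation to the head (UD-style, assumed)


# Assumption: these relations mark attributes of an object noun.
ATTRIBUTE_RELATIONS = {"amod", "nummod"}


def extract_objects(tokens):
    """Return (object, [attributes]) pairs, one per noun in the parse."""
    results = []
    for tok in tokens:
        if tok.upos != "NOUN":
            continue
        # Collect direct dependents of the noun that carry an attribute relation.
        attrs = [
            t.form
            for t in tokens
            if t.head == tok.idx and t.deprel in ATTRIBUTE_RELATIONS
        ]
        results.append((tok.form, attrs))
    return results


# Hand-built parse of an invented phrase, "duas mensas veteres" ("two old tables").
example = [
    Token(1, "duas", "NUM", 2, "nummod"),
    Token(2, "mensas", "NOUN", 0, "root"),
    Token(3, "veteres", "ADJ", 2, "amod"),
]

for obj, attrs in extract_objects(example):
    print(obj, attrs)
```

In practice the parse would come from a dependency parser retrained on the inventories rather than from hand-built tokens, and the set of relations treated as attributes would likely need tuning to the corpus.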