Publication:
Modern Methods for Medieval Marseille: Prototyping Natural Language Processing Workflows for Entity Extraction from Highly Specialized Corpora

Date

2022-06-03

Citation

Humphrey, Emma Louise. 2022. Modern Methods for Medieval Marseille: Prototyping Natural Language Processing Workflows for Entity Extraction from Highly Specialized Corpora. Bachelor's thesis, Harvard College.

Abstract

This thesis investigates how current Natural Language Processing (NLP) tools and methods can be applied to historical corpora in a variety of languages from the medieval period. It is most specifically concerned with a set of household inventories from early 15th-century Marseille, written in Latin, that detail the tangible objects associated with a vast set of individual households. For this corpus I have developed a process that extracts objects and their attributes from dependency-parsed renderings of these inventories. This involved retraining a dependency-parsing model on the particularities of the inventories and developing an extraction methodology in order to create computationally relevant, searchable versions of these material culture records. Along the way, the work demonstrates the divergences from typical process that are necessary to bridge contemporary NLP and historical language.
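
To make the idea of extracting objects and their attributes from a dependency parse concrete, the following is a minimal sketch and not the thesis's actual pipeline. It assumes tokens annotated in the style of Universal Dependencies (part-of-speech tag, head index, relation label). The `Token` type, the `extract_objects` function, the choice of `amod` and `nummod` as attribute relations, and the hand-written parse of the Latin phrase "unum lectum magnum" ("one large bed") are all illustrative assumptions.

```python
from collections import defaultdict
from typing import NamedTuple


class Token(NamedTuple):
    """One token of a dependency parse (hypothetical, CoNLL-U-like fields)."""
    id: int       # 1-based position in the sentence
    form: str     # surface form as written in the inventory
    lemma: str    # dictionary form
    upos: str     # universal part-of-speech tag
    head: int     # id of the governing token, 0 for the root
    deprel: str   # dependency relation to the head


# Assumption: adjectival and numeric modifiers count as object attributes.
ATTRIBUTE_RELATIONS = {"amod", "nummod"}


def extract_objects(tokens):
    """Return (object lemma, [attribute lemmas]) for every noun in the parse."""
    # Index each token under its head so dependents can be looked up directly.
    children = defaultdict(list)
    for tok in tokens:
        children[tok.head].append(tok)

    results = []
    for tok in tokens:
        if tok.upos == "NOUN":
            attrs = [
                child.lemma
                for child in children[tok.id]
                if child.deprel in ATTRIBUTE_RELATIONS
            ]
            results.append((tok.lemma, attrs))
    return results


# Hand-written illustrative parse, not output from a trained model.
sentence = [
    Token(1, "unum", "unus", "NUM", 2, "nummod"),
    Token(2, "lectum", "lectus", "NOUN", 0, "root"),
    Token(3, "magnum", "magnus", "ADJ", 2, "amod"),
]

print(extract_objects(sentence))  # [('lectus', ['unus', 'magnus'])]
```

Working from lemmas rather than surface forms is one way such records could be made searchable across inflected spellings. A real workflow would read the parse from a trained parser's output instead of a hand-built list.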

Keywords

Computer science, History

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.