Scandia introduces. Handwritten Text Recognition and Processing with Large Language Models: Sources from New Sweden
DOI:
https://doi.org/10.47868/scandia.v92i1.29340
Keywords:
Handwritten Text Recognition, Large Language Model, Colonialism, New Sweden, Empire, ChatGPT
Abstract
This article describes a pilot study in which records from the Swedish National Archives concerning the colony of New Sweden (1638–1655) were transcribed, translated, and summarised using Handwritten Text Recognition (HTR) via Transkribus and the Large Language Model (LLM) ChatGPT. The project was carried out within the framework of “Empire”, a research project at Trinity College Dublin that examines Ireland’s role in the building of the English empire during the seventeenth and eighteenth centuries. The aim was to test how effectively these tools could process this extensive material (over 130 folio pages), which comprises documents in Swedish, German, Dutch, and Latin. The article explains the choice of HTR models, particularly how a combination of Transkribus and transformer-based models such as “Text Titan I” and “Swedish Lion I” was used to optimise speed and reach an acceptable level of accuracy. An important aspect was to assess the ability of LLMs to summarise and translate fragmentary and partially illegible texts, and to weigh accuracy against efficiency. The article also presents an example from a 1643 court case between Governor Johan Printz and the English merchant George Lamberton, demonstrating the method’s potential for interpreting complex sources. Finally, the author discusses the future potential of this technology and possible avenues for further development, emphasising that although the technology requires human oversight, it is a powerful labour-saving tool that can transform how historical material is made accessible and analysed.
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.