Saturday, August 16, 2014

Ville Marttila: Creating Digital Editions for Corpus Linguistics

Ville Marttila is defending his thesis "Creating Digital Editions for Corpus Linguistics - The case of Potage Dyvers, a family of six Middle English recipe collections". Professor Graham Caie (University of Glasgow) serves as the opponent, and Professor Minna Palander-Collin as the custos.

In his lectio precursoria, Marttila explained that his work consists of two main parts. From the point of view of historical linguistics, the focus is on Middle English recipe collections. The other contribution, highly relevant for digital humanities, is the development of a coherent methodology for the contextualized study of historical texts using corpus-linguistic methods. A central objective has been to go beyond the traditional focus on language as disembodied sequences of written or spoken utterances, represented in digital form as linear streams of character data. Marttila describes this broadening as follows in the introduction to his thesis:

"[...] new trends in historical linguistics [...] emphasising the importance of studying historical language use in its original context. Since the situational context and much of the cultural context of historical texts is inaccessible to us, the importance of the documentary context is relatively much greater than for present-day texts, both because it is all we have, and because the pragmatic functions of its material features are less well known." (p. 1)

One important contribution of the work is based on the idea of layered annotation. Marttila describes in detail three kinds of annotation: documentary, descriptive and analytical. By using annotation overlays, one can highlight the differing ontological status of the descriptive annotation versus the documentary and analytical annotation. The division can also be useful in defining editorial responsibilities. Where possible, the framework builds on existing standard solutions, combining them into a coherent whole.
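To make the idea of annotation overlays more concrete, here is a minimal sketch of how separate layers can point into the same base text without being mixed into it. Only the three layer names come from the description above. The class and function names, the character offsets, the labels and the sample line are all assumptions made for illustration; this is not the encoding used in the thesis.

```python
from dataclasses import dataclass

# Layer names follow the three annotation types named in the post.
LAYERS = ("documentary", "descriptive", "analytical")


@dataclass(frozen=True)
class Annotation:
    layer: str   # one of LAYERS
    start: int   # character offset into the base text (inclusive)
    end: int     # character offset into the base text (exclusive)
    label: str   # free-form tag; the values used below are invented


def overlay(annotations, layer):
    """Return only the annotations that belong to one layer."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return [a for a in annotations if a.layer == layer]


def annotated_spans(text, annotations, layer):
    """Pair each annotation in a layer with the stretch of text it covers."""
    return [(text[a.start:a.end], a.label) for a in overlay(annotations, layer)]


# Invented sample line, not quoted from the Potage Dyvers manuscripts.
text = "Take fayre porke and sethe it"

# Hypothetical annotations; each layer is kept apart from the others
# and from the base text itself.
annotations = [
    Annotation("documentary", 0, 1, "enlarged initial"),
    Annotation("descriptive", 0, 29, "recipe instruction"),
    Annotation("analytical", 11, 16, "noun"),
    Annotation("analytical", 21, 26, "verb"),
]

print(annotated_spans(text, annotations, "analytical"))
```

Because each layer is stored apart from the others, a reader can switch one overlay on or off without touching the rest, and responsibility for each layer could in principle be assigned to a different editor.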

The opponent, Professor Caie, opened his remarks very positively, stating that the thesis is absolutely brilliant and even characterizing it as the most impressive thesis he has read in his 40-year career. He also mentioned the VARIENG unit at the University of Helsinki as one of the leading centers of historical corpus linguistics. In his view, the work contains enough material for two or three theses and is impressive not only in quantity but also in quality.