MURAL - Maynooth University Research Archive Library



    Strategies in tracing linguistic variation in a corpus of Old Irish texts (CorPH)


    Stifter, David and Qiu, Fangzhe and Aquino-López, Marco A. and Bauer, Bernhard and Lash, Elliott and White, Nora (2022) Strategies in tracing linguistic variation in a corpus of Old Irish texts (CorPH). International Journal of Corpus Linguistics, 27 (4). pp. 529-553. ISSN 1384-6655



    Abstract

    This article introduces Corpus PalaeoHibernicum (CorPH), a corpus currently consisting of 78 texts in Early Irish (c. 7th–10th cent.). It was created by the ERC-funded Chronologicon Hibernicum (ChronHib) project by bringing together pre-existing lexical and syntactic databases and adding further crucial texts from the period. CorPH is annotated for POS, morphological and syntactic information, and a further layer of annotation has been developed for it: ‘Variation Tagging’, i.e. a tagset that numerically encodes synchronic language variation during the Early Irish period, thus allowing for much improved research on the chronological variation among the material. A second new pillar for studying linguistic variation is Bayesian Language Variation Analysis (BLaVA), which addresses the challenge that “not-so-big data” poses to statistical corpus methods. Instead of reflecting feature frequencies, BLaVA models language variation as probabilities of variation.

    Item Type: Article
    Keywords: Bayesian statistics; Chronologicon Hibernicum; diachronic variation; Old Irish; variation tagging
    Academic Unit: Faculty of Arts & Humanities > School of Celtic Studies > Early Irish (Sean Ghaeilge)
    Item ID: 16892
    Identification Number: https://doi.org/10.1075/ijcl.22018.sti
    Depositing User: Prof. David Stifter
    Date Deposited: 02 Feb 2023 11:52
    Journal or Publication Title: International Journal of Corpus Linguistics
    Publisher: John Benjamins Publishing
    Refereed: No
    URI:
    Use Licence: This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA).
