MURAL - Maynooth University Research Archive Library



    Web service for 19th century Irish personal name matching


    Wangrungarun, Phattara (2015) Web service for 19th century Irish personal name matching. Masters thesis, National University of Ireland Maynooth.


    Abstract

    Census materials from before the first Irish civil registration in 1864 are mostly lost or incomplete, so genealogical research relies on parish records and on 'census substitute' documents such as land ownership and tenancy records. However, some of these documents may not contain enough information to identify individuals: some contain a name and an address, whereas others contain only a name. Record linkage is one method of gathering information scattered across many documents. It uses a person's name as a reference to link that person's information between documents, and with patience more complete information about that person can be obtained. Linking, or matching, a person's name is therefore an important part of the process. Unfortunately, in 19th century Ireland there was no standard spelling of names, handwriting could be difficult to read, and contractions or abbreviations were often used. A name with a single pronunciation, belonging to a single individual, could be written in many different ways. Moreover, Irish-language names equivalent to English names were also used; for example, an Irish version of 'Smith' could be 'Gowan'. A further complication is that historical and genealogical research often requires large quantities of names to be matched.

    Various solutions have been created to handle these variations and to match different names that refer to the same person. However, to the best of our knowledge, there is as yet no public system that combines those solutions and provides a bulk name matching service. We therefore developed a web service using the Ruby on Rails framework. The system initially implements four matching algorithms: Levenshtein distance, Soundex, Irish Soundex, and a lookup table. We also present a web interface that lets a client use the system from a web browser. The system is designed to be simple, and to be extensible through inheritance. It performs matching on large quantities of names in a reasonable time: in our test, 12,944 name matchings completed in under half a minute (28,786 milliseconds). However, the system consumes a large amount of memory (around 373 megabytes). We believe that proper optimisation could reduce both memory usage and processing time. Further matching algorithms could also be implemented for names in other languages, so that the system can handle a broader domain of names.
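The abstract names the matching algorithms and the inheritance-based design but does not show them. The following is a minimal Ruby sketch of how such matchers might be organised, not the thesis's actual code: the class names (`NameMatcher`, `LevenshteinMatcher`, `SoundexMatcher`, `LookupMatcher`), the `match_pairs` helper, and the default distance threshold of 2 are all assumptions made for illustration. Irish Soundex is omitted because its rules are not given in the abstract.

```ruby
# Hypothetical base class: a new algorithm is added by subclassing
# and implementing match?(a, b).
class NameMatcher
  def match?(_a, _b)
    raise NotImplementedError, "#{self.class} must implement match?"
  end

  private

  # Trim whitespace and ignore case before comparing names.
  def normalise(name)
    name.to_s.strip.downcase
  end
end

# Matches names whose edit distance is within an assumed threshold.
class LevenshteinMatcher < NameMatcher
  def initialize(max_distance = 2)
    @max_distance = max_distance
  end

  def match?(a, b)
    distance(normalise(a), normalise(b)) <= @max_distance
  end

  # Classic dynamic-programming edit distance, keeping one row at a time.
  def distance(s, t)
    previous = (0..t.length).to_a
    s.each_char.with_index(1) do |sc, i|
      current = [i]
      t.each_char.with_index(1) do |tc, j|
        cost = sc == tc ? 0 : 1
        current << [previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost].min
      end
      previous = current
    end
    previous.last
  end
end

# Standard (American) Soundex: first letter plus three digits.
class SoundexMatcher < NameMatcher
  CODES = {
    "b" => "1", "f" => "1", "p" => "1", "v" => "1",
    "c" => "2", "g" => "2", "j" => "2", "k" => "2",
    "q" => "2", "s" => "2", "x" => "2", "z" => "2",
    "d" => "3", "t" => "3", "l" => "4",
    "m" => "5", "n" => "5", "r" => "6"
  }.freeze

  def match?(a, b)
    code(a) == code(b)
  end

  def code(name)
    letters = normalise(name).gsub(/[^a-z]/, "")
    return "" if letters.empty?

    result = letters[0].upcase
    last = CODES[letters[0]]
    letters[1..-1].each_char do |ch|
      next if ch == "h" || ch == "w" # h and w do not separate equal codes
      digit = CODES[ch]
      result << digit if digit && digit != last
      last = digit # a vowel resets last to nil, allowing a repeated code
    end
    result.ljust(4, "0")[0, 4]
  end
end

# Matches names listed in the same equivalence group of a lookup table.
class LookupMatcher < NameMatcher
  def initialize(groups)
    @group_of = {}
    groups.each_with_index do |names, id|
      names.each { |n| @group_of[normalise(n)] = id }
    end
  end

  def match?(a, b)
    group = @group_of[normalise(a)]
    !group.nil? && group == @group_of[normalise(b)]
  end
end

# Bulk matching: for each pair, list the matchers that accept it.
def match_pairs(pairs, matchers)
  pairs.map do |a, b|
    [a, b, matchers.select { |m| m.match?(a, b) }.map { |m| m.class.name }]
  end
end
```

Under these assumptions, 'Smith' and 'Smyth' are accepted by both the edit-distance and Soundex matchers, while 'Smith' and 'Gowan' are accepted only by a lookup table that lists them as equivalent, which is the case the abstract gives for Irish-language variants.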
    Item Type: Thesis (Masters)
    Additional Information: Taught Masters Thesis for the Erasmus Mundus MSc in Dependable Software Systems
    Keywords: Web service; 19th century; Irish personal name matching
    Academic Unit: Faculty of Science and Engineering > Computer Science
    Item ID: 7092
    Depositing User: IR eTheses
    Date Deposited: 04 May 2016 11:18
    URI: https://mural.maynoothuniversity.ie/id/eprint/7092
    Use Licence: This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA).
