MURAL - Maynooth University Research Archive Library



    Engineering an Aligned Gold-Standard Corpus of Human to Machine Oriented Controlled Natural Language


    Safwat, Hazem and Davis, Brian and Zarrouk, Manel (2018) Engineering an Aligned Gold-Standard Corpus of Human to Machine Oriented Controlled Natural Language. In: 2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI). IEEE. ISBN 9781538673256





    Abstract

    Knowledge base creation and population are an essential formal backbone for a variety of intelligent applications, decision support and expert systems, and intelligent search. While the abundance of unstructured text helps in easing the knowledge acquisition gap, the ambiguous nature of language tends to impact accuracy when engaging in more complex semantic analysis. Controlled Natural Languages (CNLs) are subsets of natural language that are restricted grammatically in order to reduce or eliminate ambiguity, either for the purposes of machine processability or for unambiguous human communication within a domain or industry context, such as Simplified English. This type of human-oriented CNL is under-researched despite having found favor within industry over many years. We describe a novel dataset which aligns a representative sample of Simplified English Wikipedia sentences with a well-known machine-oriented CNL. This linguistic resource is both human-readable and semantically machine-interpretable and can benefit a variety of NLP and knowledge-based applications.

    Item Type: Book Section
    Additional Information: This work has emanated from research supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289 and in part by the SSIX Horizon 2020 project (grant agreement No 645425).
    Keywords: Natural Language Processing; Controlled Natural Language; Knowledge Extraction; Semantic Web
    Academic Unit: Faculty of Science and Engineering > Computer Science
    Item ID: 13419
    Identification Number: https://doi.org/10.1109/WI.2018.00-58
    Depositing User: Brian Davis
    Date Deposited: 07 Oct 2020 15:18
    Publisher: IEEE
    Refereed: Yes
    Funders: European Union Horizon 2020 programme
    URI:
    Use Licence: This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA).
