O'Sullivan, Andrew and Keyes, Laura and Winstanley, Adam C. (2006) Developing Corpora for Statistical Graphical Language Models.
In: International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV'06), 26-29 June 2006, Las Vegas.
Abstract
In this work, Statistical Graphical Language Models (SGLMs), a technique adapted from
Statistical Language Models (SLMs), are applied to the task of graphical object recognition. SLMs are
used in Natural Language Processing for tasks such as Speech Recognition and Information Retrieval.
SGLMs view graphical objects as belonging to graphical languages and use this view to compute
probability distributions of graphical objects within graphical documents. SGLMs such as N-grams
require large corpora of training data, which consist of graphical objects in contextual use (real-world
graphical documents). Constructing corpora is an important stage in developing the models, and many
issues need to be addressed. This paper discusses the development of graphical corpora and presents
approaches to some of the problems encountered.
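As a rough illustration of the N-gram idea mentioned in the abstract, the sketch below estimates bigram probabilities over sequences of graphical object labels. It is not the authors' method: the function name `train_bigram`, the `<s>` start marker, the object labels, and the toy corpus are all hypothetical, and plain maximum-likelihood counts are assumed with no smoothing.

```python
from collections import Counter, defaultdict


def train_bigram(sequences):
    """Estimate P(current label | previous label) by relative frequency."""
    pair_counts = defaultdict(Counter)
    for seq in sequences:
        # Pad with a start marker so the first object also has a context.
        padded = ["<s>"] + list(seq)
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[prev][cur] += 1

    model = {}
    for prev, counter in pair_counts.items():
        total = sum(counter.values())
        model[prev] = {cur: n / total for cur, n in counter.items()}
    return model


# Hypothetical toy corpus: each list is the ordered sequence of object
# classes encountered in one graphical document (labels are made up).
corpus = [
    ["road", "building", "building", "road"],
    ["road", "building", "tree"],
]

model = train_bigram(corpus)
print(model["building"])
```

With only two short documents the estimates are very coarse, and any label pair that was never observed gets no probability at all, which is one reason models of this kind depend on large corpora.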
Item Type: Conference or Workshop Item (Paper)
Keywords: Graphics Recognition; Statistical Language Modelling; Corpora
Academic Unit: Faculty of Science and Engineering > Computer Science
Item ID: 4915
Depositing User: Dr. Adam Winstanley
Date Deposited: 25 Apr 2014 15:38
Refereed: No
URI:
Use Licence: This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA).