O'Donnell, Leah (2021)
Clustering Single-Cell Electropherograms by Genotype Through Unsupervised Machine Learning.
Masters thesis, National University of Ireland Maynooth.
Abstract
Cells can be linked to the person who produced them by examining the information
contained within their DNA. The task that a forensic analyst faces is to assess
whether a collection of cells obtained from a crime scene supports the hypothesis that a
person of interest was present. The primary challenge is that cell samples collected at
crime scenes typically contain material from an unknown number of genetic sources in an
unknown mixture ratio. The standard genetic measurement protocol used in crime labs
produces a single, combined signal for the entire collection of cells. If there is a small
number of contributors, the cells are in good condition, and the mixture ratio is not overly
imbalanced, this measurement allows a trier of fact to draw informative inferences.
If, however, the sample is complex, containing more than three genetic sources, or if
the mixture ratio is highly imbalanced, or if genetic information within cells is degraded,
the ability to confidently extract meaning from the measured signal is impaired. In
high-profile work published in the late 1990s, it was demonstrated that genotype information
could be extracted from individual cells. When used in a forensics context, single-cell
methods offer a potential solution to the complex mixture problem by providing genetic
information per cell rather than solely for the whole collection. Advances in those
measurement methods mean that single-cell technologies may soon be practicable in crime
labs. Significant challenges in the interpretation of the resulting signals, however,
remain. Instead of having a single high-dimensional signal to assess, the trier of fact now
has one for each cell. In the present thesis we take one step towards enabling the
resolution of the complex mixture problem by proposing and assessing two methodologies
that would facilitate the downstream analysis of genetic signal from a collection of single
cells. Our goal is to determine whether it is possible to use unsupervised machine learning to
accurately and efficiently gather single-cell signals into groups by genotype. If so,
this would greatly reduce the computational complexity of the evaluation of evidence and
improve its accuracy. The results in this thesis suggest that this approach is viable and
advances the potential of this societally important technology.
Item Type: Thesis (Masters)
Keywords: Clustering Single-Cell Electropherograms; Genotype; Unsupervised Machine Learning
Academic Unit: Faculty of Science and Engineering > Research Institutes > Hamilton Institute
Item ID: 14919
Depositing User: IR eTheses
Date Deposited: 13 Oct 2021 11:20
URI:
Use Licence: This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA).