Senavirathne, Navoda and Torra, Vicenç
(2019)
Rounding based continuous data discretization for statistical disclosure control.
Journal of Ambient Intelligence and Humanized Computing.
ISSN 1868-5137
Abstract
“Rounding” can be understood as a way to coarsen continuous data. That is, low level and infrequent values are replaced by
high-level and more frequent representative values. This concept is explored as a method for data privacy in statistical
disclosure control literature with perturbative techniques like rounding, microaggregation and non-perturbative methods like
generalisation. Even though “rounding” is well known as a numerical data protection method, it has not been studied in depth
or evaluated empirically to the best of our knowledge. This work is motivated by three objectives: (1) to study the alternative
methods of obtaining the rounding values to represent a given continuous variable, (2) to empirically evaluate rounding as
a data protection technique based on information loss (IL) and disclosure risk (DR), and (3) to analyse the impact of data
rounding on machine learning based models. To obtain the rounding values, we consider discretization methods
introduced in the unsupervised machine learning literature along with microaggregation and re-sampling based approaches.
The results indicate that microaggregation based techniques are preferred over unsupervised discretization methods due to
their fair trade-off between IL and DR.
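
To make the idea of rounding as coarsening concrete, the following is a minimal sketch, not the authors' implementation. It contrasts two ways of obtaining rounding values for a single continuous variable: equal-width binning (a common unsupervised discretization method) and a simple fixed-size univariate microaggregation. The function names, the choice of bin midpoints and group means as representatives, and the mean squared error used as a crude stand-in for information loss are all assumptions made for illustration; the paper's own IL and DR measures are not reproduced here.

```python
from statistics import mean


def round_equal_width(values, n_bins):
    """Replace each value by the midpoint of its equal-width bin (assumed representative)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # constant variable: nothing to coarsen
    width = (hi - lo) / n_bins
    rounded = []
    for v in values:
        # The maximum value would fall into bin n_bins, so clamp it to the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        rounded.append(lo + (idx + 0.5) * width)
    return rounded


def round_microaggregation(values, k):
    """Sort the values, form groups of k consecutive records, replace each by its group mean.

    The last group absorbs any remainder so that every group has at least k records
    (when the input itself has at least k records).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    rounded = [0.0] * n
    start = 0
    while start < n:
        end = start + k
        if n - end < k:
            end = n  # too few records left for another full group
        group = order[start:end]
        centroid = mean(values[i] for i in group)
        for i in group:
            rounded[i] = centroid
        start = end
    return rounded


def mean_squared_error(original, rounded):
    """Hypothetical, crude proxy for information loss: average squared distortion."""
    return mean((o - r) ** 2 for o, r in zip(original, rounded))


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 13.0]
    print(round_equal_width(data, n_bins=2))
    print(round_microaggregation(data, k=3))
```

In this sketch the microaggregation groups follow the data, so each representative stands for at least `k` original records, whereas equal-width bins are fixed by the range of the variable and may contain very few records. This is only an intuition for the trade-off discussed in the abstract, not a reproduction of its results.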
Item Type: Article
Additional Information: Cite as: Senavirathne, N., Torra, V. Rounding based continuous data discretization for statistical disclosure control. J Ambient Intell Human Comput (2019). https://doi.org/10.1007/s12652-019-01489-7
Keywords: Rounding for micro data; Unsupervised discretization; Micro data protection
Academic Unit: Faculty of Science and Engineering > Research Institutes > Hamilton Institute
Item ID: 14065
Identification Number: https://doi.org/10.1007/s12652-019-01489-7
Depositing User: Vicenç Torra
Date Deposited: 24 Feb 2021 15:16
Journal or Publication Title: Journal of Ambient Intelligence and Humanized Computing
Publisher: Springer
Refereed: Yes
URI:
Use Licence: This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA).