C-sanitized: a privacy model for document redaction and sanitization
by
David Sanchez, Montserrat Batet
2014
Abstract
In today's Information Societies, large amounts of information are exchanged
and released every day. The sensitive nature of much of this information poses
a serious privacy threat when documents are made available to untrusted third
parties without adequate control. In such cases,
the responsible organization should take appropriate data protection measures,
especially in light of current data privacy legislation. To that end, human
experts are usually asked to redact or sanitize document contents. To ease
this burdensome task, this paper presents a
privacy model for document redaction/sanitization, which offers several
advantages over other models available in the literature. Based on the
well-established foundations of data semantics and information theory, our
model provides a framework for developing and implementing automated and
inherently semantic redaction/sanitization tools. Moreover, unlike ad-hoc
redaction methods, our proposal provides a priori privacy guarantees that can
be intuitively defined according to current data privacy legislation.
Empirical tests performed within the context of several use cases illustrate
the applicability of our model and its ability to mimic the reasoning of human
sanitizers.
arXiv: 1406.4285v1