ABSTRACT
We examine the effectiveness of distance-preserving transformations in privacy-preserving data mining. These techniques are potentially very useful because some important data mining algorithms (e.g., distance-based clustering and k-nearest neighbor classification) can be applied efficiently to the transformed data and produce exactly the same results as if applied to the original data. However, how well the original data is hidden has, to our knowledge, not been carefully studied. We take a step in this direction by assuming the role of an attacker armed with two types of prior information about the original data. We examine how well the attacker can recover the original data from the transformed data and the prior information. Our results offer insight into the vulnerabilities of distance-preserving transformations.
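To illustrate why distance-based algorithms return identical results on transformed data, the following minimal sketch applies one simple distance-preserving map, a 2-D rotation, to synthetic points and compares pairwise distances and nearest-neighbor lists before and after. The data, the rotation angle, and the helper names (`rotate`, `distance_matrix`, `nearest_neighbors`) are assumptions chosen for illustration; they are not taken from the paper, and the transformations studied there may be more general.

```python
import math
import random

def rotate(points, theta):
    """Apply a 2-D rotation by angle theta (an orthogonal, distance-preserving map)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def distance_matrix(points):
    """Euclidean distance between every pair of points."""
    return [[math.dist(p, q) for q in points] for p in points]

def nearest_neighbors(points, k):
    """For each point, the indices of its k nearest other points."""
    d = distance_matrix(points)
    n = len(points)
    return [
        sorted((j for j in range(n) if j != i), key=lambda j: d[i][j])[:k]
        for i in range(n)
    ]

random.seed(0)
# Hypothetical "private" data: 40 records with two numeric attributes.
original = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(40)]
# Released data: the same records after a rotation by an arbitrary angle.
released = rotate(original, theta=1.234)

# Largest change in any pairwise distance (only floating-point round-off remains).
max_gap = max(
    abs(a - b)
    for row_a, row_b in zip(distance_matrix(original), distance_matrix(released))
    for a, b in zip(row_a, row_b)
)
# Whether every record keeps the same 3 nearest neighbors, in the same order.
same_neighbors = nearest_neighbors(original, 3) == nearest_neighbors(released, 3)
```

In this sketch the released coordinates differ from the originals, yet `max_gap` is at the level of round-off and the neighbor lists coincide, which is the property that lets k-nearest neighbor classification and distance-based clustering run unchanged on the transformed data. Whether the originals stay hidden from an attacker with prior information is a separate question, and it is the one the paper addresses.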
An attacker's view of distance preserving maps for privacy preserving data mining
ECMLPKDD'06: Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases