ABSTRACT
With the growing adoption of cloud computing, there is an increasing need to protect data privacy during mining. At the same time, knowledge sharing is key to the survival of many business organizations. Several attempts have been made to mine data in distributed environments; however, maintaining privacy while mining data over the cloud remains a challenging task. In this paper, we present an efficient and practical cryptography-based scheme that preserves privacy while mining cloud data that is distributed in nature. To address the classification task, our approach uses a k-NN classifier. We extend the Jaccard measure to find the similarity between two encrypted and distributed records by conducting an equality test. In addition, our approach accelerates mining by finding nearest neighbours first at the local level and then at the global level. The proposed approach avoids transmitting the original data and sharing the key, which traditional cryptography-based privacy-preserving data mining solutions require.
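The two mechanisms named in the abstract (a Jaccard similarity driven by equality tests, and a nearest-neighbour search done locally and then globally) can be illustrated with a minimal plaintext sketch. It is not the paper's protocol: no encryption is performed, the `equal` callback merely stands in for an equality test on encrypted values, and the function names, the record layout, and the match-count form of the Jaccard measure are all assumptions made for illustration.

```python
from collections import Counter
from heapq import nlargest


def jaccard(rec_a, rec_b, equal=lambda x, y: x == y):
    """Jaccard similarity of two equal-length records.

    Each (position, value) pair is treated as a set element, so the
    intersection size is the number of positions whose values match.
    `equal` is a hypothetical placeholder for an equality test on
    encrypted values; here it simply compares plaintext.
    """
    if len(rec_a) != len(rec_b):
        raise ValueError("records must have the same number of attributes")
    matches = sum(1 for x, y in zip(rec_a, rec_b) if equal(x, y))
    union = len(rec_a) + len(rec_b) - matches
    return matches / union if union else 1.0


def local_knn(site_records, query, k):
    """Return one site's k most similar (similarity, label) pairs."""
    scored = [(jaccard(rec, query), label) for rec, label in site_records]
    return nlargest(k, scored, key=lambda pair: pair[0])


def global_knn(sites, query, k):
    """Merge each site's local candidates, keep the global top k, and vote."""
    candidates = [c for site in sites for c in local_knn(site, query, k)]
    top = nlargest(k, candidates, key=lambda pair: pair[0])
    votes = Counter(label for _, label in top)
    return votes.most_common(1)[0][0]


# Two hypothetical sites, each holding (record, class label) pairs.
sites = [
    [(("a", "b", "c"), "x"), (("a", "b", "d"), "x")],
    [(("z", "y", "w"), "y"), (("a", "y", "c"), "y")],
]
predicted = global_knn(sites, ("a", "b", "c"), k=3)
```

In this sketch only similarity scores and labels leave a site, never the records themselves, which mirrors the abstract's point that the original data is not transmitted. Each site forwards at most k candidates, so the global step ranks far fewer items than a single pooled search would.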
- A cryptography based privacy preserving solution to mine cloud data