Sensitive Data Recognition and Filtering Model of Webpage Content Based on Decision Tree Algorithm

  • Conference paper
Big Data and Security (ICBDS 2020)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1415))

Abstract

In recent years, privacy-preserving data mining has attracted widespread attention because sensitive and confidential data must be protected from unauthorized access and attacks. The purpose of this study is to develop a privacy-preserving anonymization algorithm using decision tree classification. The paper focuses on k-anonymity, which can prevent identity disclosure by applying generalization and suppression to the data. The privacy level and mining quality of the anonymized data sets are then tested with decision tree classification and compared with two other data mining techniques, logistic regression and support vector machines. The results indicate that decision tree classification achieves a better privacy level and data quality than these other techniques.
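The abstract names generalization and suppression as the two operations behind k-anonymity. The following is a minimal sketch of that idea, not the authors' algorithm: the record fields (`age`, `zip`, `diagnosis`), the sample data, and the helper names are all hypothetical and chosen only for illustration. Quasi-identifiers are first generalized (ages into ranges, ZIP codes into masked prefixes), and any group of records that still has fewer than k members is suppressed.

```python
from collections import Counter


def generalize_age(age, width=10):
    """Generalize an exact age into a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_zip(zip_code, keep=3):
    """Mask the trailing characters of a ZIP code, e.g. '47677' -> '476**'."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


def anonymize(records, k):
    """Generalize the quasi-identifiers, then suppress groups smaller than k."""
    generalized = [
        {
            "age": generalize_age(r["age"]),
            "zip": generalize_zip(r["zip"]),
            "diagnosis": r["diagnosis"],  # sensitive attribute, left unchanged
        }
        for r in records
    ]
    # Count how many records share each quasi-identifier combination.
    counts = Counter((r["age"], r["zip"]) for r in generalized)
    # Suppression: drop every record whose group has fewer than k members.
    return [r for r in generalized if counts[(r["age"], r["zip"])] >= k]


def is_k_anonymous(records, k):
    """Check that every quasi-identifier combination occurs at least k times."""
    counts = Counter((r["age"], r["zip"]) for r in records)
    return all(c >= k for c in counts.values())


# Hypothetical sample data, for illustration only.
records = [
    {"age": 34, "zip": "47677", "diagnosis": "flu"},
    {"age": 36, "zip": "47602", "diagnosis": "asthma"},
    {"age": 31, "zip": "47678", "diagnosis": "flu"},
    {"age": 52, "zip": "47905", "diagnosis": "diabetes"},
    {"age": 58, "zip": "47909", "diagnosis": "flu"},
    {"age": 27, "zip": "47302", "diagnosis": "asthma"},
]

released = anonymize(records, k=2)
for row in released:
    print(row)
```

In this sketch the single record in the 20-29 age range has no matching partner after generalization, so it is suppressed and the remaining five records satisfy 2-anonymity. Wider generalization would retain more records at the cost of coarser data, which is the privacy versus data quality trade-off the abstract refers to.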

Acknowledgement

This work was supported by the Science and Technology Project of the State Grid Corporation of China: “Research and Application of Key Technology of Data Sharing and Distribution Security for Data Center” (Grant No. 5700-202090192A-0-0-00).

Author information

Corresponding author

Correspondence to Qian Guo.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ye, S., Cheng, Y., Yang, Y., Guo, Q. (2021). Sensitive Data Recognition and Filtering Model of Webpage Content Based on Decision Tree Algorithm. In: Tian, Y., Ma, T., Khan, M.K. (eds) Big Data and Security. ICBDS 2020. Communications in Computer and Information Science, vol 1415. Springer, Singapore. https://doi.org/10.1007/978-981-16-3150-4_42

  • DOI: https://doi.org/10.1007/978-981-16-3150-4_42

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-3149-8

  • Online ISBN: 978-981-16-3150-4

  • eBook Packages: Computer Science, Computer Science (R0)