DOI: 10.1145/3041008.3041018
research-article

Feature Cultivation in Privileged Information-augmented Detection

Published: 24 March 2017

ABSTRACT

Modern detection systems use sensor outputs available in the deployment environment to probabilistically identify attacks. These systems are trained on past or synthetic feature vectors to create a model of anomalous or normal behavior. Thereafter, sensor outputs collected at run-time are compared to the model to identify attacks (or the lack of an attack). While this approach to detection has proven effective in many environments, it is limited to training only on features that can be reliably collected at detection time. Hence, these systems fail to leverage the often vast amount of ancillary information available from past forensic analysis and post-mortem data. In short, detection systems do not train on (and thus do not learn from) features that are unavailable or too costly to collect at run-time. Recent work proposed an alternate model construction approach that integrates forensic "privileged" information (features reliably available at training time, but not at run-time) to improve the accuracy and resilience of detection systems. In this paper, we further evaluate two of the proposed techniques for model training with privileged information: knowledge transfer and model influence. We explore the cultivation of privileged features, the efficiency of those processes, and their influence on detection accuracy. We observe that improved integration of privileged features makes the resulting detection models more accurate. Our evaluation shows that the use of privileged information leads to a relative decrease in detection error of up to 8.2% for fast-flux bot detection and 5.5% for malware classification, compared with a system that uses no privileged information.
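
As a rough illustration of the ideas in the abstract, the sketch below shows one way knowledge transfer could work: a mapping from a standard (run-time) feature to a privileged (training-time only) feature is fit on training data, and the estimated privileged value is then used at run-time in place of the missing one. The one-dimensional least-squares mapping, the function names, and the example numbers are all assumptions made for illustration and are not taken from the paper. The last function shows how a relative decrease in detection error is computed.

```python
from typing import List, Tuple


def fit_transfer(x_std: List[float], x_priv: List[float]) -> Tuple[float, float]:
    """Fit x_priv ~ slope * x_std + intercept by ordinary least squares.

    Hypothetical stand-in for a knowledge-transfer mapping: it is learned
    at training time, when the privileged feature is still observable.
    """
    n = len(x_std)
    mean_s = sum(x_std) / n
    mean_p = sum(x_priv) / n
    var_s = sum((s - mean_s) ** 2 for s in x_std)
    cov = sum((s - mean_s) * (p - mean_p) for s, p in zip(x_std, x_priv))
    slope = cov / var_s
    return slope, mean_p - slope * mean_s


def estimate_privileged(x_std: List[float], slope: float, intercept: float) -> List[float]:
    """Estimate the privileged feature at run-time from the standard feature."""
    return [slope * s + intercept for s in x_std]


def relative_error_decrease(err_baseline: float, err_privileged: float) -> float:
    """Relative decrease in detection error versus the baseline system."""
    return (err_baseline - err_privileged) / err_baseline


# Training time: both feature kinds are available (toy, made-up values).
slope, intercept = fit_transfer([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])

# Run-time: only the standard feature is collected; the privileged one is estimated.
estimated = estimate_privileged([4.0, 5.0], slope, intercept)

# Hypothetical error rates: a drop from 10% to 9.18% is an 8.2% relative decrease.
decrease = relative_error_decrease(0.10, 0.0918)
```

Note that a relative decrease is measured against the baseline error, so an 8.2% relative decrease does not mean the error rate falls by 8.2 percentage points.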


        Published in

        IWSPA '17: Proceedings of the 3rd ACM on International Workshop on Security And Privacy Analytics
        March 2017
        88 pages
        ISBN: 9781450349093
        DOI: 10.1145/3041008

        Copyright © 2017 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery, New York, NY, United States


        Acceptance Rates

        IWSPA '17 paper acceptance rate: 4 of 14 submissions, 29%. Overall acceptance rate: 18 of 58 submissions, 31%.
