ABSTRACT
For well over a quarter century, detection systems have been driven by models learned from input features collected from real or simulated environments. At runtime, an artifact (e.g., network event, potential malware sample, suspicious email) is deemed malicious or non-malicious based on its similarity to the learned model. However, the training of these models has historically been limited to only those features available at runtime. In this paper, we consider an alternate learning approach that trains models using privileged information (features available at training time but not at runtime) to improve the accuracy and resilience of detection systems. In particular, we adapt and extend recent advances in knowledge transfer, model influence, and distillation to enable the use of forensic or other data unavailable at runtime in a range of security domains. An empirical evaluation shows that privileged information increases precision and recall over a system with no privileged information: we observe up to a 7.7% relative decrease in detection error for fast-flux bot detection, 8.6% for malware traffic detection, 7.3% for malware classification, and 16.9% for face recognition. We explore the limitations and applications of different privileged information techniques in detection systems. Such techniques provide a new means for detection systems to learn from data that would otherwise not be available at runtime.
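To make the distillation idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes the generalized distillation formulation: a teacher is trained on privileged features, its temperature-softened outputs become soft labels, and a student is trained on the runtime features against a blend of hard and soft labels. Logistic regression is a stand-in model, and all names (`fit_logistic`, `distill`, `temperature`, `imitation`) are hypothetical choices made for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit(row, w):
    # Linear score; the last entry of w is the bias term.
    return sum(wi * xi for wi, xi in zip(w, row)) + w[-1]


def fit_logistic(X, targets, lr=0.1, epochs=300):
    """Batch gradient descent on cross-entropy.

    Targets may be hard labels (0 or 1) or soft values in [0, 1].
    """
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, t in zip(X, targets):
            err = sigmoid(logit(row, w)) - t
            for j, xj in enumerate(row):
                grad[j] += err * xj
            grad[-1] += err
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def distill(X_std, X_priv, y, temperature=2.0, imitation=0.5):
    """Train a student on runtime features using a privileged teacher.

    Assumption: binary labels and a linear teacher and student.
    """
    # 1. The teacher sees only the privileged (training-time) features.
    teacher = fit_logistic(X_priv, y)
    # 2. Soften the teacher's outputs with a temperature.
    soft = [sigmoid(logit(row, teacher) / temperature) for row in X_priv]
    # 3. Cross-entropy is linear in the target, so a weighted sum of the
    #    hard-label loss and the soft-label loss equals the loss against
    #    blended targets.
    blended = [(1 - imitation) * yi + imitation * si
               for yi, si in zip(y, soft)]
    # 4. The student sees only the features that exist at runtime.
    return fit_logistic(X_std, blended)


def predict(row, w):
    # Classify one runtime feature vector with the student's weights.
    return 1 if logit(row, w) > 0 else 0
```

In this sketch, setting `imitation` to 0 recovers a model trained without privileged information, which gives a baseline for comparison. At runtime only `predict` and the student weights are needed, so the privileged features never have to be collected in deployment.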
Index Terms
- Detection under Privileged Information