Automatic construction of an effective training set for prioritizing static analysis warnings

Published: 20 September 2010
DOI: 10.1145/1858996.1859013

ABSTRACT

To improve the ineffective warning prioritization of static analysis tools, various approaches have been proposed to compute a ranking score for each warning. In these approaches, an effective training set is vital for exploring which factors impact the ranking score and how. Manual approaches to building a training set achieve high effectiveness but suffer from low efficiency (i.e., high cost), while existing automatic approaches suffer from low effectiveness. In this paper, we propose an automatic approach for constructing an effective training set. In our approach, we select three categories of impact factors as input attributes of the training set, and we propose a new heuristic for identifying actionable warnings to automatically label the training set. Our empirical evaluations show that, with the help of our constructed training set, the precision of the top 22 warnings for Lucene, the top 20 for ANT, and the top 6 for Spring reaches 100%.
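To make the workflow concrete, the sketch below illustrates one way such a training set could feed a ranking model: each warning is turned into a feature vector drawn from a few categories of impact factors, labeled actionable or not by a heuristic, and a classifier's predicted probability is used as the ranking score from which precision of the top-ranked warnings can be measured. The feature names, the labeling rule, and the classifier choice here are illustrative assumptions, not the authors' exact design.

    # Minimal, hypothetical sketch of training-set construction and warning ranking.
    # Feature names, the labeling heuristic, and the classifier are assumptions
    # made for illustration; they are not the paper's exact method.
    from sklearn.ensemble import RandomForestClassifier

    def to_feature_vector(warning):
        # Three (assumed) categories of impact factors describing a warning:
        # the warning itself, the containing code, and the code's change history.
        return [
            warning["category_id"],        # warning-level factor (assumed)
            warning["method_loc"],         # code-level factor (assumed)
            warning["file_change_count"],  # history-level factor (assumed)
        ]

    def heuristic_label(warning):
        # Stand-in for a labeling heuristic: treat a warning as actionable (1)
        # if it disappeared in a later revision because the flagged code was
        # fixed; otherwise label it unactionable (0). This rule is an assumption.
        return 1 if warning.get("removed_by_fix", False) else 0

    def train_ranker(training_warnings):
        # Build the labeled training set automatically and fit a classifier.
        # Assumes both actionable and unactionable warnings are present.
        X = [to_feature_vector(w) for w in training_warnings]
        y = [heuristic_label(w) for w in training_warnings]
        model = RandomForestClassifier(n_estimators=100, random_state=0)
        model.fit(X, y)
        return model

    def rank_warnings(model, new_warnings):
        # Higher predicted probability of "actionable" means a higher rank.
        scores = model.predict_proba([to_feature_vector(w) for w in new_warnings])[:, 1]
        return sorted(zip(new_warnings, scores), key=lambda pair: pair[1], reverse=True)

    def precision_at_k(ranked, true_labels, k):
        # Fraction of the top-k ranked warnings that are truly actionable,
        # the metric the abstract reports (100% for the top 22/20/6 warnings).
        top = ranked[:k]
        return sum(true_labels[id(w)] for w, _ in top) / max(len(top), 1)

In this sketch the ranking score is simply the classifier's predicted probability; any model that outputs a comparable score per warning could be substituted.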

References

  1. }}C. Artho. Jlint - Find Bugs in Java Programs. http://Jlint.sourceforge.net/.Google ScholarGoogle Scholar
  2. }}N. Ayewah, D. Hovemeyer, J. D. Morgenthaler, J. Penix, and W. Pugh. Using static analysis to find bugs. IEEE Software, vol. 25, no. 5, pages 22--29, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. }}C. Boogerd and L. Moonen. Prioritizing software inspection results using static profiling. In Proc. SCAM, pages 149--160, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. }}D. Binkley. Source code analysis: a road map. In Proc. FOSE, pages 104--119, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. }}J. Bevan, E. J. Whitehead, Jr., S. Kim, and M. Godfrey. Identifying changed source code lines from revision repositories. In Proc. ESEC/FSE, pages 177--186, 2005.Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. }}B. Chess and J. West. Secure programming with static analysis. Aaison Wesley, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. }}D. Cubranic and G. C. Murphy. Hipikat: recommending pertinent software development artifacts. In Proc. ICSE, pages 408--418, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. }}K. Chen, S. R. Schach, L. Yu, J. Offutt, and G. Z. Heller. Open-source change logs. Empirical Software Engineering, vol. 9, no. 3, pages 197--210, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. }}D. Engler, B. Chelf, A. Chou, and S. Hallem. Bugs as deviate behavior: A general approach to inferring errors in system code. In Proc. SOSP, pages 57--72, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. }}D. Engler and M. Musuvathi. Static analysis versus software model checking for bug finding. In Proc. VMCAI, pages 191--210, 2004.Google ScholarGoogle ScholarCross RefCross Ref
  11. }}M. Fischer, M. Pinzger, and H. Gall. Populating a release history database from revision control and bug tracking systems. In Proc. ICSM, pages 23--32, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. }}FindBugs, available at http://findbugs.sourceforge.net/.Google ScholarGoogle Scholar
  13. }}Fortify, available at http://www.fortify.net/intro.html.Google ScholarGoogle Scholar
  14. }}K. Hornik, M. Stinchcombe and H. White. Multilayer feed-forward networks are universal approximators. Neural Networks, vol. 2, pages 359--366, 1989. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. }}D. Hovemeyer and W. Pugh. Finding bugs is easy. In Proc. OOPSLA, pages 132--136, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. }}S. Heckman and L. Williams. On establishing a benchmark for evaluating static analysis alert prioritization and classification techniques. In Pro. ESEM, pages 41--50, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. }}S. S. Heckman. Adaptively ranking alerts generated from automated static analysis. ACM Crossroads, 14(1), pages 1--11, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. }}S. Kim and M. D. Ernst. Which warnings should I fix first? In Proc. ESEC/FSE, pages 45--54, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. }}S. Kim and M. D. Ernst. Prioritizing warning categories by analyzing software history. In Proc. MSR, pages 27--30, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. }}T. Kremenek, K. Ashcraft, J. Yang and D. Engler. Correlation exploitation in error ranking. In Proc. FSE, pages 83--93, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. }}T. Kremenek and D. R. Engler. Z-ranking: using statistical analysis to counter the impact of static analysis approximations. In Proc. SAS, pages 295--315, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. }}Lint4j, available at http://www.jutils.com/.Google ScholarGoogle Scholar
  23. }}A. Mockus and L. G. Votta. Identifying reasons for software changes using historic databases. In Proc. ICSM, pages 120--130, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. }}PMD, available at http://pmd.sourceforge.net/.Google ScholarGoogle Scholar
  25. }}J. R. Ruthruff, J. Penix, J. D. Morgenthaler, S. Elbaum, and G. Rothermel. Predicting accurate and actionable static analysis warnings: an experimental approach. In Proc. ICSE, pages 341--350, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. }}N. Rutar, C. B. Almazan, and J. S. Foster. A comparison of bug finding tools for Java. In Proc. ISSRE, pages 245--256, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. }}G. Salton, A. Wong, and C. S. Yang. A vector space model for automatic indexing. Communications of the ACM, vol.18, no.11, pages 613--620, 1975. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. }}S. E. Sim, S. Easterbrook, and R. C. Holt. Using benchmarking to advance research: a challenge to software engineering, In Proc. ICSE, pages 74--83, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. }}J. Spacco, D. Hovemeyer, and W. Pugh. Tracking defect warnings across revisions. In Proc. MSR, pages 133--136, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. }}J. Sliwerski, T. Zimmermann and A. Zeller. When do changes induce fixes? In Proc. MSR 2005, pages 1--5, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. }}Weka, available at http://www.cs.waikato.ac.nz/~ml/weka/Google ScholarGoogle Scholar
  32. }}C. C. Williams and J. K. Hollingsworth. Automatic mining of source code repositories to improve static analysis techniques. IEEE Trans. Software Engineering, vol. 31, no. 6, pages 466--480, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library


                • Published in

                  ASE '10: Proceedings of the 25th IEEE/ACM International Conference on Automated Software Engineering
                  September 2010
                  534 pages
                  ISBN: 9781450301169
                  DOI: 10.1145/1858996

                  Copyright © 2010 ACM

                  Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                  Publisher

                  Association for Computing Machinery

                  New York, NY, United States

                  Publication History

                  • Published: 20 September 2010

                  Qualifiers

                  • research-article

                  Acceptance Rates

                  Overall Acceptance Rate: 82 of 337 submissions, 24%
