
Context-Aware Code Change Embedding for Better Patch Correctness Assessment

Published: 18 May 2022

Abstract

Although Automated Program Repair (APR) techniques fix more and more real-world bugs, they are still challenged by the long-standing overfitting problem (i.e., a generated patch passes all tests but is actually incorrect). Many approaches have been proposed for automated patch correctness assessment (APCA). However, dynamic ones (i.e., those that need to execute tests) are time-consuming, while static ones (i.e., those built on static code features) are less precise. Embedding techniques have therefore been proposed recently; they assess patch correctness by embedding token sequences extracted from the changed code of a generated patch. Yet existing techniques rarely consider the context information and program structures of a generated patch, which existing studies have shown to be crucial for patch correctness assessment. In this study, we explore the idea of context-aware code change embedding that considers program structures for patch correctness assessment. Specifically, given a patch, we focus not only on the changed code but also on the correlated unchanged part, from which context information can be extracted and leveraged. We then use the AST path technique for representation, which captures the structure information of AST nodes. Finally, based on several pre-defined heuristics, we build a deep learning based classifier to predict the correctness of the patch. We implemented this idea as Cache and performed extensive experiments to assess its effectiveness. Our results demonstrate that Cache can (1) perform better than previous representation learning based techniques (e.g., Cache relatively outperforms existing techniques by \( \approx \)6%, \( \approx \)3%, and \( \approx \)16%, respectively, under three diverse experiment settings), and (2) achieve overall higher performance than existing APCA techniques while being even more precise than certain dynamic ones, including PATCH-SIM (92.9% vs. 83.0%). Further results reveal that the context information and program structures leveraged by Cache contribute significantly to its outstanding performance.
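
To make the notion of an AST path concrete, the following is a minimal sketch and not the Cache implementation. It uses Python's standard `ast` module on a Python snippet purely for illustration; the function names `terminals` and `ast_paths` and the `^` (up) / `_` (down) path notation are assumptions introduced here. Each path connects two terminals (identifiers or literals) through their lowest common ancestor in the tree.

```python
import ast
from itertools import combinations


def terminals(tree):
    """Collect (token, root-to-node trail) for every identifier and literal."""
    found = []

    def visit(node, trail):
        trail = trail + [node]
        if isinstance(node, ast.Name):
            found.append((node.id, trail))
        elif isinstance(node, ast.Constant):
            found.append((repr(node.value), trail))
        for child in ast.iter_child_nodes(node):
            visit(child, trail)

    visit(tree, [])
    return found


def ast_paths(source):
    """Return (start_token, path, end_token) triples for all terminal pairs."""
    triples = []
    pairs = combinations(terminals(ast.parse(source)), 2)
    for (tok_a, trail_a), (tok_b, trail_b) in pairs:
        # Count the ancestors shared by both trails; the last shared node
        # is the lowest common ancestor of the two terminals.
        shared = 0
        while (shared < min(len(trail_a), len(trail_b))
               and trail_a[shared] is trail_b[shared]):
            shared += 1
        up = [type(n).__name__ for n in reversed(trail_a[shared:])]
        top = type(trail_a[shared - 1]).__name__
        down = [type(n).__name__ for n in trail_b[shared:]]
        # Node types going up to the common ancestor, then down to the end.
        path = "^".join(up + [top]) + "_" + "_".join(down)
        triples.append((tok_a, path, tok_b))
    return triples


if __name__ == "__main__":
    for triple in ast_paths("x = y + 1"):
        print(triple)
```

For `x = y + 1`, the sketch yields three triples, one of which is `('y', 'Name^BinOp_Constant', '1')`. In a context-aware setting such as the one described above, paths of this kind would be extracted from both the changed code and the correlated unchanged code before being embedded; how Cache selects and encodes them is not specified in the abstract.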


    • Published in

      ACM Transactions on Software Engineering and Methodology  Volume 31, Issue 3
      July 2022
      912 pages
      ISSN:1049-331X
      EISSN:1557-7392
      DOI:10.1145/3514181
      • Editor: Mauro Pezzè

      Copyright © 2022 Association for Computing Machinery.

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 18 May 2022
      • Revised: 1 December 2021
      • Accepted: 1 December 2021
      • Received: 1 July 2021

      Qualifiers

      • research-article
      • Refereed
