Menu

The Strength of Random Search on Automated Program Repair

This site presents repair tools and experiment data used in our ICSE 2014 paper.

Introduction

Automated program repair received considerable recent attention, and many techniques on this research area have been proposed. Among these techniques, two genetic-programming-based techniques, GenProg and Par, have shown the promising results. In particular, GenProg has been used as the baseline technique to check the repair effectiveness of new techniques in much literature. Although GenProg and Par have shown their strong ability of fixing real-life bugs in nontrivial programs, to what extent GenProg and Par can benefit from genetic programming, used by them to guide the patch search process, is still unknown.

To address the question, in this paper we present a new automated repair technique using random search, which is commonly considered much simpler than genetic programming, and implement a prototype tool called RSRepair. Experiment on 7 programs with 24 versions shipping with real-life bugs suggests that RSRepair, in most cases (23/24), outperforms GenProg in terms of both repair effectiveness (requiring fewer patch trials) and efficiency (requiring fewer test case executions), justifying the stronger strength of random search over genetic programming. According to experimental results, we suggest that every proposed technique using optimisation algorithm should check its effectiveness by comparing it with random search.

RSRepair

According to Algorithm 2 in our paper, we implemented RSRepair by modifying GenProg, which is written in OCaml language. Specifically, RSRepair implements the functions of fault localiztion and matuation in the way like GenProg.

In fact, RSRepair produces candidate patches in the same way of generating randomly patches in the first generation of GenProg but without fitness guidance and crossover in the subsequent generations. For the process of patch validation, RSRepair validates candidate patches in the way described in Algorithm 2 in our paper.

Experiment

To check the performance of RSRepair, we compare it with GenProg, a state-of-the-art tool on automated program repair, on 7 subject programs with 24 faulty versions. We selected GenProg for the reason that GenProg is almost the only state-of-the-art automated repair tool which can fix bugs in real-world, large-scale C programs. Our experimental evaluation seeks to address the following Research Questions:

Results

Result data in our experiment show that:

Tools and Data

We give all the experimental results in our ICSE paper. For each subject program, we separately run RSRepair and GenProg to fix the bugs in 7 nontrivial programs with 24 versions. We record the repair log and data, resulting in some compression files (i.e., tar.bz2 files) named with the corresponding subject programs. For one compression file such as libtiff.tar.bz2, there are severval different folders, each of which includes both the repair log (debug and best folders) and statistic data (*-statistic.txt) corresponding to each program version. In addition, we also present the Matlab script (named process.m) used to analyze the result data in our paper. You can directly run the process.m on Matlab once all the rar file is decompressed.