IMDEA Software Institute researchers Facundo Molina, Juan Manuel Copia, and Alessandra Gorla have presented FIXCHECK, a new approach to improve patch correctness analysis that combines static analysis, random testing, and large language models. Their work, described in the paper “Improving Patch Correctness Analysis via Random Testing and Large Language Models,” was presented at the IEEE International Conference on Software Testing, Verification and Validation (ICST 2024).
Context
Producing patches to fix software defects is an important task in maintaining a software system. Software defects are typically reported through test cases, which reveal undesirable behavior of the software.
In response, developers write patches, which must be validated before being committed to the code base by checking that the tests that exposed the defect now pass. However, a patch may fail to address the underlying bug, or may introduce new ones, resulting in what is known as a bad fix or bad patch.
When such erroneous patches go undetected, they significantly increase the time and effort developers spend on bug fixing, and complicate the overall maintenance of the software system.
The study
Automatic Program Repair (APR) provides software developers with tools that can automatically generate patches for buggy programs, but these tools often produce many erroneous patches that fail to actually fix the bugs.
To address this issue, researchers at the IMDEA Software Institute developed FIXCHECK, a new approach to improve patch correctness analysis. It combines static analysis, random testing, and large language models (LLMs) to automatically generate tests that detect bugs in potentially erroneous patches. FIXCHECK employs a two-step process: first, it uses random test generation to obtain a large number of test cases; second, it uses LLMs to derive meaningful assertions for each test case.
In addition, FIXCHECK includes a selection and prioritization mechanism that runs new test cases against the patched program and discards or ranks these tests based on their likelihood of revealing bugs in the patch.
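The pipeline described above can be sketched in miniature. The snippet below is an illustrative stand-in, not FIXCHECK's implementation: `patched_abs` is a hypothetical bad patch, the "LLM-derived" assertion is hard-coded as the intended behavior, and the ranking heuristic is a toy substitute for FIXCHECK's likelihood-based prioritization.

```python
import random

# Hypothetical bad patch: absolute value that mishandles negative even numbers.
def patched_abs(x: int) -> int:
    if x < 0 and x % 2 == 0:  # injected bug for illustration
        return x
    return abs(x)

def generate_random_tests(n: int, seed: int = 0) -> list[int]:
    """Step 1: random test-input generation (stand-in for FIXCHECK's
    random-testing step)."""
    rng = random.Random(seed)
    return [rng.randint(-100, 100) for _ in range(n)]

def derived_assertion(x: int) -> bool:
    """Step 2: an oracle assertion over the test's result. FIXCHECK derives
    these with an LLM; here the intended behavior is hard-coded."""
    return patched_abs(x) >= 0

def select_and_prioritize(inputs: list[int]) -> list[int]:
    """Selection and prioritization: run each test against the patched
    program, keep only tests whose assertion fails (potential bug-revealers),
    and rank them (here, by input magnitude, as a toy heuristic)."""
    failing = [x for x in inputs if not derived_assertion(x)]
    return sorted(failing, key=abs, reverse=True)

bug_revealing = select_and_prioritize(generate_random_tests(50))
print(f"{len(bug_revealing)} bug-revealing tests; top inputs: {bug_revealing[:3]}")
```

Every surviving test is a concrete input plus a failing assertion, which is exactly the artifact a developer needs to see why a patch is wrong.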
“The effectiveness of FIXCHECK in generating test cases that reveal bugs in erroneous patches was evaluated on 160 patches, including both developer-written patches and patches generated by APR tools,” said Facundo Molina, postdoctoral researcher at the IMDEA Software Institute.
Results show that FIXCHECK can effectively generate bug-detecting tests with high confidence for 62% of developer-written erroneous patches. Furthermore, it complements existing patch correctness assessment techniques by providing test cases that reveal bugs in up to 50% of the erroneous patches identified by state-of-the-art techniques.
FIXCHECK represents a major advancement in the field of software repair and maintenance by providing a robust solution for automating test generation and detecting defects during software maintenance. This approach not only improves the effectiveness of patch verification but also fosters broader adoption of automated program repair methods.
This research was funded by the Madrid Regional Government Program S2018/TCS-4339 (BLOQUES-CM) and the Spanish Government MCIN/AEI/10.13039/501100011033/ERDF grants TED2021-132464B-I00 (PRODIGY) and PID2022-142290OB-I00 (ESPADA). These projects are co-funded by the European Union ESF, EIE, and NextGeneration funds.