Application of local semantic analysis in fault prediction and detection

Shao, Danhua

Date

2010-05

Authors

Shao, Danhua

Abstract

To improve the quality of software systems, change-based fault prediction and scope-bounded checking have been used to predict or detect faults during software development. In fault prediction, changes to program source code, such as added or deleted lines, are used to predict potential faults. In fault detection, scope-bounded checking of programs is an effective technique for finding subtle faults. The central idea is to check all program executions up to a given bound. The technique takes two basic forms: scope-bounded static checking, where all bounded executions of a program are translated into a formula that represents the violation of a correctness property, so that any solution to the formula is a counterexample; and scope-bounded testing, where a program is tested against all (small) inputs up to a given bound on the input size. Although the accuracy of change-based fault prediction and scope-bounded checking has been evaluated experimentally, both have limitations in effectiveness and efficiency. Previous change-based fault prediction approaches consider only the code modified by a change and ignore the code impacted by it. Scope-bounded testing considers only the correctness specification and ignores the internal structure of the program. Although scope-bounded static checking does consider the internal structure of programs, formulae translated from structurally complex programs might choke the backend analyzer, which then fails to give a result within a reasonable time. To improve the effectiveness and efficiency of these approaches, we introduce local semantic analysis into change-based fault prediction and scope-bounded checking. We use data-flow analysis to disclose internal dependencies within a program. Based on these dependencies, we identify code segments impacted by a change and apply fault prediction metrics to the impacted code.
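The idea of finding code impacted by a change, rather than only the code the change modifies, can be illustrated with a small sketch. It is not the implementation used in this work: it assumes straight-line code, and the statement encoding `(line, defined_var, used_vars)` and the names `def_use_edges`, `impacted_lines`, and `PROGRAM` are hypothetical, introduced only for illustration.

```python
from collections import defaultdict, deque


def def_use_edges(stmts):
    """Map each defining line to the lines that read the value it defines.

    Assumes straight-line code: `stmts` is a list of
    (line, defined_var_or_None, used_vars) tuples in execution order.
    """
    last_def = {}              # variable -> line of its most recent definition
    edges = defaultdict(set)   # defining line -> set of using lines
    for line, defined, used in stmts:
        for var in used:
            if var in last_def:
                edges[last_def[var]].add(line)
        if defined is not None:
            last_def[defined] = line
    return edges


def impacted_lines(stmts, changed):
    """Return the changed lines plus every line reachable via def-use edges."""
    edges = def_use_edges(stmts)
    seen = set(changed)
    queue = deque(changed)
    while queue:
        line = queue.popleft()
        for succ in edges[line]:
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen


# Hypothetical six-line program used only for this example.
PROGRAM = [
    (1, "a", set()),         # a = input()
    (2, "b", set()),         # b = input()
    (3, "c", {"a"}),         # c = a + 1
    (4, "d", {"b"}),         # d = b * 2
    (5, "e", {"c"}),         # e = c * c
    (6, None, {"d", "e"}),   # print(d, e)
]

print(sorted(impacted_lines(PROGRAM, {3})))
```

In this sketch a change to line 3 also marks lines 5 and 6 as impacted, because they read values derived from `c`. A fault prediction metric could then be computed over the impacted set instead of the modified line alone.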
Empirical studies with real data show that semantic analysis is effective and efficient in predicting faults in large changes or short-interval changes. When generating inputs for scope-bounded testing, we use control flow to guide test generation so that code coverage can be achieved with minimal tests. To increase the scalability of scope-bounded checking, we split a bounded program into smaller sub-programs according to data-flow and control-flow analysis. The problem of scope-bounded checking for the given program thus reduces to several sub-problems, each of which requires the constraint solver to check a less complex formula, thereby likely reducing the solver’s overall workload. Experimental results show that our approach provides significant speed-ups over the traditional approach.
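Scope-bounded testing and the use of control flow to keep the test set small can be sketched as follows. This is an illustrative simplification, not the approach evaluated in this work: it enumerates every input up to the bound and then keeps only those that reach a not-yet-covered branch, whereas guided generation would avoid producing the redundant inputs in the first place. The function under test `find_max`, the branch labels, and the helpers `bounded_inputs` and `bounded_test` are all hypothetical.

```python
from itertools import product


def find_max(xs, trace):
    """Function under test; `trace` records which branches were taken."""
    if not xs:
        trace.add("empty")
        return None
    trace.add("nonempty")
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            trace.add("update")
            best = x
        else:
            trace.add("keep")
    return best


def bounded_inputs(max_len, values):
    """Yield every list over `values` with length 0..max_len (the scope)."""
    for n in range(max_len + 1):
        for combo in product(values, repeat=n):
            yield list(combo)


def bounded_test(max_len, values):
    """Check all inputs in scope; keep only inputs that add branch coverage."""
    covered, kept, failures = set(), [], []
    total = 0
    for xs in bounded_inputs(max_len, values):
        total += 1
        trace = set()
        result = find_max(xs, trace)
        expected = max(xs) if xs else None   # correctness specification
        if result != expected:
            failures.append(xs)
        if not trace <= covered:             # input reached a new branch
            covered |= trace
            kept.append(xs)
    return total, kept, failures


total, kept, failures = bounded_test(max_len=2, values=(0, 1))
print(total, kept, failures)
```

With a bound of length 2 over the values 0 and 1, the scope contains 7 inputs, of which 4 are enough to cover every labelled branch in this sketch.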