Removing unimportant computations in interprocedural program analysis

Date

2007-12

Abstract

Existing solutions to program analysis problems often trade scalability for precision. We propose new techniques that improve the scalability of high-precision interprocedural program analyses without sacrificing precision. The key insight is that these analyses perform a high volume of unimportant computations; we improve analysis time by identifying and reducing that work. We identify unimportant work in three classes of program analyses: flow-sensitive dataflow analysis, context-sensitive dataflow analysis, and reachability-based analysis, and we propose new algorithms to reduce this inefficiency. We focus on analysis problems and programs that involve pointers, because pointers are used in many modern languages, sometimes complicate the analysis problems, and sometimes inflate the problem size through large points-to sets.

Our solutions include an improved worklist management algorithm for flow-sensitive interprocedural analysis that drastically reduces the amount of work placed on the worklist. We introduce Relevance-Based Context Partitioning, a new algorithm that groups contexts so that only important information is computed precisely and the number of contexts to analyze is considerably smaller. For reachability-based analysis, we improve scalability by reducing the graph used in the analysis: nodes represent dataflow facts, edges represent flow functions, and the analysis problem reduces to a reachability problem. We present two new algorithms that produce significantly smaller graphs than existing efficient algorithms. The first builds the graph from data dependences instead of the more indirect control dependences that are typically used. The second identifies and eliminates nodes that do not contribute to solving the analysis problem.

We evaluate our ideas by applying our algorithms to five error-checking analyses and comparing the results against state-of-the-art algorithms on a large suite of 19 open-source programs. We are able to improve performance significantly. Our new algorithms for flow-sensitive analysis achieve approximately a 2× speedup on average. For context-sensitive analysis, our technique also allows the new algorithm to solve some cases that previously could not be solved because they ran out of memory; for the other cases, the average speedup is 7.0× and exceeds 200× in some cases. Finally, our new algorithms for reachability-based analysis can produce graphs that are just 6% of the size of the original graphs. Consequently, more benchmarks can be analyzed successfully, and the average analysis-time speedup is as high as 5.5×.
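To make the flow-sensitive setting concrete, the sketch below shows a standard worklist-driven fixed-point dataflow solver of the kind such analyses build on; the dissertation's improved worklist management refines which nodes get (re)enqueued. This is an illustrative Java sketch only: the names WorklistSolver, Lattice, Transfer, join, leq, and apply are assumptions made for exposition, not code or terminology from the dissertation.

    import java.util.*;

    // Minimal generic worklist-based dataflow solver (illustrative sketch,
    // not the dissertation's algorithm). Facts of type F form a lattice.
    final class WorklistSolver {
        interface Lattice<F> {
            F join(F a, F b);       // merge facts at control-flow joins
            boolean leq(F a, F b);  // true if fact a adds nothing beyond b
        }
        interface Transfer<F> {
            F apply(int node, F in); // flow function for one CFG node
        }

        // successors[n] lists the CFG successors of node n.
        static <F> List<F> solve(int numNodes, int entry, F init, F bottom,
                                 int[][] successors,
                                 Lattice<F> lattice, Transfer<F> transfer) {
            List<F> in = new ArrayList<>(Collections.nCopies(numNodes, bottom));
            in.set(entry, init);
            Deque<Integer> worklist = new ArrayDeque<>();
            worklist.add(entry);
            while (!worklist.isEmpty()) {
                int n = worklist.poll();
                F out = transfer.apply(n, in.get(n));
                for (int s : successors[n]) {
                    F merged = lattice.join(in.get(s), out);
                    // Re-enqueue a successor only when its fact actually grew.
                    if (!lattice.leq(merged, in.get(s))) {
                        in.set(s, merged);
                        worklist.add(s);
                    }
                }
            }
            return in; // fixed-point dataflow facts, indexed by CFG node
        }
    }

In this baseline, successors are re-enqueued whenever their facts grow at all; reducing such re-enqueues without losing precision is the kind of unimportant work the abstract describes removing.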
