Browsing by Subject "Program analysis"

Now showing 1 - 6 of 6

Automatic static analysis of software performance
(2016-05) Olivo, Oswaldo Luis; Lin, Yun Calvin; Dillig, Isil; Dillig, Thomas; Lahiri, Shuvendu; Shmatikov, Vitaly
Performance is a critical component of software quality. Software performance can have drastic repercussions on an application, frustrating its users, breaking the functionality of its components, or even rendering it defenseless against hackers. Unfortunately, unlike in the program verification domain, robust analysis techniques for software performance are almost non-existent. In this thesis we formalize important classes of performance-related bugs and security vulnerabilities, and implement novel static analysis techniques for automatically detecting them in widely used open-source applications. Our tools are able to uncover 92 performance bugs and 47 security vulnerabilities, while analyzing hundreds of thousands of lines of code and reporting a modest amount of false positives. Our work opens a new avenue for research: the development of rigorous automatic analyses for effective software performance understanding, inspired by traditional research in functional verification.
Compositional symbolic execution with memoized replay
(2014-05) Qiu, Rui, active 21st century; Khurshid, Sarfraz; Yang, Guowei
Symbolic execution is a powerful, systematic analysis that has received much visibility in the last decade. Scalability however remains a major challenge for symbolic execution. Compositional analysis is a well-known general purpose methodology for increasing scalability. This thesis introduces a new approach for compositional symbolic execution. Our key insight is that we can summarize each analyzed method as a memoization tree that captures the crucial elements of symbolic execution, and leverage these memoization trees to efficiently replay the symbolic execution of the corresponding methods with respect to their calling contexts. Memoization trees offer a natural way to compose in the presence of heap operations, which cannot be dealt with by previous work that uses logical formulas as summaries for composi- tional symbolic execution. Our approach also enables an efficient treatment of error traces by short-circuiting the execution of paths that lead to them. Our preliminary experimental evaluation based on a prototype implementation in Symbolic PathFinder shows promising results.
Integrating programming languages and databases via program analysis and language design
(2009-12) Wiedermann, Benjamin Alan; Cook, William Randall; Batory, Don; Blackburn, Stephen M.; Lin, Calvin; McKinley, Kathryn S.
Researchers and practitioners alike have long sought to integrate programming languages and databases. Today's integration solutions focus on the data-types of the two domains, but today's programs lack transparency. A transparently persistent program operates over all objects in a uniform manner, regardless of whether those objects reside in memory or in a database. Transparency increases modularity and lowers the barrier of adoption in industry. Unfortunately, fully transparent programs perform so poorly that no one writes them. The goal of this dissertation is to increase the performance of these programs to make transparent persistence a viable programming paradigm. This dissertation contributes two novel techniques that integrate programming languages and databases. Our first contribution--called query extraction--is based purely on program analysis. Query extraction analyzes a transparent, object-oriented program that retrieves and filters collections of objects. Some of these objects may be persistent, in which case the program contains implicit queries of persistent data. Our interprocedural program analysis extracts these queries from the program, translates them to explicit queries, and transforms the transparent program into an equivalent one that contains the explicit queries. Query extraction enables programmers to write programs in a familiar, modular style and to rely on the compiler to transform their program into one that performs well. Our second contribution--called RBI-DB+--is an extension of a new programming language construct called a batch block. A batch block provides a syntactic barrier around transparent code. It also provides a latency guarantee: If the batch block compiles, then the code that appears in it requires only one client-server communication trip. Researchers previously have proposed batch blocks for databases. However, batch blocks cannot be modularized or composed, and database batch blocks do not permit programmers to modify persistent data. We extend database batch blocks to address these concerns and formalize the results. Today's technologies integrate the data-types of programming languages and databases, but they discourage programmers from using procedural abstraction. Our contributions restore procedural abstraction's use in enterprise applications, without sacrificing performance. We argue that industry should combine our contributions with data-type integration. The result would be a robust, practical integration of programming languages and databases.
Lifting the Abstraction Level of Compiler Transformations
(2013-08-08) Tang, Xiaolong
Production compilers implement optimizing transformation rules for built-in types. What justifies applying these optimizing rules is the axioms that hold for built-in types and the built-in operations supported by these types. Similar axioms also hold for user-defined types and the operations defined on them, and therefore justify a set of optimization rules that may apply to user-defined types. Production compilers, however, do not attempt to construct and apply these optimization rules to user-defined types. Built-in types together the axioms that apply to them are instances of more general algebraic structures. So are user-defined types and their associated axioms. We use the technique of generic programming, a programming paradigm to design efficient, reusable software libraries, to identify the commonality of classes of types, whether built-in or user-defined, convey the semantics of the classes of types to compilers, design scalable and effective program analysis for them, and eventually apply optimizing rules to the operations on them. In generic programming, algorithms and data structures are defined in terms of such algebraic structures. The same definitions are reused for many types, both built-in and user-defined. This dissertation applies generic programming to compiler analyses and transformations. Analyses and transformations are specified for general algebraic structures, and they apply to all types, both built-in and primitive types.
Systematic techniques for efficiently checking Software Product Lines
(2013-12) Kim, Chang Hwan Peter; Batory, Don S., 1953-; Khurshid, Sarfraz
A Software Product Line (SPL) is a family of related programs, which of each is defined by a combination of features. By developing related programs together, an SPL simultaneously reduces programming effort and satisfies multiple sets of requirements. Testing an SPL efficiently is challenging because a property must be checked for all the programs in the SPL, the number of which can be exponential in the number of features. In this dissertation, we present a suite of complementary static and dynamic techniques for efficient testing and runtime monitoring of SPLs, which can be divided into two categories. The first prunes programs, termed configurations, that are irrelevant to the property being tested. More specifically, for a given test, a static analysis identifies features that can influence the test outcome, so that the test needs to be run only on programs that include these features. A dynamic analysis counterpart also eliminates configurations that do not have to be tested, but does so by checking a simpler property and can be faster and more scalable. In addition, for runtime monitoring, a static analysis identifies configurations that can violate a safety property and only these configurations need to be monitored. When no configurations can be pruned, either by design of the test or due to ineffectiveness of program analyses, runtime similarity between configurations, arising due to design similarity between configurations of a product line, is exploited. In particular, shared execution runs all the configurations together, executing bytecode instructions common to the configurations just once. Deferred execution improves on shared execution by allowing multiple memory locations to be treated as a single memory location, which can increase the amount of sharing for object-oriented programs and for programs using arrays. The techniques have been evaluated and the results demonstrate that the techniques can be effective and can advance the idea that despite the feature combinatorics of an SPL, its structure can be exploited by automated analyses to make testing more efficient.
Traversal, Case Analysis, and Lowering for C++ Program Analysis
(2010-01-14) Wagner, Luke A.
To work effectively, programmers need tools to support their typical development activities, such as the creation, analysis, and transformation of source code. Analysis and transformation tools can be difficult to write for modern programming languages and, without a reusable framework, each tool must separately implement nontrivial algorithms like name lookup and type checking. This thesis describes an extension to one such framework, named Pivot, that focuses on programs written in C++. This extension, named Filter, assists the tool builder in traversal, case analysis, and lowering of the data structure representing C++ programs. Comparisons described in the thesis show a 2-4x code reduction when solving basic problems (e.g., searching for uses of a given declaration) using the extension and a performance overhead that drops below 2x for larger problems (e.g., checking C++ layout compatibility).

Browsing by Subject "Program analysis"

Results Per Page

Sort Options