Language and tool support for multilingual programs



Journal Title

Journal ISSN

Volume Title



Programmers compose programs in multiple languages to combine the advantages of innovations in new high-level programming languages with decades of engineering effort in legacy libraries and systems. For language inter-operation, language designers provide two classes of multilingual programming interfaces: (1) foreign function interfaces and (2) code generation interfaces. These interfaces embody the semantic mismatch for developers and multilingual systems builders. Their programming rules are difficult or impossible to verify. As a direct consequence, multilingual programs are full of bugs at interface boundaries, and debuggers cannot assist developers across these lines.

This dissertation shows how to use composition of single language systems and interposition to improve the safety of multilingual programs. Our compositional approach is scalable by construction because it does not require any changes to single-language systems, and it leverages their engineering efforts. We show it is effective by composing a variety of multilingual tools that help programmers eliminate bugs. We present the first concise taxonomy and formal description of multilingual programming interfaces and their programming rules. We next compose three classes of multilingual tools: (1) Dynamic bug checkers for foreign function interfaces. We demonstrate a new approach for automatically generating a dynamic bug checker by interposing on foreign function interfaces, and we show that it finds bugs in real-world applications including Eclipse, Subversion, and Java Gnome. (2) Multilingual debuggers for foreign function interfaces. We introduce an intermediate agent that wraps all the methods and functions at language boundaries. This intermediate agent is sufficient to build all the essential debugging features used in single-language debuggers. (3) Safe macros for code generation interfaces. We design a safe macro language, called Marco, that generates programs in any language and demonstrate it by implementing checkers for SQL and C++ generators. To check the correctness of the generated programs, Marco queries single-language compilers and interpreters through code generation interfaces. Using their error messages, Marco points out the errors in program generators.

In summary, this dissertation presents the first concise taxonomy and formal specification of multilingual interfaces and, based on this taxonomy, shows how to compose multilingual tools to improve safety in multilingual programs. Our results show that our compositional approach is scalable and effective for improving safety in real-world multilingual programs.