Publication: Automated Translation: Generating a Code Generator
Open/View Files
Date
Authors
Published Version
Published Version
Journal Title
Journal ISSN
Volume Title
Publisher
Citation
Abstract
A key problem in retargeting a compiler is to map the compiler's intermediate representation to the target machine's instruction set. One method to write such a mapping is to use grammar-like rules to relate a tree-based intermediate representation with an instruction set. A dynamic-programming algorithm finds the least costly instructions to cover a given tree. Work in this family includes BURG, BEG, and twig. The other method, utilized by gcc and VPO, uses a hand-written "code expander" which expands intermediate representation into naive code. The naive code is improved via machine-independent optimizations while maintaining it as a sequence of machine instructions. Because they are inextricably linked to a compiler's intermediate representation, neither of these mappings can be reused for anything other than retargeting one specific compiler. λ-RTL is a language for specifying the semantics of an instruction set independent of any particular intermediate representation. We analyze the properties of a machine from its λ-RTL description, then automatically derive the necessary mapping to target architecture. By separating such analysis from compilers' intermediate representations, λ-RTL in conjunction with our work allows a single machine description to be used to build multiple compilers, along with other tools such as debuggers or emulators. Our analysis categorizes a machine's storage locations as special registers, general-purpose registers, or memory. We construct a data-movement graph by determining the most efficient way to move arbitrary values between locations. We use this information at compile time to determine which temporary locations to use for intermediate results of large computations. To derive a mapping from an intermediate representation to a target machine, we first assume a compiler-dependent translation from the intermediate representation to register-transfer lists. We discover at compile-compile time how to translate these register-transfer lists to machine code and also which register-transfer lists we can translate. To do this, we observe that values are either constants, fetched from locations, or the results of applying operators to values. Our data-movement graph cover constants and fetched values, while operators require an appropriate instruction to perform the effect of the operator. We search through an instruction set discovering instructions to implement operators via the use of algebraic identities, inverses, and rewrite laws and the introduction of unwanted side effects.