Person: Holloway, Glenn
Email Address
AA Acceptance Date
Birth Date
Research Projects
Organizational Units
Job Title
Last Name
First Name
Name
Search Results
Publication HELIX: Automatic Parallelization of Irregular Programs for Chip Multiprocessing.
(Association for Computing Machinery, 2012) Campanoni, Simone; Jones, Timothy; Holloway, Glenn; Reddi, Vijay Janapa; Wei, Gu-Yeon; Brooks, DavidWe describe and evaluate HELIX, a new technique for automatic loop parallelization that assigns successive iterations of a loop to separate threads. We show that the inter-thread communication costs forced by loop-carried data dependences can be mitigated by code optimization, by using an effective heuristic for selecting loops to parallelize, and by using helper threads to prefetch synchronization signals. We have implemented HELIX as part of an optimizing compiler framework that automatically selects and parallelizes loops from general sequential programs. The framework uses an analytical model of loop speedups, combined with profile data, to choose loops to parallelize. On a six-core Intel® Core i7-980X, HELIX achieves speedups averaging 2.25 x, with a maximum of 4.12x, for thirteen C benchmarks from SPEC CPU2000.
Publication Helix: Making the Extraction of Thread-Level Parallelism Mainstream
(Institute of Electrical & Electronics Engineers (IEEE), 2012) Campanoni, Simone; Jones, Timothy; Holloway, Glenn; Wei, Gu-Yeon; Brooks, DavidImproving system performance increasingly depends on exploiting microprocessor parallelism, yet mainstream compilers still don't parallelize code automatically. Helix automatically parallelizes general-purpose programs without requiring any special hardware; avoids slowing down compiled programs, making it a suitable candidate for mainstream compilers; and outperforms the most similar historical technique that has been implemented in production compilers.
Publication The HELIX project
(IEEE, 2012) Campanoni, Simone; Jones, Timothy; Holloway, Glenn; Wei, Gu-Yeon; Brooks, DavidParallelism has become the primary way to maximize processor performance and power efficiency. But because creating parallel programs by hand is difficult and prone to error, there is an urgent need for automatic ways of transforming conventional programs to exploit modern multicore systems. The HELIX compiler transformation is one such technique that has proven effective at parallelizing individual sequential programs automatically for a real six-core processor. We describe that transformation in the context of the broader HELIX research project, which aims to optimize the throughput of a multicore processor by coordinated changes in its architecture, its compiler, and its operating system. The goal is to make automatic parallelization mainstream in multiprogramming settings through adaptive algorithms for extracting and tuning thread-level parallelism.
Publication Quality and speed in linear-scan register allocation
(1997) Traub, Omri; Holloway, Glenn; Smith, MichaelA linear-scan algorithm directs the global allocation of register candidates to registers based on a simple linear sweep over the program being compiled. This approach to register allocation makes sense for systems, such as those for dynamic compilation, where compilation speed is important. In contrast, most commercial and research optimizing compilers rely on a graph-coloring approach to global register allocation. In this paper, we compare the performance of a linear-scan method against a modern graph-coloring method. We implement both register allocators within the Machine SUIF extension of the Stanford SUIF compiler system. Experimental results show that linear scan is much faster than coloring on benchmarks with large numbers of register candidates. We also describe improvements to the linear-scan approach that do not change its linear character, but allow it to produce code of a quality near to that produced by graph coloring.