Performance Modeling of Distributed Memory Architectures
Citation: Johnsson, S. Lennart. 1991. Performance Modeling of Distributed Memory Architectures. Harvard Computer Science Group Technical Report TR-10-91.
Abstract: We provide performance models for several primitive operations on data structures distributed over memory units interconnected by a Boolean cube network. In particular, we model single-source and multiple-source concurrent broadcasting or reduction, concurrent gather and scatter operations, shifts along several axes of multi-dimensional arrays, and emulation of butterfly networks. We also show how the processor configuration, data aggregation, and the encoding of the address space affect the performance of two important basic computations: the multiplication of arbitrarily shaped matrices and the Fast Fourier Transform. In addition, we give an example of the performance behavior of local matrix operations on a processor with a single path to local memory and a set of registers. The analytic models are verified by measurements on the Connection Machine model CM-2.
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:24947960