
dc.contributor.author: Johnsson, S. Lennart
dc.contributor.author: Ortiz, Luis F.
dc.date.accessioned: 2016-01-21T21:27:02Z
dc.date.issued: 1992
dc.identifier.citation: Johnsson, S. Lennart and Luis F. Ortiz. 1992. Local Basic Linear Algebra Subroutines (LBLAS) for Distributed Memory Architectures and Languages with Array Syntax. Harvard Computer Science Group Technical Report TR-09-92. [en_US]
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:24829622
dc.description.abstract: We describe a subset of the level-1, level-2, and level-3 BLAS implemented for each node of the Connection Machine system CM-200. The routines, collectively called LBLAS, have interfaces consistent with languages with an array syntax such as Fortran 90. One novel feature, important for distributed memory architectures, is the capability of performing computations on multiple instances of objects in a single call. The number of instances and their allocation across memory units, and the strides for the different axes within the local memories, are derived from an array descriptor that contains type, shape, and data distribution information. Another novel feature of the LBLAS is a selection of loop order for rank-1 updates and matrix-matrix multiplication based upon array shapes, strides, and DRAM page faults. The peak efficiencies for the routines are in excess of 75%. Matrix-vector multiplication achieves a peak efficiency of 92%. The optimization of loop ordering has a success rate exceeding 99.8% for matrices for which the sum of the lengths of the axes is at most 60. The success rate is even higher for all possible matrix shapes. The performance loss when a nonoptimal choice is made is less than ~15% of peak and typically less than 1% of peak. We also show that the performance gain for high rank updates may be as much as a factor of 6 over rank-1 updates. [en_US]
dc.description.sponsorship: Engineering and Applied Sciences [en_US]
dc.language.iso: en_US [en_US]
dash.license: LAA
dc.title: Local Basic Linear Algebra Subroutines (LBLAS) for Distributed Memory Architectures and Languages with Array Syntax [en_US]
dc.type: Research Paper or Report [en_US]
dc.description.version: Version of Record [en_US]
dc.date.available: 2016-01-21T21:27:02Z

