Parallel Linear Algebra

This project broadly aims to investigate parallel linear algebra algorithms, chiefly for dense matrices, and to provide usable, high-performance dense linear algebra codes for the AP series of multicomputers.

The Distributed BLAS (DBLAS) library, originally developed for the AP1000 and AP1000+, has been ported to and tuned for several other platforms, including the AP3000. It has also been extended to use MPI directly and to cover complex precision data types. It has been highly optimized for all aspects of performance, including reduced software overheads. Other work on libraries includes the production of double and complex precision UltraSPARC BLAS, the implementation of the Stride VPPLib messaging library, and an enhanced BLACS library for the AP3000 under APRuntime version 2. See the Distributed BLAS Homepage for more details, including portable software releases.

The Distributed BLAS library has been used to implement LU, LL^T, LDL^T and QR factorization algorithms, using advanced load balancing techniques such as algorithmic blocking, lookahead and panel scattering. All of these techniques have yielded significant performance gains over traditional methods (storage blocking), even on platforms where communication costs are relatively high, such as the AP3000 and the Intel Paragon. See the Extend ScaLAPACK Homepage for how these techniques have been applied to improve ScaLAPACK performance.

Parallel LDL^T factorization codes, based on the DBLAS, have been incorporated into Fujitsu's ACCUFIELD electromagnetic compatibility application (Sep 1999).

Other work on linear algebra includes the development of a sparse direct parallel solver for the Fujitsu VPP-300 and the AP3000.

Recent publications from this project can be found on Peter Strazdins' publications page.

Research Group

Peter E. Strazdins (Project Leader)
Bing B. Zhou
Jeremy Dawson
Viet Nguyen



Peter Strazdins
1999-12-16