The preliminary tests with matrix-matrix multiplication show that the C-XSC library needs to be optimized in several ways to be efficient in a high performance environment (up to now the main goal of C-XSC was functionality and portability, not speed)....
similarity transformation preserves the spectrum, only the (n-1)\times (n-1) principal finite section of X_Q may be recovered by the n\times n principal finite sections of R and X_P—an observation of Kautsky and Golub [30] that follows from triangular and tridiagonal matrix multiplication...
The matrix form uses a matrix-matrix multiplication that is very fast on modern hardware, rather than multiple matrix-vector multiplications that would be inevitably slower. Using these methods, ridge regression can be solved efficiently on large numbers of voxels. Efficient solver for multiple hyper...
factorization of a banded Toeplitz matrix of large order and relatively small bandwidth. We investigate the asymptotic behavior of the main diagonal entries of the R factor when the matrix dimension goes to infinity. © 2005 Elsevier Inc. All rights reserved. AMS classification: 47B35. Keywords: Toeplitz determinant; Banded Toeplitz matrix; QR factorization. 1. Introduction...
of EP0314968 A multi-bit overlapped scanning multiplication system assembles modified partial products in a reduced, non-rectangular banded matrix. The rows of the matrix, except for the first and last, are extended with bands of encoded extensions of limited length at the right and left ends ...
This matches the FFT count for a triangular Toeplitz-matrix-vector multiplication. Clearly then, our method improves on the many algebraic FFT-based processes that first determine the triangular Toeplitz matrix that is the inverse of a BDDTT matrix in order to then solve the associated BDDTT ...
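For readers unfamiliar with the operation mentioned in the snippet above, the following is a minimal sketch of an FFT-based lower triangular Toeplitz matrix-vector product. It is not the BDDTT method of the cited work. It only assumes the standard fact that a lower triangular Toeplitz matrix with first column t acts on x as a truncated linear convolution. The function names (fft, lower_toeplitz_matvec, lower_toeplitz_matvec_direct) are hypothetical and chosen for illustration.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    With invert=True the conjugate transform is computed, without the 1/n scaling.
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def lower_toeplitz_matvec(t, x):
    """y = T x where T[i][j] = t[i - j] for i >= j and 0 otherwise.

    y equals the first n entries of the linear convolution of t and x,
    computed here with two forward FFTs and one inverse FFT.
    """
    n = len(t)
    size = 1
    while size < 2 * n:  # zero-pad so circular convolution equals linear convolution
        size *= 2
    ft = fft([complex(v) for v in t] + [0j] * (size - n))
    fx = fft([complex(v) for v in x] + [0j] * (size - n))
    prod = fft([p * q for p, q in zip(ft, fx)], invert=True)
    return [prod[i].real / size for i in range(n)]


def lower_toeplitz_matvec_direct(t, x):
    """Reference O(n^2) product, used only to check the FFT version."""
    n = len(t)
    return [sum(t[i - j] * x[j] for j in range(i + 1)) for i in range(n)]
```

For small inputs the two functions agree up to floating-point rounding, for example with t = [1, 2, 3] and x = [1, 1, 1]. The FFT version pays off only for large n.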
The dgbmv routine, which performs general banded matrix-vector multiplication in double precision, is the most basic Level-2 operation in BLAS. First, we designed a reproducible algorithm for banded matrix-vector multiplication, repro_dgbmv, based on the technique of error-free transformation. Then...
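As an illustration of the two ingredients named in the snippet above, the sketch below combines a banded matrix-vector product over BLAS-style band storage with Knuth's TwoSum, a classic error-free transformation. It is not the repro_dgbmv algorithm of the cited work, whose details are not given here. The helper names (two_sum, to_band, gbmv) and the compensated-sum strategy are assumptions made for illustration, and only the additions are compensated, not the products.

```python
def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    bp = s - a
    e = (a - (s - bp)) + (b - bp)
    return s, e


def to_band(dense, kl, ku):
    """Pack an m x n matrix with kl sub- and ku super-diagonals into band storage.

    Layout follows the dgbmv convention (0-based): band[ku + i - j][j] = dense[i][j].
    """
    m, n = len(dense), len(dense[0])
    band = [[0.0] * n for _ in range(kl + ku + 1)]
    for j in range(n):
        for i in range(max(0, j - ku), min(m, j + kl + 1)):
            band[ku + i - j][j] = dense[i][j]
    return band


def gbmv(m, n, kl, ku, band, x):
    """Compute y = A x for a banded A, touching only entries inside the band.

    Each row sum is accumulated with TwoSum; the rounding errors of the
    additions are collected in c and added back at the end.
    """
    y = []
    for i in range(m):
        s = 0.0
        c = 0.0
        for j in range(max(0, i - kl), min(n, i + ku + 1)):
            s, e = two_sum(s, band[ku + i - j][j] * x[j])
            c += e
        y.append(s + c)
    return y


# Usage: a 4 x 4 tridiagonal matrix (kl = ku = 1).
A = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
y = gbmv(4, 4, 1, 1, to_band(A, 1, 1), [1.0, 2.0, 3.0, 4.0])
```

Compensation of this kind reduces the rounding error of each row sum. Bitwise reproducibility across different summation orders, which is the goal stated in the snippet, generally needs further machinery that is not shown here.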
The main tool that PBSDC uses is a parallel structured matrix multiplication algorithm (PSMMA), which can be much faster than the general dense matrix multiplication ScaLAPACK routine PDGEMM. Numerous experiments have been performed on the Tianhe-2 supercomputer to compare PBSDC with PBDC and ELPA. ...