[Libre-soc-dev] [RFC] Matrix and DCT/FFT SVP64 REMAP

Tue Jul 6 00:39:38 BST 2021

Another way to think of vector/matrix multiplication of 0-, 1-, or 2-dimensional operands, treated as a binary operator (two operands, one result), is to consider the dimensionality relation that must hold, namely:
result Target = operand A x operand B

dim(Target) = dim(A x B)
dim(A x B) = dim(A, 1) x dim(B, 2)
where dim(X, k) is the size of X along its k-th dimension (1 = rows, 2 = columns).
If A is an LxM matrix and B is an MxN matrix, then Target will be an LxN matrix.
(LxN) = (LxM) x (MxN)
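The dimensionality relation above can be sketched as a small check. This is only an illustration: the function name result_shape and the (rows, columns) tuple convention are my own assumptions, not anything from SVP64 or REMAP.

```python
def result_shape(a_shape, b_shape):
    """Return the Target shape (L, N) for A of shape (L, M) times B of shape (M, N).

    Shapes are (rows, columns) tuples; a scalar is (1, 1).
    Hypothetical helper, for illustration only.
    """
    l, m_a = a_shape
    m_b, n = b_shape
    if m_a != m_b:
        # The inner dimension M must agree, otherwise the product is undefined.
        raise ValueError(f"inner dimensions differ: {m_a} vs {m_b}")
    return (l, n)
```

For example, result_shape((2, 3), (3, 4)) gives (2, 4), while result_shape((2, 3), (2, 4)) raises ValueError because the inner dimensions do not match.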

0.  If L = M = N = 1, this is the original 0-D scalar product in which A, B, and Target are all scalars.  (no loops used in vector unit)
1.  If L = 1, M > 1, N = 1, then A and B are both vectors and this is a vector dot product with a scalar (1x1) result.  (one loop used in vector unit)
2.  a.  If L = 1, M > 1, N > 1, then A is a vector and B is a matrix.  This is a vector-matrix product whose result is a 1xN vector.  (two loops used in vector unit)
    b.  Likewise, if L > 1, M > 1, N = 1, then A is a matrix, B is a vector, and the result is an Lx1 vector.  (two loops used in vector unit)
    c.  If L > 1, M = 1, N > 1, then A and B are vectors and the result is an LxN matrix (the outer product).  (two loops used in vector unit)
3.  If L, M, and N are all > 1, then A and B are both matrices and this is a matrix product whose result is also a matrix.  (three loops used in vector unit)
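One way to read the list above is that the number of loops equals the number of dimensions among L, M, and N that are greater than 1. That reading is my own summary of the cases, not something stated explicitly, and the helper name loops_needed is hypothetical. A minimal sketch:

```python
def loops_needed(l, m, n):
    """Count how many of the dimensions L, M, N are greater than 1.

    Assumption: each dimension of size 1 needs no loop, which matches
    the loop counts given for cases 0, 1, 2.a, 2.b, 2.c and 3.
    """
    return sum(1 for d in (l, m, n) if d > 1)


# The cases from the list, using arbitrary example sizes.
cases = {
    "0":   (1, 1, 1),  # scalar product
    "1":   (1, 4, 1),  # dot product
    "2.a": (1, 4, 3),  # vector x matrix
    "2.b": (3, 4, 1),  # matrix x vector
    "2.c": (3, 1, 4),  # outer product
    "3":   (2, 3, 4),  # matrix x matrix
}
loop_counts = {name: loops_needed(*dims) for name, dims in cases.items()}
```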

When the inner dimension (M in the formulation above) is greater than 1, we need multiply-accumulate for that loop.  We can avoid initializing the target registers to zero by always using a plain multiply for the first iteration of the inner loop.  Then, if the inner dimension is greater than 1, all subsequent iterations of the inner loop use multiply-accumulate.
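As a rough software model of that idea (plain Python over lists of lists, not SVP64 code, and with a function name I made up), the first inner iteration overwrites the target and only the later ones accumulate:

```python
def matmul_no_zero_init(a, b):
    """Multiply an LxM matrix a by an MxN matrix b (lists of rows).

    The target is never zeroed: iteration k = 0 of the inner loop does a
    plain multiply (overwrite), and iterations k >= 1 multiply-accumulate.
    """
    l, m, n = len(a), len(b), len(b[0])
    if any(len(row) != m for row in a):
        raise ValueError("inner dimensions differ")
    # None stands in for "uninitialised register contents".
    target = [[None] * n for _ in range(l)]
    for i in range(l):
        for j in range(n):
            # First inner iteration: simple multiply, no accumulate.
            target[i][j] = a[i][0] * b[0][j]
            # Remaining inner iterations (only if M > 1): multiply-accumulate.
            for k in range(1, m):
                target[i][j] += a[i][k] * b[k][j]
    return target
```

With M = 1 the inner range(1, m) is empty, so the function degenerates to pure multiplies, e.g. matmul_no_zero_init([[1], [2]], [[3, 4]]) yields [[3, 4], [6, 8]].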

So, for cases 0 and 2.c above, where M = 1, no accumulate is used.
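To make the count concrete: with the multiply-first scheme, each of the LxN target elements gets one plain multiply and M - 1 multiply-accumulates. A small sketch of that arithmetic follows; the name op_counts is hypothetical and the counts assume one operation per element per inner iteration.

```python
def op_counts(l, m, n):
    """Return (plain_multiplies, multiply_accumulates) for an LxM by MxN product.

    Assumes the first inner-loop iteration is a plain multiply and the
    remaining M - 1 iterations are multiply-accumulates.
    """
    multiplies = l * n
    accumulates = l * n * (m - 1)
    return multiplies, accumulates
```

Here op_counts(1, 1, 1) and op_counts(3, 1, 4) both report zero multiply-accumulates (cases 0 and 2.c), while op_counts(2, 3, 4) gives 8 multiplies and 16 multiply-accumulates.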