[Libre-soc-dev] [RFC] Matrix and DCT/FFT SVP64 REMAP
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Mon Jul 5 04:17:03 BST 2021
On 7/5/21, Jacob Lifshay <programmerjake at gmai
> that's even worse, since now you need a whole temporary matrix
nonsense, it is not "worse", it has been the entire goal of REMAP
since its inception.
a set of registers into which the entire in-place matrix is computed.
that has always been the goal of this exercise.
you may have a misunderstanding of the definition of "in-place".
in this case it also means, "zero register spill".
the entire matrix is computed:
1) in place
2) with no register copies
3) with no LOAD/STOREs (other than those which might be required to
load or store the input or output)
4) with a single instruction (plus schedule)
that is the goal and it has been achieved.
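
to illustrate the four points above, here is a minimal sketch in
python. it is not the actual SVP64 REMAP hardware schedule: the
function names (matmul_schedule, run_remapped_fmac), the flat "regs"
list standing in for the register file, and the i/j/k loop order are
all assumptions made purely for illustration. the idea shown is that
one multiply-accumulate operation is repeated, with a schedule
supplying the register offsets, so the result lands directly in its
destination registers with no copies and no LOAD/STOREs.

```python
def matmul_schedule(rows_a, cols_a, cols_b):
    """Yield (dest, src_a, src_b) register offsets, one per multiply-accumulate.

    Hypothetical schedule: a plain triple loop. The real REMAP loop order
    may differ; only the index-remapping idea is illustrated here.
    """
    for i in range(rows_a):
        for j in range(cols_b):
            for k in range(cols_a):
                yield (i * cols_b + j, i * cols_a + k, k * cols_b + j)


def run_remapped_fmac(regs, base_c, base_a, base_b, rows_a, cols_a, cols_b):
    """Apply one multiply-accumulate per schedule step, directly on regs.

    No temporaries are allocated and no values are moved between registers:
    each step reads two source registers and accumulates into a destination.
    """
    for c, a, b in matmul_schedule(rows_a, cols_a, cols_b):
        regs[base_c + c] += regs[base_a + a] * regs[base_b + b]


# Toy "register file": A in regs 0..3, B in regs 4..7, C in regs 8..11.
regs = [0.0] * 32
regs[0:4] = [1.0, 2.0, 3.0, 4.0]  # A = [[1, 2], [3, 4]], row-major
regs[4:8] = [5.0, 6.0, 7.0, 8.0]  # B = [[5, 6], [7, 8]], row-major

run_remapped_fmac(regs, base_c=8, base_a=0, base_b=4,
                  rows_a=2, cols_a=2, cols_b=2)
result = regs[8:12]  # C = A x B, row-major
```

the destination registers must start at zero (or at whatever value is
to be accumulated onto), since every step adds into them.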
any other goals involving partial load / stores and register spill are
out of scope for the purposes of this current discussion, as they are
a lot more complex i.e. have had entire PhDs and academic papers
dedicated to them.
(efficient algorithms for insufficient resources for matrix
multiplication was one of the exercises we did at Imperial).
now, it would be fantastic to cover partial matrix computation
involving LOAD/STORE. however, that is too much to include in the
current scope: i know how complex it is, and unless someone else comes
forward with the algorithms and does the work, i don't want to add it
to the already massive list of work that i have.