[Libre-soc-isa] [Bug 697] SVP64 Reduce Modes
bugzilla-daemon at libre-soc.org
Thu Mar 24 00:21:48 GMT 2022
https://bugs.libre-soc.org/show_bug.cgi?id=697
--- Comment #24 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #23)
> well, your argument for *not* having moves is based off of mostly the same
> things iirc:
> having alus also be able to do moves is *also* only a microarchitectural
> decision.
ALUs never do MVs. ok, very basic ones would, usually as a pseudo-op: add rt, ra, 0
> >
> > second, bear in mind, that the schedule of ops is entirely deterministic
> > based on having read the predicate mask. ops may be issued (scheduled)
> > and analysed long before actual execution takes place.
>
> that's true, it applies to the case with moves too, so isn't a good argument
> to choose either moves or no-moves.
when combined with Vertical First mode it will melt people's brains.
it would also complicate ExtraV coherence if the separated logic had
to decide to suddenly do a 2-operand MV rather than a 3-operand or
4-operand ADD/MUL/whatever.
too much.
> > no MVs.
>
> having moves allows a much simpler algorithm imho,
probably, but where's the fun in that? :)
also retrospectively i found that DCT REMAP
turned out to use Gray Coding. that was...
unexpected and beautiful, and i would never
have noticed the simplicity of it if i had
thought "let's take the simple route"
> as well as having a
> consistent place where the reduction result goes (element 0, rather than
> whatever random spot the remap algorithm happens to leave it).
i know. i worked through the caveats: the only case left invalid would
be a single active element (all others predicated out).
even with 2 active elements the result should / would land in the correct
result-indexed destination element, regardless of the positions of the
2 active predicate bits.
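
To make the predicated case concrete, here is a minimal Python sketch
of one possible move-free schedule. It is an assumption for
illustration, not the algorithm under discussion in this bug: the
names predicated_reduce_indices, holder and run are hypothetical.
When only the right-hand half of a pair is active, the sketch
redirects an index instead of issuing a MV, so the result ends up in
the lowest-numbered active element. With a single active bit it
issues no operation at all, which is one way to read the
single-element caveat.

```python
def predicated_reduce_indices(pred):
    """Yield (dest, src) pairs, each meaning v[dest] = v[dest] op v[src].

    pred is a list of booleans, one per element. Hypothetical sketch:
    no MV is ever issued. holder[i] names the element that currently
    holds the partial result of the subtree starting at position i.
    """
    vl = len(pred)
    holder = list(range(vl))
    step = 1
    while step < vl:
        # pair each subtree with its right-hand neighbour at this level
        for left in range(0, vl - step, step * 2):
            right = left + step
            li, ri = holder[left], holder[right]
            if pred[li] and pred[ri]:
                yield li, ri          # both halves active: combine them
            elif pred[ri]:
                holder[left] = ri     # only right active: redirect, no MV
        step *= 2


def run(vec, pred, op):
    """Apply the schedule to a copy of vec using a scalar op."""
    vec = list(vec)
    for dest, src in predicated_reduce_indices(pred):
        vec[dest] = op(vec[dest], vec[src])
    return vec


if __name__ == "__main__":
    # only elements 2 and 5 of 8 are active
    pred = [False, False, True, False, False, True, False, False]
    print(list(predicated_reduce_indices(pred)))
```

In this sketch the two active elements are combined by a single op
whose destination is element 2, wherever the two predicate bits sit.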
> I can work out the HDL details if you like. Would creating a fsm that
> creates the tree-reduction ops be sufficient for now?
given that this will end up being a type of REMAP,
can i recommend following the path i did there with
MMATRIX and FFT DCT, which was:
* a braindead obvious python demo algorithm
* conversion to a yield generator which purely
  returns indices
* that get blatted on top of a scalar op
* then integrate the yield generator into ISACaller
* and then implement the HDL FSM
yes it is a lot of work: it is why MM FFT DCT took me
about 8 weeks or more.
ISACaller integration will be essential, so might as well
do it incrementally.
the first priority would therefore be to do the braindead
demo. i think the first version i did for MATRIX REMAP
didn't even do the mmult itself, just printed out the
indices.
also i would like to see what the algorithm generates
(the indices) to see if it is in fact workable.
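
As a rough idea of what such a first demo could look like, here is a
minimal Python sketch of a yield generator that returns only indices
for an unpredicated, in-place pairwise tree reduction. The names
tree_reduce_indices and apply_schedule are hypothetical, and the
schedule is an assumption for illustration, not the REMAP algorithm
itself. It is run with a non-power-of-two length to show that this
particular schedule does not need one.

```python
def tree_reduce_indices(vl):
    """Yield (dest, src1, src2) element indices for an in-place
    pairwise tree reduction over vl elements (hypothetical demo)."""
    step = 1
    while step < vl:
        for i in range(0, vl, step * 2):
            other = i + step
            if other < vl:            # skip pairs that run off the end
                yield i, i, other
        step *= 2


def apply_schedule(vec, op):
    """Blat the index schedule on top of a scalar op, on a copy of vec."""
    vec = list(vec)
    for dest, src1, src2 in tree_reduce_indices(len(vec)):
        vec[dest] = op(vec[src1], vec[src2])
    return vec


if __name__ == "__main__":
    # first step: just print the indices, do no arithmetic at all
    for dest, src1, src2 in tree_reduce_indices(7):
        print(f"v[{dest}] = v[{src1}] op v[{src2}]")
```

Passing a scalar op to apply_schedule, for example
apply_schedule(range(1, 8), lambda a, b: a + b), leaves the reduction
result in element 0 of the returned list.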
DCT/FFT REMAP has caveats: power-of-two.
no point spending time doing HDL FSM if the algorithm turns
out to be borked. need to find that out early.
will find links later