[Libre-soc-isa] [Bug 697] SVP64 Reduce Modes

bugzilla-daemon at libre-soc.org
Wed Mar 23 18:10:46 GMT 2022


https://bugs.libre-soc.org/show_bug.cgi?id=697

--- Comment #23 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Luke Kenneth Casson Leighton from comment #22)
> (In reply to Jacob Lifshay from comment #21)
> > imho moving is absolutely necessary if you want to have tree-reductions run
> > quickly without needing a full-crossbar somewhere on the ALU output -> other
> > ALU input path.
> 
> not the ISA-design-level's problem and therefore invalid.
> 
> > here, i define running quickly to mean not needing to delay extra clock
> > cycles because you're running register values through a separate
> > slow-but-general lane-crossing mechanism.
> 
> this is an architectural designers decision that should in absolutely
> no way cause us with "ISA Design Hats" on to base our decisions on.
> 
> > basically, a tree-reduction only needs to move data in a few set inter-lane
> > paths, but if you skip moving, then you need to be able to read from any
> > high-index element when combining with lower index elements.
> 
> you are conflating architectural internals with the responsibility of ISA
> designers.
> 
> the architectural designer may choose to spot the patterns of ADDs (or
> whatever)
> and insert the prerequisite MVs as micro-ops *at their discretion*
> 
> a FSM architecture (TestIssuer) has no problems of any kind
> 
> another architectural designer may have 12R8W register files
> and therefore have no lanes of any kind.
> 
> another designer may have cyclic shift buffers which substitute for
> full crossbars.
> 
> etc etc etc etc.
> 
> none of these architectural decisions have anything *at all* to do with
> ISA-level design except inasmuch that they all have to be considered
> (and to some extent "implementation advice hints" given)

well, your argument for *not* having moves rests on mostly the same things,
iirc: whether ALUs can also do moves is *also* only a microarchitectural
decision.

> 
> second, bear in mind, that the schedule of ops is entirely deterministic
> based on having read the predicate mask.  ops may be issued (scheduled)
> and analysed long before actual execution takes place.

that's true, but it applies equally to the case with moves, so it isn't a good
argument for choosing either moves or no-moves.

> 
> no MVs.

having moves allows a much simpler algorithm imho, and gives the reduction
result a consistent destination (element 0, rather than whatever spot the
remap algorithm happens to leave it in).
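
To make the "result always lands in element 0" point concrete, here is a
minimal Python sketch of a predicated tree-reduction with moves. It is only an
illustration under my own assumptions, not the spec's algorithm: the function
name, the list-based vector and mask, and the None return for an all-zero mask
are all hypothetical.

```python
import operator


def tree_reduce_with_moves(vec, mask, op):
    """Sketch: reduce the masked-in elements of vec with a binary tree.

    Hypothetical helper for illustration only. Where only the higher-index
    element of a pair is active, its value is *moved* down, so the final
    result always ends up in element 0.
    """
    vec = list(vec)      # work on copies, leave the caller's data alone
    live = [bool(m) for m in mask]
    n = len(vec)
    step = 1
    while step < n:
        for i in range(0, n, step * 2):
            other = i + step
            if other >= n or not live[other]:
                continue                      # nothing to combine or move
            if live[i]:
                vec[i] = op(vec[i], vec[other])   # combine, lower index first
            else:
                vec[i] = vec[other]               # the move
                live[i] = True
        step *= 2
    # Assumption: an all-zero mask has no result, reported here as None.
    return vec[0] if n and live[0] else None


total = tree_reduce_with_moves([1, 2, 3, 4, 5], [0, 1, 0, 1, 1], operator.add)
```

Operands are always combined lower-index-first, so element order is preserved
for non-commutative ops; each pass only touches element pairs a fixed
power-of-two distance apart.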

I can work out the HDL details if you like. Would an FSM that generates the
tree-reduction ops be sufficient for now?
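
As a rough idea of what such an FSM would produce, here is a Python generator
standing in for it. This is a sketch under my own assumptions, not HDL and not
a settled design: the name reduction_schedule and the ("op"/"mv", dst, src)
tuple format are hypothetical. It only looks at the predicate mask, so the
whole schedule is known before any element is executed.

```python
def reduction_schedule(mask):
    """Sketch: yield ('op' | 'mv', dst, src) steps for a tree reduction.

    Hypothetical stand-in for an issue FSM. The schedule depends only on
    the predicate mask, never on the data being reduced.
    """
    live = [bool(m) for m in mask]
    n = len(live)
    step = 1
    while step < n:
        for dst in range(0, n, step * 2):
            src = dst + step
            if src >= n or not live[src]:
                continue                  # no active partner: emit nothing
            # Combine if dst already holds a value, otherwise move src down.
            yield ("op" if live[dst] else "mv", dst, src)
            live[dst] = True
        step *= 2


schedule = list(reduction_schedule([0, 1, 0, 1, 1]))
```

With the mask above the sketch yields two moves (1 to 0, 3 to 2) followed by
two combining ops into element 0.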
