[Libre-soc-dev] [RFC] SVP64 Vertical-First Mode loops
luke.leighton at gmail.com
Sun Aug 22 12:10:32 BST 2021
On August 21, 2021 10:52:01 PM UTC, Richard Wilbur
<richard.wilbur at gmail.com> wrote:
>On Sat, Aug 21, 2021 at 3:42 PM lkcl <luke.leighton at gmail.com> wrote:
>> On August 21, 2021 9:30:21 PM UTC, Richard Wilbur
><richard.wilbur at gmail.com> wrote:
>> >(The hard result cache needn’t be tied specifically to REMAP, it
>> >could be used by normal vector or scalar code.)
>> ya know... another name for "fast small hard result cache" is
> "fast", yes. "small", not necessarily.
which would need explaining to the ISA WG: "why are we duplicating the
functionality of a register file, including adding explicit instructions to
transfer between the new type of register file and the standard GPR/FPR?"
also if it is particularly large you run into latency issues.
>> everything you described has the identical properties of a register
>That's sort of what we want but don't have space in the instruction
>format for the bits to specify the register numbers, right?
correct. and don't want to (a) modify v3.0B or (b) go retrospectively
back and alter the SVP64 RM field.
> So I see
>this as an opportunity to create an algorithm-specific method of
>addressing the new "registers".
which in turn requires a means and method of actually accessing
those new registers.
> Another advantage of this scheme is
>that it is never in need of saving and restoring with a context
this isn't true: i can foresee circumstances where two processes will need
to use different constants.
honestly richard although at first glance it seems like a good idea,
it's really no different from "A Register File".
plus, really, a way is needed for *all* instructions to read from
"The Registers/Cache" not just one or two, because if it's just
one ("move from one register/cache to the GPR/FPR") then
that's one extra instruction inside inner loops
and if it's merged into a "specialist" instruction (DCT coefficient
multiply) we just caused what was previously a potentially
useful generic twin mul-add instruction to become a non-generic
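
to put a rough number on the "one extra instruction inside inner loops"
point, here is a minimal counting sketch. it is not a real SVP64 instruction
sequence: the function name, the 64 iterations and the one-instruction loop
body (standing in for a single twin mul-add) are all assumptions made purely
for illustration.

```python
# Hypothetical cost model: counts issued instructions only, ignores latency,
# loop-control overhead and anything specific to a real implementation.
def inner_loop_instruction_count(iterations, body_instructions,
                                 extra_moves_per_iteration=0):
    """Total instructions issued by an inner loop of the given shape."""
    return iterations * (body_instructions + extra_moves_per_iteration)

# Assumed case 1: the constant is already in a GPR/FPR, so the body is
# just the one arithmetic instruction.
baseline = inner_loop_instruction_count(64, 1)

# Assumed case 2: each iteration first needs one "move from cache to GPR/FPR".
with_move = inner_loop_instruction_count(64, 1, extra_moves_per_iteration=1)

# Relative overhead of the extra move for this (assumed) loop shape.
overhead = (with_move - baseline) / baseline
print(baseline, with_move, overhead)
```

the shorter the loop body, the worse the relative hit from the extra move,
which is why a way for *all* instructions to read the new storage directly
matters.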
all these things need to be thought through - in full - unfortunately,
when it comes to ISA design. then, when you've spent several
days/weeks outlining the entire lot, you then have to spend several
more days/weeks making a comparative analysis against *existing*
part of that analysis involves
* "what's the cost of implementing this" as well as
* "what's the cost to CHANGE an EXISTING implementation" and
* "how much work is it to create a Conformance Validation Test Suite" and
* "what will the ISA WG think about this proposal, what will they ask"