[Libre-soc-isa] [Bug 1056] questions and feedback (v2) on OPF RFC ls010

Thu Jun 8 05:24:21 BST 2023

https://bugs.libre-soc.org/show_bug.cgi?id=1056

--- Comment #69 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Paul Mackerras from comment #58)
> (In reply to Luke Kenneth Casson Leighton from comment #51)
> > (In reply to Paul Mackerras from comment #41)
> > 
> > > I haven't seen a clear and unambiguous answer as to whether
> > > that is true or not. (You do seem to say it is true below, except that each
> > > such statement seems to have some sort of caveat on it.)
> > 
> > it is.  an Embedded Finite State Machine (and Libre-SOC's TestIssuer
> > does this) would:
> > 
> > * read the PO9-word
> > * cache the 24-bit RM area and prohibit interrupts
> > * read the next 32-bit word
> > * throw {24}{32} at decode+issue+execute and re-enable interrupts
> > 
> > and that is an important Micro-Architecture to have (minimum resources).
> > 
> > i considered at some point having an actual SPR to store the state in
> > between the two (the PO9-word and Defined-word-instruction) but i feel
> > it is a tiny bit overkill.
> 
> I think there is an important point here about how Simple-V is understood
> and explained, which I hope I can get you to reconsider. You (Luke) seem
> adamant that a vectorized instruction is not to be thought of as a prefixed
> instruction, whereas it seems to me that explaining it as a prefixed
> instruction has a lot of advantages and is the natural way to think about it.
> 
> The reasons for thinking of the instruction word containing the
> vectorization parameters as a prefix, and the following instruction word as
> a suffix, are:
> 
> * The meaning and interpretation of the first word depend on the second. So
> the first word is not an independent instruction in its own right. If the
> first word were an instruction in its own right then you would be able to
> tell me exactly what it does without reference to any following instruction.
> 
> * The meaning and interpretation of the second word do in fact get changed
> by the first word. The first word can change which registers are accessed,
> which parts of the registers, and other aspects such as what type of
> addition is performed (saturating vs. two's-complement).
> 
> * You can't allow any interrupt, not even machine check or system reset,
> between the execution of the first word and the execution of the second. In
> other words you can't really let the first word "instruction" complete until
> the second word has done its work.
> 
> * If you started to execute the function identified by the two words
> together and then wanted to take an interrupt before you were finished, you
> would need to set SRR0 to point to the first word, not the second. But if
> you think of the two words as two separate instructions, you would naturally
> set SRR0 to point to the second word, which would be wrong.
> 
> * If the two words are really two separate instructions, then you don't
> really have any grounds for prohibiting certain instructions as the second
> word.
> 
> All of that says to me that the pair of words looks like a prefixed
> instruction and acts like a prefixed instruction. So why not just call it
> one?

i was looking for the right phraseology to protect the Decode
Phase from having arbitrary opcodes dropped into it, and also
to ensure Orthogonality (which is not the same as "high performance",
see comment #64).

basically: agreed, yes, it is much better to call it a prefix.

i managed to work it out: what must not change between the
prefixed and unprefixed forms is the RTL and the operands
(nor the Reserved areas). instructions *have* to be added
(even if performance sucks) to both the prefixed and unprefixed
areas with the exact same RTL and operands, or not at all.
that applies even to Unvectorized ones.

if there are any exceptions to that they must go through the
same Compliancy Level optional groupings as the Scalar set,
for exactly the same reasons that the Compliancy Subsets exist:
to avoid software chaos.

> > the problem with tagging is that it becomes part of the Architectural
> > State (an SPR or in this case *group* of SPRs), which massively
> > complexifies simulators debuggers etc.
> > but also context-switch becomes absolute hell.
> 
> Right.
> 
> Thinking about this in a way that requires some kind of state to be set by
> the execution of the first word which then affects what the second word
> does, really does seem to me to add complexity and not aid comprehension.

it is a variant of ARM SVE's MOVPRFX
https://www.google.com/search?q=movprfx+sve

but there they use actual GPRs as the "intermediate state" that
may be elided in high-performance designs.  aka "macro-op fused",
the defining characteristic of which is that the output from
instruction (1) is both the source *and destination* of
instruction (2).

(in SVP64 it would be the 24 RM bits that are written by the
first "instruction" - the prefix - then read by the second - the
suffix - which also overwrites them with zero again. and they
would go to an SPR, not a GPR).
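
a rough Python sketch of that hypothetical two-step behaviour, purely
to illustrate the idea. every name in it (ToyIssuer, rm_latch, step
and so on) is made up for the example, and the field extraction is
deliberately simplified: it treats the top 6 bits of a word as the
primary opcode and the low 24 bits as RM, which is an assumption and
not the actual SVP64 bit layout.

```python
PO9 = 9  # primary opcode of the prefix word, as discussed in this thread


class ToyIssuer:
    """Toy model: the prefix latches 24 RM bits, the suffix consumes and clears them."""

    def __init__(self, memory, pc=0):
        self.memory = memory    # dict: byte address -> 32-bit word
        self.pc = pc
        self.rm_latch = 0       # hypothetical intermediate state (assumption)
        self.prefix_pc = None   # address of a pending prefix, if any
        self.issued = []        # (rm, word) pairs handed to decode/issue

    def interrupts_allowed(self):
        # no interrupt may land between the prefix and its suffix
        return self.prefix_pc is None

    def restart_address(self):
        # what SRR0 would need to hold: the prefix, never the suffix
        return self.pc if self.prefix_pc is None else self.prefix_pc

    def step(self):
        word = self.memory[self.pc]
        if self.prefix_pc is None and (word >> 26) == PO9:
            # "first instruction": write the latch (simplified: low 24 bits)
            self.rm_latch = word & 0xFFFFFF
            self.prefix_pc = self.pc
        else:
            # "second instruction": read the latch, then zero it again
            self.issued.append((self.rm_latch, word))
            self.rm_latch = 0
            self.prefix_pc = None
        self.pc += 4
```

after one step() on a prefix word, interrupts_allowed() is False and
restart_address() still points at the prefix. after the next step()
the pair has been issued and the latch is back to zero. an unprefixed
word is simply issued with RM=0.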

> Any kind of hidden state tends to create difficulties (I think the only
> hidden state we have in the ISA at the moment is the reservation), and if
> you expose that state then you make it possible to be set by other means,
> which opens another whole can of worms.

macro-op fusion is pretty common in RISC ISA implementations, because
of the inefficiency involved in running the individual instructions.
fusing makes one Write-after-Write, one Read-after-Write *and* one
Write-after-Read all disappear, and one (internal, micro-coded)
Function Unit and Reservation Station is needed instead of two.
even an in-order system benefits because there is less in the
pipelines (one op not 2).
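
a minimal sketch of that fusion pattern, assuming a made-up micro-op
record (Op, can_fuse and fuse are hypothetical names, not taken from
any real implementation). it only checks the defining characteristic
mentioned above: the output of the first op is both a source and the
destination of the second.

```python
from collections import namedtuple

# hypothetical micro-op record: one destination register, a tuple of sources
Op = namedtuple("Op", ["name", "dest", "srcs"])


def can_fuse(first, second):
    """True if first's output is both a source and the destination of second."""
    return first.dest in second.srcs and first.dest == second.dest


def fuse(first, second):
    """Combine the pair into one op, so the intermediate value is never
    written out as a separate register result."""
    if not can_fuse(first, second):
        raise ValueError("pair does not match the fusion pattern")
    # the fused op reads first's sources plus second's remaining sources
    srcs = first.srcs + tuple(s for s in second.srcs if s != first.dest)
    return Op(first.name + "+" + second.name, second.dest, srcs)
```

for example, fusing Op("movprfx", "z0", ("z1",)) with
Op("add", "z0", ("z0", "z2")) gives a single op that reads z1 and z2
and writes z0.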

in general having the state saveable greatly decreases implementation
complexity, and SVP64 is no different in that regard.

> Having an actual SPR to expose that intermediate state would be a bad idea
> in my opinion because it would destroy the property of being able to tell
> what registers are read and written just from the instruction word(s). I
> realize that SV already lacks that property for any vectorized instruction,
> but making that intermediate state explicit, and settable by other means
> (e.g. mtspr) would destroy that property for any instruction with register
> operands (i.e. almost all of them).

true, but what really clinched it for me - why i didn't add it - is that
it is yet another SPR to save/restore.  i wanted SV to be only one
SPR (okok REMAP aside).  when SVSTATE was 32-bit in an early draft
it would have been possible to save the 24 RM bits. by the time the
dust settled only 4 bits remained spare in SVSTATE.
