[Libre-soc-dev] svp64

Sun Dec 20 12:15:06 GMT 2020

so, i apologise jacob, i should have made it clear.  the context is:
we are NOT introducing radical new design concepts at this late stage.
the context is: take the EXISTING work and decisions and get them into
OpenPOWER ISA as fast as possible, because we are at least 8-12 months
behind.

the time for radical new redesign concepts was over 18 months ago.  so
if there are any new ideas, *record* them (because we may have to
revisit), but please, *do not* expect that they should be inserted
into the timeline for evaluation - right now.  we *do not have time*.

to emphasise: again, i have said this a number of times: if we want to
jeopardise the project by extending the timeline so far that we no
longer have funding from NLnet, we can proceed down this path.
NLnet's funding *ends* in around 11 (eleven) months time.

inline comments continue below.

On Sun, Dec 20, 2020 at 6:44 AM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> On Sat, Dec 19, 2020, 06:01 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
> wrote:
>
> > On Friday, December 18, 2020, Jacob Lifshay <programmerjake at gmail.com>
> > wrote:
> > > On Fri, Dec 18, 2020, 15:35 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
> > > wrote:
> > >> * still do not know what the best arrangement for CRs is.
> > >>
> > >
> > > I'm for the arrangement that mirrors the register layout I picked for
> > > FP/Int registers.
> >
> >
> > CR[i] is the notation used by the OpenPower spec to refer to CR field #i,
> > so FP instructions with Rc=1 write to CR[1] aka SVCR1_000.
> >
> >
> > so when vectorisation is enabled CR[2] and onwards are destroyed.
>
>
> nope. only when VL > 8 is CR[2] destroyed.

which CR[2]? how?

> CR registers in VL order (element numbers assuming write starts at CR[1]):
>
> SVCR1_000 aka. CR[1] -- used for element 0
> SVCR1_001 no corresponding CR[...] -- used for element 1
> SVCR1_010 no corresponding CR[...] -- used for element 2
> SVCR1_011 no corresponding CR[...] -- used for element 3
> SVCR1_100 no corresponding CR[...] -- used for element 4
> SVCR1_101 no corresponding CR[...] -- used for element 5
> SVCR1_110 no corresponding CR[...] -- used for element 6
> SVCR1_111 no corresponding CR[...] -- used for element 7
> SVCR2_000 aka. CR[2] -- used for element 8
> SVCR2_001 no corresponding CR[...] -- used for element 9
> SVCR2_010 no corresponding CR[...] -- used for element 10
> SVCR2_011 no corresponding CR[...] -- used for element 11
> SVCR2_100 no corresponding CR[...] -- used for element 12
> SVCR2_101 no corresponding CR[...] -- used for element 13
> SVCR2_110 no corresponding CR[...] -- used for element 14
> SVCR2_111 no corresponding CR[...] -- used for element 15
> SVCR3_000 aka. CR[3] -- used for element 16
>
> Note that I'm thinking Rc=1 vector instructions should always start at
> CR[6] aka. SVCR6_000 instead of CR[0] or CR[1], since that will match where
> the mask starts to be read from when using CRs as mask registers.
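
For readers trying to follow the table above, the numbering it lists can be
sketched in a few lines of Python. This is only a sketch under stated
assumptions, not the agreed design: it assumes each OpenPOWER CR field CR[i]
keeps the name SVCRi_000 and that three extra LSBs interleave seven new
fields after it, exactly as the quoted table shows. The function names
svcr_name and cr_alias, and the parameters start_field and extra_bits, are
hypothetical and appear nowhere in the thread.

```python
def svcr_index(element, start_field=1, extra_bits=3):
    """Linear index into the expanded CR file for a given vector element.

    Assumption: CR[i] sits at index i << extra_bits, and successive
    elements simply increment that index by one.
    """
    return (start_field << extra_bits) + element


def svcr_name(element, start_field=1, extra_bits=3):
    """Name of the SVCR field written for a given vector element."""
    index = svcr_index(element, start_field, extra_bits)
    field = index >> extra_bits                 # OpenPOWER CR field number
    sub = index & ((1 << extra_bits) - 1)       # the extra LSBs
    return f"SVCR{field}_{sub:0{extra_bits}b}"


def cr_alias(element, start_field=1, extra_bits=3):
    """OpenPOWER CR[...] name if the element lands on one, else None."""
    index = svcr_index(element, start_field, extra_bits)
    if index & ((1 << extra_bits) - 1) == 0:
        return f"CR[{index >> extra_bits}]"
    return None


if __name__ == "__main__":
    # Reproduces the quoted table, elements 0 to 16, starting at CR[1].
    for element in range(17):
        alias = cr_alias(element) or "no corresponding CR[...]"
        print(svcr_name(element), alias, "-- element", element)
```

Under these assumptions element 8 is the first to land on CR[2], which is
the "only when VL > 8" claim, and svcr_name(0, start_field=6) gives
SVCR6_000 for the suggested CR[6] starting point.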

i don't understand how the linear relationship exists so i can't
evaluate what you're saying.  the fact that CR[N] no longer has a
linear relationship means that i can't understand what you mean when
you say "CR[2]" or "CR[6]".

and - this is the kicker - *we don't have time to go over it to make
it clear*.  i repeat again: it has taken 8 days to only just begin to
be able to ask *questions* about what you're proposing.  this is a bad
sign, and we're well past the cut-off point for radical new ideas.

> The hw implementation of what I proposed is utterly simple, just add 2/3
> lsb bits to all reg fields everywhere (including non-SV, they just are
> zeros for non-SVP64 instructions).

ok, so how can CR[0]..CR[7] be accessed linearly?  they can't,
can they?  are the OpenPOWER names changing?  is CR[6] at position 6?
i can't tell.  is CR[6] now renamed to CR6?  i'm not asking these
questions because i expect answers: i'm asking them to illustrate that
*it's far too complicated* and that we don't have time.

the fact that it's now *eight days* and i'm only just beginning to
understand the scope of the modifications you're proposing: that's
eight days *wasted*, not eight days gained.

we're not going to be able to get those eight days back.

> If we spend a little effort planning ahead

jacob: *we don't have the time*.  there's no budget, and there's no time.

> we can avoid a lot of the SIMD
> troubles with every future expansion

which will be in 4-5 years time *and we do not have time to go over
this right now*.

> requiring a whole new ISA which we're
> partially inheriting by having compiler-allocated 64-bit backing registers
> for vectors instead of RVV-style expand-as-big-as-you-please giant
> registers. The scheme I proposed is designed to handle expanding the
> register file to as big as we please (limited to powers of 2, of course) by
> interleaving more registers between the existing registers.

which is something that should have been proposed and discussed
*eighteen months ago*.

> It can also
> handle backward compatibility to both OpenPower v3.1 as well as versions of
> itself with a smaller register file by having setvli's extra bits switch
> the cpu to a mode where it skips the new registers when vectorizing
> instructions (basically changing the register number increment to 2^n
> instead of 1).
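
The "increment of 2^n instead of 1" idea in the quoted paragraph can be
illustrated with a short sketch. It assumes registers are numbered in the
expanded, interleaved space, so that an original register r sits at
r << n, and that the compatibility mode only changes the per-element step.
The function element_registers and its parameter skip_bits are hypothetical
names for illustration; the thread does not specify how setvli would
encode this.

```python
def element_registers(base, vl, skip_bits=0):
    """Register numbers touched by a vectorised instruction.

    base      -- starting register number in the expanded numbering
    vl        -- vector length (number of elements)
    skip_bits -- n, giving a per-element increment of 2**n
                 (0 means the ordinary increment of 1)
    """
    step = 1 << skip_bits
    return [base + i * step for i in range(vl)]


if __name__ == "__main__":
    # Ordinary mode: consecutive registers, interleaved ones included.
    print(element_registers(8, 4))               # [8, 9, 10, 11]
    # Compatibility mode with n=3: only every 8th register is used,
    # i.e. only the registers that existed before the expansion.
    print(element_registers(8, 4, skip_bits=3))  # [8, 16, 24, 32]
```

With three extra bits, a step of 8 visits only the positions whose extra
LSBs are zero, which is how the interleaved registers would be skipped.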

great.  so please be strict about this: document it, then forget it.

this is an important lesson for you, jacob, when it comes to project
management.  there are different phases and times.  new ideas -
redesigns - can be extremely disruptive if proposed at a point when
the context has moved on to one that's under time-crunch.

i have a map in my head of how things need to be implemented, i have a
map in my head of the design of SV, and the context *has* to move to
"implement this, right now, as fast as possible, as soon as possible".

trying to compare that *massive* map against a new proposal, when i have
both short-term memory problems and a strange form of dyslexia: i can't
handle it.  18
months ago, i could handle it: we weren't under time-pressure back
then.

so, to recap:

* incremental *small* ideas are not a problem (such as adding a new
peripheral instruction)
* major *necessary* ideas are a problem that we just have to suck up
(such as adding CR support, which doesn't exist in RV)
* major disruptive ideas which take time to understand and evaluate:
document them, then forget them.  they can be revisited during
"re-evaluation".

l.