[Libre-soc-bugs] [Bug 713] PartitionedSignal enhancement to add partition-context-aware lengths

Thu Oct 7 13:43:29 BST 2021

https://bugs.libre-soc.org/show_bug.cgi?id=713

--- Comment #27 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #23)

> why go through all that when you can just derive the signal width from the
> lane shapes -- avoiding wasting extra bits or running out of bits, as well
> as avoiding the need to manually specify size? Having the size be computed
> doesn't really cause problems imho...

because the idea is to do a literal global/search/replace "Signal"
with "PartitionedSignal" right throughout the entirety of the ALU
codebase.

and the input argument on all current-Signals is not an elwidth, 
it's an "overall width".

(actually, have a PSpec which dynamically determines at run/compile
time *which* class to use: Signal or PartitionedSignal).

when i said "there will be NO fundamental changing of the ALU code"
i REALLY meant it.

this is a hard critical requirement.  we do not have time to waste
on total disruptive redesigns of the existing ALU codebase, tens
of thousands of lines of work and just as many unit tests.

this type of "minimal code-changes" is something i learned the hard
way 24 years ago.

when transitioning to python programming it became even more important
because i witnessed massive python projects, 350,000 to a million lines,
literally falling apart because of low-level fundamental changes where
unit test coverage was either inadequate or in one particular
pretty-much-Corporate-suicidal case, non-existent.

that one was a bundle of fun: the CTO it turned out was sub-consciously
but actively sabotaging efforts to stabilise the codebase, because he
himself hadn't been able to stabilise it in 5 years, therefore he didn't
see why anyone else could either... oh and therefore made *certain* that
was true by changing low-level base classes and committing them to the
active development branch.

bottom line is: i am responsible for ensuring that the codebase is
stable, continuously-operational, and that changes are made with the
absolute bare minimum disruption and in an incremental non-disruptive
manner.

if we want a more complex design it *has* to be done in an *evolutionary*
fashion through a chained-series of *SMALL* incremental patches, one after
the other.

every time you say "but this feature would be great" i have to do a
MASSIVE re-evaluation of the knock-on implications down that chain,
which i am currently maintaining in my head.

this is why it is critically important that, because you haven't been
helping with the HDL for over 18 months, and aren't familiar with it,
to get back into the flow you help do the task that is needed, which
*i* define [i am quite happy to admit that i am slightly embarrassed
to have to put it like that].

if you had helped with the HDL codebase full-time, when i made the
announcement that TestIssuer was to begin development, you would now have
the working knowledge needed.

(In reply to Jacob Lifshay from comment #24)
> (In reply to Luke Kenneth Casson Leighton from comment #22)
> > (In reply to Jacob Lifshay from comment #20)
> > 
> > a quick readthrough:
> > 
> > > def layout(elwid, part_counts, lane_shapes):
> > 
> > using the FP example dict from comment #2, i think you might have
> > misunderstood:
> > the idea is not to create a set of *separate* PartitionPoints
> > one per elwidth, the idea is to create a *merged* set of PartitionPoints
> > (and associated Signal mask)
> 
> this is a misunderstanding, which i'm assuming by your later comments you
> figured out...

i hadn't.

> The code creates *one* PartitionPoints based on an input elwidth
> signal/value.

the usage for-loop confused me from the examples, when it set that
elwidth to zero width:

for i in range(4):
    pprint((i, layout(i, part_counts, unsigned(3))))

if i am reading you correctly, that's a zero-width Signal.

if the example had been:

for i in range(4):
    i = Signal(i) # Signal of width between 0 and 3
    pprint((i, layout(i, part_counts, unsigned(3))))

i would have understood.  maybe. and also wondered why a Signal of width
zero is being passed in.

(In reply to Jacob Lifshay from comment #25)

> I think we should keep those lines, since it allows us to completely specify
> all aspects of a PartitionedSignal's shape/layout/etc.

this defeats the object of the exercise, which is to show you how to do
incremental pragmatic development, and to keep "feet-on-the-ground".

> by giving it one
> Python value, lane_shapes. elwid and part_counts are retrieved from a
> global, they're the equivalent of SimdPartMode. The idea is that code will
> look kinda like this:
> self.elwid = Signal(ElWid)

for practical reasons this is always going to be Elwid=2.  i get
that it would be "nice to have" support for other data types...
.... but... they... are... not... critical.

if they need to be added, they can be added... *aaaafterr* the first
patch lands which adds support for setting Elwidth=2 as a hard-coded
parameter.

that would be a second incremental patch, and that's fine.
but it is NOT fine if it's done even before it's been demonstrated
to even be necessary.

understand?

> with set_global_in_scope(self.elwid, ElWid.PART_COUNTS):
>     ...
>     self.a = PartitionedSignal(3) # all lanes are unsigned(3)
>     self.b = PartitionedSignal(signed(5)) # all lanes are signed(5)
>     # all lanes are whatever shape StateEnum is
>     self.state = PartitionedSignal(StateEnum)

this is the general idea (but self.state *also* hidden behind the
ContextManager)

it needs to be more along the lines of:

with set_global_in_scope(pspec, self.elwid, ElWid.PART_COUNTS) as ps:

     self.a = ps.PSpecContextSensitiveSignalType(3) # all lanes are unsigned(3)

where from pspec the *actual* type of Signal to be deployed is picked up.

that can be **EITHER** a PartitionedSignal (compile-time option 1)
**OR** it can be Signal                    (compile-time option 2)

hence why the arguments need to be exactly the same, and why the dsl.Module
needs to be exactly the same, and why ast.* need to behave exactly the
same.

remember this has 18 months of planning behind it.

there *will* be *zero* code-modifications to turn the entire ALU set
to SIMD.  (see below: if that turns out not to be true, we *make* it
true.  i suspect there are some places where ALU changes might indeed be
needed, sigh, i'm *hoping* this turns out not to be the case).

>     self.mantissa = PartitionedSignal({
>         ElWid.F16: 5, # from memory, probably off

:)

>         ElWid.BF16: 8,
>         ElWid.F32: 8,
>         ElWid.F64: 11,

and yes, this could be one of the exceptions, although i am also hoping that
this can be "derived" from the existing (one-line?) dictionary
that sets the *scalar* FP pipeline types.  i forget exactly where it is.

> In particular, notice that you can use nmigen Shapes without needing to
> write a dict everywhere, greatly reducing the modifications needed to our
> existing ALUs/FUs/etc.

except that standard nmigen Signal doesn't *support* those advanced features,
neither does the existing codebase use them, therefore it is completely
unnecessary to add this as a feature *which we're not going to use*, it is
a waste of time.

even in the remote possibility that my 18+ month experience of having been
the person involved in the development of every single pipeline was not
correct, invalidating that assessment, there are two possibilities:

1) we re-evaluate, and we re-evaluate *at the time that it is needed*

   (which saves wasting time developing something that's not needed,
    which will only confuse other people "why is this here when it's
    not needed or used? it's complicated and unnecessary")

2) we make the tough decision to break the rule "no changes to any
   pipeline code" and AVOID the need entirely.

   (making sure in the process that at every single step of those
    changes that the code still operates when PSpec "mode==scalar,
    please use Signal not PartitionedSignal throughout the entire
    ALU codebase" is set)

bottom line this is all about making the bare minimum code-changes,
across what is already an extremely complex set of inter-connected
parts.

-- 
You are receiving this mail because:
You are on the CC list for the bug.