[Libre-soc-dev] adding PartitionedSignal support to nmigen's If/Switch/Case

lkcl luke.leighton at gmail.com
Tue Sep 21 22:26:03 BST 2021



On September 21, 2021 7:50:29 PM UTC, Jacob Lifshay <programmerjake at gmail.com> wrote:
>Generating a bunch of Ifs, one for each maximally-partitioned
>PartitionedSignal lane will theoretically work for If, though it will be
>incredibly inefficient

genuinely don't care.  it allows us to make forward progress where we are seriously behind.

> and i don't think yosys will have an easy time
>optimizing it back to sane levels

also don't care.  that's for a future Grant Request, one dedicated to optimisation.


>There is also the problem that all code
>inside the If will still need to be vectorized (not a simple process)...

dead simple.  replace all Signal with PartitionedSignal.

done.

can be done with a global search/replace.

then, back in the Pipeline API you propagate a Partition, which is used to initialise all Stages.

this should also be reasonably easy to do in a regular fashion.

i have been planning this for a long time, Jacob.

will this trick work in all cases? no it won't, and that's ok.  we adapt as we go along.


> essentially ending up with a very similar problem to the
>original idea of just reimplementing yosys's process passes.

nope.

every ALU's current Signal() usage is replaced with PartitionedSignal.

there are a few gotchas, but not many.

one of them is that, as i hinted at the start of the thread, the mantissa and exponent size for FP will change depending on the Partition size.

53 bit mantissa for 1x64 bit

24 bit mantissa for 2x32 bit

11 bit mantissa for 4x16 bit (counting the hidden bit, as in the two above)

that means that it is nowhere near as straightforward as it is for e.g. the Integer add.

consequently there will be a little bit of... shenanigans needed, to be able to select the right number of bits to copy, depending on the Partition State.

even if that has to be done manually, i don't have a problem with it.
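
as a rough illustration of why the field widths move with the Partition size, here is a plain-Python sketch (not nmigen, and not the ieee754fpu code). the widths are the standard IEEE 754 binary64/binary32/binary16 layouts; the names FP_LAYOUT and split_lanes are made up for this example.

```python
# Plain-Python sketch, not nmigen: IEEE 754 field widths per lane width.
# FP_LAYOUT and split_lanes are hypothetical names used for illustration only.

# lane width -> (exponent bits, stored fraction bits).
# The mantissa precision is the stored fraction plus one implicit leading bit.
FP_LAYOUT = {
    64: (11, 52),  # binary64: 53-bit mantissa
    32: (8, 23),   # binary32: 24-bit mantissa
    16: (5, 10),   # binary16: 11-bit mantissa
}


def split_lanes(word, lane_width, total_width=64):
    """Split a packed word into (sign, exponent, fraction) tuples, lowest lane first."""
    exp_bits, frac_bits = FP_LAYOUT[lane_width]
    fields = []
    for i in range(total_width // lane_width):
        # isolate lane i, then peel off its three fields
        lane = (word >> (i * lane_width)) & ((1 << lane_width) - 1)
        frac = lane & ((1 << frac_bits) - 1)
        exp = (lane >> frac_bits) & ((1 << exp_bits) - 1)
        sign = lane >> (lane_width - 1)
        fields.append((sign, exp, frac))
    return fields
```

the same 64 bits give one (sign, exp, frac) tuple at 1x64, two at 2x32 and four at 4x16, each with different shift amounts. that per-partition selection of "how many bits to copy" is the part that does not fall out of a plain search/replace.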

>essentially...we will need to have the same
>vectorization/control-flow-into-data-flow-conversion code, the code just
>has a slightly different output format.

all taken care of by the simple process of using PartitionedSignal everywhere, and following some conventions [e.g. PartitionedMux has all bits set or all bits clear]

PartitionedSignal.__eq__, __ge__ and __le__ have *already been written* to generate all-set or all-clear within partitions, as opposed to setting one single bit.
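
to show what the all-set / all-clear convention buys, here is a plain-Python model. it is only a sketch of the idea, not the actual PartitionedSignal or PartitionedMux code, and lane_eq / lane_mux are names invented for the example.

```python
# Plain-Python model of the convention, not the real PartitionedSignal code.
# lane_eq and lane_mux are hypothetical names used for illustration only.

def lane_eq(a, b, lane_width, total_width=32):
    """Compare lane by lane: an equal lane becomes all ones, an unequal lane all zeros."""
    lane_mask = (1 << lane_width) - 1
    result = 0
    for i in range(total_width // lane_width):
        shift = i * lane_width
        if ((a >> shift) & lane_mask) == ((b >> shift) & lane_mask):
            result |= lane_mask << shift
    return result


def lane_mux(sel, if_true, if_false, total_width=32):
    """Bitwise select. Only correct per lane if sel is all-set or all-clear in each lane."""
    full = (1 << total_width) - 1
    return (sel & if_true) | (~sel & full & if_false)
```

with 8-bit lanes, lane_eq(0x11223344, 0x11FF33FF, 8) gives 0xFF00FF00, and feeding that straight into lane_mux(0xFF00FF00, 0xAAAAAAAA, 0x55555555) gives 0xAA55AA55. because the comparison result is already lane-wide, the mux needs no knowledge of where the partition boundaries are.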

this was all planned and executed well over 18 months ago.



>However, beyond even the difficulties of If, Switch/Case has the additional
>problem of deciding what code like the following means:
># assume it can be partitioned to at least 4x8-bit or 1x32-bit
>v = make_partitioned_signal()
>with m.Switch(v):
>    # per-lane matching, or matching whole 32-bit word, or something else?
>    with m.Case('0101---- 11110000 -1-1-1-1 00000000'):
>        m.d.comb += x.eq(3)
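
to make the two readings of that Case pattern concrete, here is a plain-Python sketch. it is not nmigen and says nothing about what m.Case actually does with a PartitionedSignal; matches, match_whole_word and match_per_lane are names made up for the example, and '-' is treated as don't-care with the pattern written MSB first.

```python
# Plain-Python sketch of the two possible readings; not nmigen behaviour.
# matches, match_whole_word and match_per_lane are hypothetical names.

PATTERN = "0101---- 11110000 -1-1-1-1 00000000"


def matches(value, pattern):
    """Match an integer against an MSB-first bit pattern where '-' is don't-care."""
    bits = pattern.replace(" ", "")
    for pos, ch in enumerate(reversed(bits)):  # pos 0 is the LSB
        if ch != "-" and ((value >> pos) & 1) != int(ch):
            return False
    return True


def match_whole_word(value):
    """Reading 1: the pattern is one 32-bit match, giving a single yes/no."""
    return matches(value, PATTERN)


def match_per_lane(value):
    """Reading 2: each 8-bit group is matched against its own lane, lowest lane first."""
    groups = PATTERN.split()[::-1]  # reverse so index 0 is the lowest lane
    return [matches((value >> (8 * i)) & 0xFF, g) for i, g in enumerate(groups)]
```

for 0x5AF05500 both readings agree (a match everywhere). for 0x5AF05501 the whole-word reading says no match at all, while the per-lane reading still matches in the upper three lanes, so x would be assigned in some lanes and not in others.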

initially i thought this was out of scope due to the majority of instances being Signals, e.g. m.Switch(ctx.op) with m.Case(OP_ADD), m.Case(OP_XOR) and so on.

i remembered however that there is a length-based switch in e.g. OP_POPCOUNT.

but...yeah, even there, it is still part of the operation, and it is prohibited to have the ALUs run different instructions, therefore, again, it doesn't arise.

have to keep an eye on that one, but i have no issue with ripping out problematic m.Switch statements and replacing them with PartitionedSignal-aware m.If statements.

even if 10% of the code gets modified it's still better than chucking out 100%

l.


