[Libre-soc-dev] v3.1B prefix
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Mon Dec 14 07:18:13 GMT 2020
On 12/14/20, Alexandre Oliva <oliva at gnu.org> wrote:
> 32-bit uncompressed instructions: 109169
> 16-bit compressed instructions: 7732
> 16-imm compressed-mode instructions: 14509
> 10-bit compressed instructions: 4057
> 10-bit mode-switching nops: 3398
> 10-bit mode-switching nops for imm-16: 10821
> 16-bit mode-switching nops after imm-16: 1876
> 10-bit nop+16-imm pairs above, backtracked to 32-bit: 9171
> Compressed size estimate: 521462
> Original size: 541868
> Compressed/original ratio: 0.962341
drat. ah btw, was that with the better reg nums i found? i used
insn-histogram.py and it gave about a 3% improvement. just committed it,
although i just realised that the 4 reg versions need to be on the
best 4 *relevant* regs, not the global top 4 regs.
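
to illustrate the "relevant vs global" point, here is a minimal sketch. it is
not the code in insn-histogram.py: the function name, the (mnemonic, regs)
input format and the sample data are all made up for illustration. it just
shows that a per-instruction histogram can pick different registers than one
global histogram.

```python
from collections import Counter, defaultdict

def top_regs(insns, k=4):
    """Return (global top-k regs, per-mnemonic top-k regs).

    insns is an iterable of (mnemonic, regs) pairs. This input format is
    an assumption for this sketch, not the format insn-histogram.py uses.
    """
    global_hist = Counter()
    per_op_hist = defaultdict(Counter)
    for mnemonic, regs in insns:
        global_hist.update(regs)            # one histogram for everything
        per_op_hist[mnemonic].update(regs)  # one histogram per instruction
    global_top = [r for r, _ in global_hist.most_common(k)]
    per_op_top = {op: [r for r, _ in hist.most_common(k)]
                  for op, hist in per_op_hist.items()}
    return global_top, per_op_top

# Made-up sample data, chosen so that no counts tie inside the top 2.
sample = [
    ("add", (3, 3, 4)),
    ("add", (3, 4, 5)),
    ("add", (3, 3, 4)),
    ("ld", (9, 1)),
    ("ld", (9, 1)),
    ("ld", (9, 1)),
    ("ld", (10, 1)),
]

global_top, per_op_top = top_regs(sample, k=2)
print(global_top)   # registers ranked across all instructions
print(per_op_top)   # registers ranked separately for each mnemonic
```

with this sample the global top 2 misses r4 for "add" and r9 for "ld", which
is the effect described above.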
> Nearly 20% of the instructions can be compressed,
interesting. that would be great if it could be met all the time.
> lost because of the need for explicit mode switching. We just don't
> have enough compressible insn density to get much better than
that was always the concern with the versions that used the 10bit to
do a "countdown" (next N ops are 16bit)
what are the statistics on the number of consecutive Compressed opportunities?
if the number of consecutive C ops averages above 4 then it's worth
wasting 1 16bit slot just to set 16bit mode for up to N instructions.
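
a rough sketch of how that statistic could be gathered, assuming the
instruction stream has already been reduced to a list of booleans
(True = compressible). the function names and the cost model are assumptions
for illustration: 4 bytes per 32bit op, 2 bytes per 16bit op, and one 16bit
slot spent entering 16bit mode. any cost of leaving the mode and any alignment
padding are ignored, so the real break-even point will be higher than this
model suggests.

```python
from itertools import groupby

def compressible_runs(flags):
    """Lengths of each maximal run of consecutive compressible insns."""
    return [sum(1 for _ in grp) for is_c, grp in groupby(flags) if is_c]

def bytes_saved(run_len, switch_cost=2):
    """Bytes saved by compressing one run, under the assumed cost model.

    Uncompressed: 4 bytes per instruction.
    Compressed: switch_cost bytes to enter 16bit mode, then 2 bytes each.
    """
    return 4 * run_len - (switch_cost + 2 * run_len)

# Made-up stream: True marks an instruction that has a 16bit form.
flags = [True, True, False, True, True, True, True, False, False, True]

runs = compressible_runs(flags)
mean_run = sum(runs) / len(runs)
print(runs, mean_run)
print([bytes_saved(n) for n in runs])
```

a run of length 1 saves nothing in this model, which is why isolated
compressible ops do not help and the average run length is the number to
look at.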
> But then again, there are too many uncompressible insns among them to
> make the mode switching worth it.
damn damn damn. i suspect this is why VLE invented an entire new encoding.
> My hunch is that we'd have to more than double the percentage of
> compressible insns to make compressed mode shine.
this sounds about right.
i have an idea though.
what if we said that there were 4 bits per reg (16 possible regs), and
that the transition to 16bit was a "countdown" of up to 8 ops?
or, because there are 10 bits available, how about 3 3 3 so:
* up to 8 16bit ops
* up to 8 32bit ops
* up to 8 16bit then finally drop out
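
the 3 3 3 split above could be packed like this. this is only a sketch: the
field order, the function names and the "store count minus 1" choice (so that
3 bits cover 1..8) are assumptions, not a defined encoding. it uses 9 of the
10 available bits.

```python
def pack_333(n16_first, n32, n16_last):
    """Pack three counts (each 1..8) into a 9-bit field, stored as count-1."""
    for n in (n16_first, n32, n16_last):
        if not 1 <= n <= 8:
            raise ValueError("each count must be in 1..8")
    return ((n16_first - 1) << 6) | ((n32 - 1) << 3) | (n16_last - 1)

def unpack_333(field):
    """Inverse of pack_333: return (n16_first, n32, n16_last)."""
    return (((field >> 6) & 7) + 1,
            ((field >> 3) & 7) + 1,
            (field & 7) + 1)

def schedule(field):
    """Expand the field into the width, in bits, of each following op."""
    n16_first, n32, n16_last = unpack_333(field)
    return [16] * n16_first + [32] * n32 + [16] * n16_last

field = pack_333(2, 1, 3)
print(field, schedule(field))
```

after the last entry in the schedule the decoder would drop back to normal
32bit mode.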
the 2 bits currently taken with N and M can be assumed to be allocated
to increasing 3bit reg#s to 4 bit for now.
yes it is not perfect, some ops are src1 src2 dest, they can't all be 4 bit