[Libre-soc-bugs] [Bug 238] POWER Compressed Formal Standard writeup

Sun Nov 29 06:10:47 GMT 2020

https://bugs.libre-soc.org/show_bug.cgi?id=238

--- Comment #92 from Alexandre Oliva <oliva at gnu.org> ---
> that's 8192 MUXes which are.. what... 5 gates each?  that's a whopping 40,000 gates which, to give you some idea of scale, alexander, is 3x times larger than a 64-bit multiplier.

fortunately, that's *way* exaggerated.

that's because (i) the shifters for the earlier instructions only have to look
at initial bytes, and (ii) the shifters for the later instructions only have
to look at the few actual possibilities that can be reached by prior-insn
combinations.

Say we're looking at a buffer of 32 bytes, and instructions can be at most
8 bytes long.  Can one of the 4 instructions we're trying to process start at
byte 25, or any later byte?  Certainly not.  The latest potential starting
position is byte 24, and that's only reachable at all if the opcodes at 0, 8
and 16 encode 64-bit instructions.
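
A minimal sketch of that reachability argument, assuming instructions are
exactly 2, 4 or 8 bytes long (the 16-, 32- and 64-bit encodings mentioned in
this comment) and that the first instruction starts at byte 0.  It ignores
the execution mode and any odd-byte alignment, and the names
(reachable_starts, LENGTHS, SLOTS, BUFFER_BYTES) are made up for
illustration.  It only counts candidate (slot, offset) pairs, not gates:

```python
# Assumed instruction lengths in bytes: 16-, 32- and 64-bit encodings.
LENGTHS = (2, 4, 8)
SLOTS = 4           # instructions we try to process at once
BUFFER_BYTES = 32   # size of the input buffer


def reachable_starts(slots=SLOTS, lengths=LENGTHS):
    """Return, for each slot, the sorted byte offsets it can start at."""
    starts = [{0}]  # the first instruction always starts at byte 0
    for _ in range(slots - 1):
        # The next slot starts wherever a previous-slot insn can end.
        starts.append({s + n for s in starts[-1] for n in lengths})
    return [sorted(s) for s in starts]


if __name__ == "__main__":
    per_slot = reachable_starts()
    for slot, offsets in enumerate(per_slot):
        print(f"slot {slot}: {offsets}")
    pruned = sum(len(offsets) for offsets in per_slot)
    naive = SLOTS * BUFFER_BYTES  # every slot looking at every byte
    print(f"candidate (slot, offset) pairs: {pruned} instead of {naive}")
```

Under these assumptions the last slot never starts past byte 24, and the
number of candidate starting positions per slot is far smaller than "every
slot may start at every byte" would suggest.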

Odd bytes after 17 don't even have to be looked at.  Byte 15 doesn't either.

And then, the even/odd distinction enables us to further reduce that ballpark,
because it reduces potential encoding ambiguities between 16- and 32-bit
opcodes, which I believe reduces the number of gates.  IOW, it's not so obvious
to me that it actually makes things worse.

But yeah, it's a big muxer, with 257 inputs (the initial execution mode
matters) and 256 outputs (assuming compressed insns are mapped to their
uncompressed versions by this very muxer, otherwise add a mode bit to each insn
for the remapper to know whether it needs further decoding).

As an alternative to 257 inputs, we might keep compressed-mode insns always at odd
bytes in the input to the muxer, by keeping a mode-switching nop as the first
byte in the input buffer as long as we're in compressed mode.  The muxer could
very well manage to ignore it entirely, instead of filling in the first decoded
slot with a nop.  Indeed, we could configure the muxer to skip as many
mode-switching nops as makes sense, though each additional one adds some
gates, unlike (ISTM) the first one.

All that said, I haven't done Karnaugh map minimizations for hardware purposes
in a long time, and never such massive ones, so my intuitions may be way off. 
I'd have to work it out more thoroughly to trust them ;-)
