[Libre-soc-dev] Compressed instructions

Luke Kenneth Casson Leighton lkcl at lkcl.net
Sat Nov 21 15:56:43 GMT 2020


On 11/21/20, Alexandre Oliva <oliva at gnu.org> wrote:

> Some more thoughts on this issue...

it's addictive, isn't it? :)

> Maybe having the compiler default to generating compressed code, if
> targeting a CPU that supports it, and relaxing the requirements only
> upon finding e.g. too much register pressure, would make the problem
> more approachable.

i believe the algorithms in RVC gcc to be simpler than that (although
likely still quite sophisticated): the target regs of C are those most
commonly used, and non-targeted ones simply automatically result in the
use of 32 bit instructions.
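
As a rough illustration of that selection rule (not the actual gcc implementation), the sketch below picks a 16-bit encoding only when every register operand falls inside a small "compressed" subset, and otherwise falls back to 32 bits. The names COMPRESSED_REGS, encoding_width and code_size_bits are hypothetical, and the subset x8..x15 is an assumption modelled on RVC's 3-bit register fields. It also ignores the fact that only some opcodes have a compressed form at all.

```python
# Hypothetical sketch: choose a 16-bit or 32-bit encoding per instruction.
# COMPRESSED_REGS is an assumption modelled on RVC's 3-bit register
# fields (x8..x15); it is not a Libre-SOC register allocation.
COMPRESSED_REGS = frozenset(range(8, 16))


def encoding_width(regs):
    """Return 16 if every register operand is in the compressed subset, else 32."""
    return 16 if all(r in COMPRESSED_REGS for r in regs) else 32


def code_size_bits(instructions):
    """Total size in bits of a sequence of instructions.

    Each instruction is represented only by a tuple of its register numbers.
    """
    return sum(encoding_width(regs) for regs in instructions)


if __name__ == "__main__":
    program = [(8, 9), (3, 4), (10, 11, 12)]
    print(code_size_bits(program), "bits compressed vs", 32 * len(program), "bits uncompressed")
```

The point of the sketch is that no register-pressure analysis is needed: the allocator just prefers the common registers, and anything else costs a full-width instruction.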

this is why RVC added LD/ST-stack opcodes: when the 32bit pattern
"LD RT, NN(r1)" is spotted, it is naively replaced with "c.ld RT NN".

basically, "get the job done, get some benefits, optimise later"
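
The naive replacement described above can be sketched as a textual peephole pass. This is only an illustration under stated assumptions: the function name compress_stack_load and the regex are hypothetical, the mnemonics are taken from the example in this message, and the sketch ignores any offset-range or alignment limits that a real compressed encoding would impose.

```python
import re

# Matches the 32-bit form "ld RT, NN(r1)", where r1 is the stack pointer.
# The assembly syntax here is an assumption based on the example pattern.
STACK_LOAD = re.compile(
    r"^\s*ld\s+(r\d+)\s*,\s*(-?\d+)\(r1\)\s*$", re.IGNORECASE
)


def compress_stack_load(line):
    """Rewrite a stack-relative load to its compressed form, else return it unchanged."""
    match = STACK_LOAD.match(line)
    if match is None:
        return line
    rt, offset = match.groups()
    return f"c.ld {rt} {offset}"


if __name__ == "__main__":
    for text in ("ld r5, 16(r1)", "ld r5, 16(r3)"):
        print(text, "->", compress_stack_load(text))
```

Loads relative to any register other than r1 pass through untouched, which matches the "spot the pattern, swap it, optimise later" approach.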

> OTOH, I realize that a static choice of opcodes for this compressed
> encoding, aiming to use 50%-smaller instructions as often as possible,
> can make for code that compresses down to no less than 50% of the
> original size, probably more like between 60% to 70% for any nontrivial
> program.

pretty much on the nail there.  however the effect is a *square* law
effect on power reduction that cascades through L1 to L2/L3, reduces
TLB and MMU thrashing, and more.

approximately: a reduction in code size of 25% gives a whopping *40*%
reduction in power consumption (0.75 squared is roughly 0.56).
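
The arithmetic behind that estimate can be checked with a few lines. The sketch assumes a pure square-law model (relative power equals the code-size ratio squared), which is the approximation claimed here rather than a measured result. The function names are hypothetical.

```python
def relative_power(code_size_ratio, exponent=2.0):
    """Relative power under an assumed power-law model (square law by default)."""
    return code_size_ratio ** exponent


def power_reduction_percent(code_size_reduction_percent):
    """Estimated power reduction (%) for a given code-size reduction (%)."""
    ratio = 1.0 - code_size_reduction_percent / 100.0
    return (1.0 - relative_power(ratio)) * 100.0


if __name__ == "__main__":
    for reduction in (25, 30, 40):
        print(f"{reduction}% smaller code -> "
              f"{power_reduction_percent(reduction):.2f}% less power (square-law model)")
```

For a 25% reduction the model gives 43.75%, consistent with the "approximately 40%" figure above.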

> Data compression algorithms, in turn, can shrink an executable, such as
> the ppc-gcc I used to test the 'objdump -M histogram' patch, down to
> some 30% of the original size.  Filesystems with transparent compression
> and MMUs can enable required pages to be decompressed on demand.  That
> doesn't help with the instruction cache,

L1 CAMs are one of the bigger consumers of power. that and regfiles.
ironic that moving data uses more power than actual computation.

>  but it does help even better
> than opcode compression in reducing permanent or volatile storage
> requirements.

the effect of latency on instruction and data compression should be
self-evident: efforts to add transparent L1/L2 cache compression at
the hardware level are complex enough to have been the subject of
entire dissertations.  we need to be pragmatic.



> marketplace of CPUs.  Is the data that pointed us this way readily
> available for my perusal, so that I can convince myself by studying it?

RISC-V RVC and other studies.  if you happen to find them please do
link them on the C wiki page.

l.


