[Libre-soc-dev] Compressed instructions

Sat Nov 21 15:16:16 GMT 2020

On Nov 21, 2020, Alexandre Oliva <oliva at gnu.org> wrote:

> I don't wish to come across as too negative, but it makes me wonder if
> the potential savings this feature will enable won't just go to waste
> because the tools won't be able to take advantage of them.  I wonder if
> it wouldn't make more sense to save this idea for a future development,
> rather than in the critical path for the very first product.  I can see,
> however, how much of a breakthrough it can be, and how compelling it can
> make the processor, if the potential is realized.

Some more thoughts on this issue...

Maybe having the compiler default to generating compressed code, if
targeting a CPU that supports it, and relaxing the requirements only
upon finding e.g. too much register pressure, would make the problem
more approachable.

OTOH, I realize that a static choice of opcodes for this compressed
encoding, aiming to use 50%-smaller instructions as often as possible,
can make for code that compresses down to no less than 50% of the
original size, and probably more like 60% to 70% for any nontrivial
program.
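
To make the arithmetic above concrete, here is a minimal sketch.  The
function name and the assumption that switching between the 16-bit and
32-bit encodings costs nothing are mine, purely for illustration; any
real mode-switching overhead would push the numbers higher.

```python
def compressed_size_ratio(fraction_16bit: float) -> float:
    """Code size relative to an all-32-bit encoding.

    fraction_16bit is the share of instructions that fit the 16-bit form.
    Assumption (illustrative only): switching encodings has no size cost.
    """
    if not 0.0 <= fraction_16bit <= 1.0:
        raise ValueError("fraction_16bit must be between 0 and 1")
    # Each 16-bit instruction saves half of a 32-bit slot.
    return 1.0 - fraction_16bit / 2.0


for share in (1.0, 0.8, 0.6):
    ratio = compressed_size_ratio(share)
    print(f"{share:.0%} of insns compressed -> {ratio:.0%} of original size")
```

Under that simplification, landing between 60% and 70% of the original
size means 60% to 80% of the instructions would need to use the 16-bit
form.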

Data compression algorithms, in turn, can shrink an executable, such as
the ppc-gcc I used to test the 'objdump -M histogram' patch, down to
some 30% of the original size.  Filesystems with transparent compression
and MMUs can enable required pages to be decompressed on demand.  That
doesn't help with the instruction cache, but it reduces permanent and
volatile storage requirements even more than opcode compression does.
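
For anyone wanting to repeat that kind of measurement, a rough sketch
using Python's standard zlib and lzma modules follows.  This is not the
procedure I used for the figure above; the function name is made up for
illustration, and the ratio will vary with the compressor and the binary.

```python
import lzma
import zlib


def compression_ratios(data: bytes) -> dict:
    """Return compressed size / original size for two stock compressors."""
    if not data:
        raise ValueError("data must not be empty")
    return {
        # zlib at its highest compression level (9)
        "zlib": len(zlib.compress(data, 9)) / len(data),
        # lzma with default settings (xz container)
        "lzma": len(lzma.compress(data)) / len(data),
    }
```

Reading an executable with open(path, "rb").read() and passing the bytes
to compression_ratios() gives the whole-file ratio, which is the number
to compare against what an opcode-level encoding could achieve.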

As portable devices carry tens or even hundreds of GBs of storage, and
often support transparent compression and decompression, I'm not so sure
how much benefit a compressed instruction encoding will bring there; in
theory, the 16-bit insns carry no less information than the equivalent
32-bit ones, so a perfect data compression model would bring them down
to about the same size in storage, and we would go through transparent
decompression anyway.

Since we've decided to go ahead with it, I suppose we have a strong
sense that the above doesn't quite make up for it, and that a more
compact code representation can still be a significant differentiator
in the marketplace of CPUs.  Is the data that pointed us this way
readily available for my perusal, so that I can convince myself by
studying it?

Thanks,

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist         GNU Toolchain Engineer
        Vim, Vi, Voltei pro Emacs -- GNUlius Caesar