[Libre-soc-dev] v3.1B prefix

Jacob Lifshay programmerjake at gmail.com
Mon Dec 7 10:08:01 GMT 2020

On Sun, Dec 6, 2020, 23:53 Alexandre Oliva <oliva at gnu.org> wrote:

> Even with all these opcodes put in, the compression rate for gcc's text
> segment was less than 3%.  It's not looking incredibly compelling :-(

Was that just the driver or cc1plus? (iirc the driver just runs other
programs but doesn't do a whole lot itself, so may be less representative
of code)

I'd guess your program may be unintentionally broken, though I didn't find
any major flaws from my initial read-through.

Some stats for a slightly modified 10-bit encoding:

I wrote a bash script:
(objdump -M raw --no-show-raw-insn -d --section=.text /bin/python3 |
sed 'y/\t/ /; s/^.*: *//;
s/  */ /g; /^$/d; /\.long/d; s/ori *r0, *r0, *0/nop/; s/(r1)/(r@1)/g;
s/(r@1)/(r1)/g; s/\(ld\|std\) *r3,
s/\([, ]\)\([0-3]\|-[1-4]\)/\11/g; s/^@ *\([a-z][^ ]*\) .*/\1/' |
tee >(wc >&2) | sort | uniq --count | sort -n) |& less

821272 1737711 16873249
   7656 bclr 10,lt,1
   7930 mfspr r3,8
   9309 mtspr 8,r3
  10327 std-stack
  11519 addi r9,r9,1
  13806 cmpi cr3,1,r9,1
  14765 addi r3,1,1
  14891 ld-stack
  17360 mr r3,r3
  20675 cmpi cr3,1,r3,1
  46688 nop

Assuming all nops are required for ABI purposes and can't be compressed,
some of the top instructions that would easily fit in the 10-bit format are:
mr r3, r3 ; 3 bits for each reg field
ld-stack ; 3 bits for dest reg field and 3 bits for immediate
addi r3,1,1 ; really addi r3,0,immed -- 3 bits for reg and 3 bits for immediate
std-stack ; 3 bits for src reg field and 3 bits for immediate
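
To make the field widths concrete, here is a rough Python sketch of how four
such instructions could share one layout. The bit order, the opcode numbering
and the meaning of each 3-bit value (which eight registers or immediates it
selects) are all hypothetical; the only thing taken from the message is that
each instruction needs two 3-bit fields, plus a 2-bit selector to tell the
four apart.

```python
# Hypothetical layout: [opcode:2][field_a:3][field_b:3] = 8 bits.
# The numbering below is made up for illustration only.
OPCODES = {"mr": 0, "ld-stack": 1, "addi": 2, "std-stack": 3}
NAMES = {value: name for name, value in OPCODES.items()}


def pack(op, field_a, field_b):
    """Pack one instruction into 8 bits; each field must fit in 3 bits."""
    if not (0 <= field_a < 8 and 0 <= field_b < 8):
        raise ValueError("each field must fit in 3 bits")
    return (OPCODES[op] << 6) | (field_a << 3) | field_b


def unpack(word):
    """Inverse of pack(): return (op, field_a, field_b)."""
    return NAMES[word >> 6], (word >> 3) & 0b111, word & 0b111


if __name__ == "__main__":
    word = pack("addi", 3, 1)
    print(f"{word:08b}", unpack(word))
```

Anything whose operands fall outside the eight values a field can name would
have to stay in the normal 32-bit form.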

putting just those 4 together:
(17360+14891+14765+10327)/821272=0.0698=6.98% of all instructions
Since the 10-bit format works absolutely everywhere, and each converted
instruction takes half the space, we get a guaranteed size reduction of at
least 6.98%/2=3.49%. That uses just 1/4 of the 10-bit space (2x 3-bit
fields + 2 bits for opcode) and none of the 16-bit-only space.
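
The arithmetic above can be rechecked with a few lines of Python. The counts
are copied from the histogram. The halving step assumes each converted
instruction takes half the space of a normal one, which is inferred from the
division by 2 rather than stated outright.

```python
# Counts copied from the histogram in this message.
total_insns = 821272  # line count reported by wc
candidates = {
    "mr r3,r3": 17360,
    "ld-stack": 14891,
    "addi r3,1,1": 14765,
    "std-stack": 10327,
}

covered = sum(candidates.values())
fraction = covered / total_insns

# Assumption: a converted instruction is half the size of a normal one.
size_reduction = fraction / 2

# Each candidate uses two 3-bit fields, i.e. 2**6 encodings, out of 2**10.
space_used = len(candidates) * 2**6 / 2**10

print(f"{covered} of {total_insns} instructions = {fraction:.2%}")
print(f"size reduction of at least {size_reduction:.2%}")
print(f"fraction of 10-bit space used: {space_used}")
```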

