Luke Kenneth Casson Leighton:
>     r0 += r1; \
>     r2 ^= r0; \
>     r2 = ROL32(r2, r3);

these become the following instructions (the register numbers indicate
which REMAP source each operand needs):

    add r0, r0, r1            RT, RA, RB
    xor r2, r2, r0            RT, RA, RB
    rlwnm r2, r2, r3, 0, 31   rlwnm RA,RS,RB,MB,ME (Rc=0)

however the same operand slots (RT, RA, RB) need different REMAP sources
from one instruction to the next, with the exception of RS, meaning
an svremap is needed before each operation. slightly annoying
but doable.

example: RT and RA are REMAP0 in the add, but RT must be REMAP2 in the xor.
by swapping the two source operands of the add (addition is commutative)
some of the settings line up from one instruction to the next, reducing
the number of svremaps:

    svremap RT=0,RA=1,RB=0
    add r0, r1, r0            RT, RA, RB
    svremap RT=2,RA=2,RB=0    # RB stays = 0
    xor r2, r2, r0            RT, RA, RB
    svremap RS=2,RA=2,RB=3    # RA stays = 2
    rlwnm r2, r2, r3, 0, 31   rlwnm RA,RS,RB,MB,ME (Rc=0)
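
as a sanity check, here is a minimal C sketch of the reordered sequence.
it is only a scalar model, not SVP64: the array r[0..3] stands in for
whatever REMAP0..REMAP3 would select, and the helper names (op_add, op_xor,
op_rlwnm, step_original, step_remapped) are made up for illustration.
the operand indices passed to each helper mirror the svremap settings above.

```c
#include <assert.h>
#include <stdint.h>

#define NREGS 4u /* r0..r3, one per REMAP source (modelling assumption) */

/* 32-bit rotate left; like rlwnm, only the low 5 bits of n are used */
static uint32_t rol32(uint32_t v, uint32_t n)
{
    n &= 31;
    return (v << n) | (v >> ((32 - n) & 31));
}

/* the original C sequence from the quoted code */
static void step_original(uint32_t r[NREGS])
{
    r[0] += r[1];
    r[2] ^= r[0];
    r[2] = rol32(r[2], r[3]);
}

/* hypothetical helpers: each takes the REMAP number chosen per operand */
static void op_add(uint32_t *r, unsigned rt, unsigned ra, unsigned rb)
{
    assert(rt < NREGS && ra < NREGS && rb < NREGS);
    r[rt] = r[ra] + r[rb];
}

static void op_xor(uint32_t *r, unsigned rt, unsigned ra, unsigned rb)
{
    assert(rt < NREGS && ra < NREGS && rb < NREGS);
    r[rt] = r[ra] ^ r[rb];
}

/* rlwnm RA,RS,RB,0,31: with MB=0, ME=31 the mask keeps all 32 bits */
static void op_rlwnm(uint32_t *r, unsigned ra, unsigned rs, unsigned rb)
{
    assert(ra < NREGS && rs < NREGS && rb < NREGS);
    r[ra] = rol32(r[rs], r[rb]);
}

/* the reordered sequence, one call per svremap + instruction pair */
static void step_remapped(uint32_t r[NREGS])
{
    op_add(r, 0, 1, 0);   /* svremap RT=0,RA=1,RB=0 */
    op_xor(r, 2, 2, 0);   /* svremap RT=2,RA=2,RB=0 */
    op_rlwnm(r, 2, 2, 3); /* svremap RS=2,RA=2,RB=3 */
}
```

running step_original and step_remapped on copies of the same four values
should leave both arrays identical, since the only change is the order of
the add's two source operands.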

