[Libre-soc-dev] bigint-presentation reg alloc and cranelift's reg alloc
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Mon Jan 9 19:50:31 GMT 2023
On Monday, January 9, 2023, Jacob Lifshay via Libre-soc-dev <libre-soc-dev at lists.libre-soc.org> wrote:
> I started a discussion about how I used a technique (using the reg
> allocator to alloc stack slots) that some people in cranelift wanted to
> use, and ended up talking about register allocators and their relation to
> SVP64, since SVP64 requires allocating ranges of registers instead of just
> individual registers like nearly every other ISA:
yes, it's important to note that the technique of having a
bitmask that extends down to sub-portions of the register file
is *not* limited to SVP64.
ppc64 could (slightly, indirectly) benefit by virtue of
partial use of FP regs embedded in VSX regs.
AMDGPU could benefit because texturisation uses multiple
sequential registers (up to 12 iirc), and FP32/FP64 also share registers.
the same goes for any other architecture where the same registers
have multiple uses: if this discussion was taking place in the
late 90s i would say MMX was also important, although i don't
believe that had different lengths (or predication).
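as a rough illustration of the bitmask idea discussed above, here is a minimal sketch in python. it is not taken from bigint-presentation-code or cranelift: the function names, the notion of a "unit" (one bit per allocatable sub-portion, e.g. one 64-bit half of a 128-bit VSX register) and the 16-unit file size are all assumptions made for the example.

```python
from typing import Optional


def find_free_span(free_mask: int, length: int, total_units: int,
                   align: int = 1) -> Optional[int]:
    """Return the lowest aligned start of `length` contiguous free units.

    Bit i of `free_mask` is 1 when unit i is free. A "unit" is a
    hypothetical sub-portion of the register file (an assumption of
    this sketch), so one architectural register may span several bits.
    """
    span = (1 << length) - 1
    for start in range(0, total_units - length + 1, align):
        if (free_mask >> start) & span == span:
            return start
    return None  # no contiguous run of free units is large enough


def mark_used(free_mask: int, start: int, length: int) -> int:
    """Clear `length` bits from `start`, marking those units as allocated."""
    return free_mask & ~(((1 << length) - 1) << start)


TOTAL_UNITS = 16                      # illustrative size only
free = (1 << TOTAL_UNITS) - 1         # every unit starts out free
free = mark_used(free, 2, 1)          # a scalar occupies unit 2
vec_start = find_free_span(free, 4, TOTAL_UNITS)             # first fit: 3
aligned_start = find_free_span(free, 4, TOTAL_UNITS, align=4)  # aligned fit: 4
```

the `align` parameter is there because a wider value (a full vector register built from two halves, say) would normally have to start on a unit boundary, while a narrower value can sit in any free sub-portion.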
also it MIGHT be beneficial on architectures with predication,
where the hardware is smart enough to spot complementary
(inverted) predicate masks. this however is a big ask as
it requires hardware architects to give out information
that they normally only reveal under NDA. or, it needs
some reverse-engineering. if two successive identical operations
with inverted predicate masks complete quicker than the same
operations with overlapping predicates, you can deduce that
they have been macro-op fused together and thrown at the
same parallel backend hardware.
yes, this is something i have been mulling over for an
advanced SVP64 implementation for about 2 years but have
not shared publicly until now.
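
to make "complementary (inverted) predicate masks" concrete, here is a small python sketch. it only models the masks in software and says nothing about how any real hardware detects or fuses them; the function names and the 4-element vector length are assumptions for the example.

```python
from typing import Callable, List


def are_complementary(mask_a: int, mask_b: int, vector_length: int) -> bool:
    """True when the masks never overlap and together cover every element."""
    full = (1 << vector_length) - 1
    a = mask_a & full
    b = mask_b & full
    return (a & b) == 0 and (a | b) == full


def apply_predicated(op: Callable[[int], int], dest: List[int],
                     src: List[int], mask: int) -> List[int]:
    """Apply `op` only to elements whose predicate bit is set.

    Elements with a clear predicate bit keep their old `dest` value.
    """
    return [op(s) if (mask >> i) & 1 else d
            for i, (d, s) in enumerate(zip(dest, src))]


VL = 4                                # illustrative vector length
mask = 0b0101
inverted = ~mask & ((1 << VL) - 1)    # 0b1010

src = [1, 2, 3, 4]


def double(x: int) -> int:
    return x * 2


# two successive identical operations under inverted masks...
step1 = apply_predicated(double, [0] * VL, src, mask)
step2 = apply_predicated(double, step1, src, inverted)
# ...touch every element exactly once, like one unpredicated operation.
unpredicated = [double(x) for x in src]
```

because the two masks are disjoint and cover the whole vector, `step2` matches `unpredicated`, which is the property that would let the pair be treated as a single full-width operation.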
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68