[libre-riscv-dev] GPU design
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Tue Dec 4 15:21:54 GMT 2018
also - sorry, this is occurring to me quite quickly, now, exploring
the ROB bit-per-byte "predication" augmentation of the destination ROB
field: i believe it may be possible to group SIMD results together, as
long as it's permitted to specify more than one ROB entry on the CDB.
it goes like this:

    ROB #1: dest=r5 bytemask=0b11000011 ADD SIMD 16-bit
    ROB #2: dest=r9 bytemask=0b00110000 ADD SIMD 16-bit

that will obviously have _two_ reservation station entries in the ADD
FPU queue. note that the bytemasks do *not* overlap. so, what's to
stop the two operations being executed by the same ALU, at the exact
same time? absolutely nothing... as long as the *two* dests
(r5/0b11000011 and r9/0b00110000) are passed through the ALU and
broadcast on the CDB.
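
a rough software model of the idea above, as a sketch only and not the actual hardware design. everything named here (RobEntry, expand_bytemask, simd_add16, issue_together) is hypothetical, and a 64-bit register (8 bytes, hence the 8-bit bytemask) is an assumption. it packs the operands of both ROB entries into one ALU pass, refuses to do so if the bytemasks overlap, and then returns one CDB record per ROB entry:

```python
from dataclasses import dataclass

REG_BYTES = 8    # assumption: 64-bit registers, one bytemask bit per byte
LANE_BITS = 16   # "ADD SIMD 16-bit" from the example above


@dataclass
class RobEntry:
    # hypothetical layout of a ROB entry, for illustration only
    tag: int        # ROB number
    dest: str       # destination register name
    bytemask: int   # bit-per-byte "predication" mask
    src_a: int      # first operand (full register value)
    src_b: int      # second operand (full register value)


def expand_bytemask(bytemask):
    """Expand a bit-per-byte mask into a bit-level mask over the register."""
    mask = 0
    for byte in range(REG_BYTES):
        if (bytemask >> byte) & 1:
            mask |= 0xFF << (8 * byte)
    return mask


def simd_add16(a, b):
    """Independent 16-bit adds; carries never cross a lane boundary."""
    result = 0
    for shift in range(0, REG_BYTES * 8, LANE_BITS):
        lane_sum = ((a >> shift) + (b >> shift)) & 0xFFFF
        result |= lane_sum << shift
    return result


def issue_together(entries):
    """Run several ROB entries through one ALU pass.

    Returns the list of (tag, dest, bytemask, data) records that would be
    broadcast on the CDB, one per ROB entry.
    """
    combined = 0
    a = b = 0
    for e in entries:
        if combined & e.bytemask:
            raise ValueError("bytemasks overlap: cannot share the ALU")
        combined |= e.bytemask
        m = expand_bytemask(e.bytemask)
        a |= e.src_a & m    # only the lanes this entry owns are packed in
        b |= e.src_b & m
    result = simd_add16(a, b)
    return [(e.tag, e.dest, e.bytemask, result & expand_bytemask(e.bytemask))
            for e in entries]


rob1 = RobEntry(1, "r5", 0b11000011, 0x0001_0000_0000_FFFF, 0x0002_0000_0000_0001)
rob2 = RobEntry(2, "r9", 0b00110000, 0x0000_0010_0000_0000, 0x0000_0020_0000_0000)
cdb = issue_together([rob1, rob2])
```

note that each CDB record in this sketch still carries its own tag, dest and bytemask alongside the data.
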
what i wondered was whether it would be possible to reduce the amount
of information passed on the CDB?