[libre-riscv-dev] GPU design

Luke Kenneth Casson Leighton lkcl at lkcl.net
Tue Dec 4 15:21:54 GMT 2018

also - sorry, this is occurring to me quite quickly now, while exploring
the ROB bit-per-byte "predication" augmentation of the destination ROB
field: i believe it may be possible to group SIMD results together, as
long as it's permitted to specify more than one ROB entry on the CDB.

it goes like this:

ROB #1: dest=r5 bytemask=0b11000011 ADD SIMD 16-bit
ROB #2: dest=r9 bytemask=0b00110000 ADD SIMD 16-bit

that will obviously have _two_ reservation station entries in the ADD
FPU queue.  note that the bytemasks do *not* overlap.  so, what's to
stop the two operations being executed by the same ALU, at the exact
same time? absolutely nothing... as long as the *two* dests
(r5/0b11000011 and r9/0b00110000) are passed through the ALU and
broadcast on the CDB.
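
a rough sketch of the check described above, in python, purely as an
illustration. everything in it is an assumption and not part of any actual
design: the names (RobEntry, can_share_alu, lanes_16bit, cdb_broadcast) are
made up, registers are assumed to be 64-bit (8 bytes), and bit i of the
bytemask is assumed to mean "byte i of dest is written", with bit 0 as the
lowest byte.

```python
from dataclasses import dataclass


@dataclass
class RobEntry:
    """Hypothetical ROB entry: id, destination register, per-byte write mask."""
    rob_id: int
    dest: str
    bytemask: int  # assumption: bit i set => byte i of the 64-bit dest is written


def can_share_alu(a: RobEntry, b: RobEntry) -> bool:
    """Two SIMD ops can use the same ALU pass only if their bytemasks are disjoint."""
    return (a.bytemask & b.bytemask) == 0


def lanes_16bit(bytemask: int) -> list:
    """Return the 16-bit lanes (0..3) whose two bytes are both enabled in the mask."""
    return [lane for lane in range(4)
            if (bytemask >> (2 * lane)) & 0b11 == 0b11]


def cdb_broadcast(entries):
    """Combine non-overlapping entries into one broadcast.

    Returns the combined bytemask plus one (rob_id, dest, bytemask) tag per
    entry: these tags are what would have to travel on the CDB together.
    """
    combined = 0
    for e in entries:
        if combined & e.bytemask:
            raise ValueError("overlapping bytemasks cannot share one ALU pass")
        combined |= e.bytemask
    return combined, [(e.rob_id, e.dest, e.bytemask) for e in entries]


# the two entries from the example above
rob1 = RobEntry(1, "r5", 0b11000011)
rob2 = RobEntry(2, "r9", 0b00110000)

if can_share_alu(rob1, rob2):
    combined, tags = cdb_broadcast([rob1, rob2])
    print(f"combined mask {combined:#010b}, {len(tags)} dests on the CDB")
```

with these assumptions ROB #1 occupies 16-bit lanes 0 and 3, ROB #2 occupies
lane 2, and lane 1 is idle, so one pass through the ALU covers both, at the
cost of carrying two (ROB, dest, bytemask) tags on the CDB.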

what i wondered was whether it would be possible to reduce the amount
of information passed on the CDB.

