[libre-riscv-dev] [hw-dev] Re: 6600-style out-of-order scoreboard designs (ariane)

Samuel Falvo II sam.falvo at gmail.com
Wed May 22 17:12:38 BST 2019


I'm terribly sorry for digging far back into the history; I've been trying
to bring myself up to speed on scoreboards and such, and I'm lagging far
behind, in part because I never understood them to begin with and am new to
the technique, but also because I've just returned from a week of vacation.

(And I already need another one, but I digress.)

On Thu, May 16, 2019 at 10:58 PM Luke Kenneth Casson Leighton <lkcl at lkcl.net>
wrote:

> the FU-FU dependency matrix effectively has a *per FU* variant of the
> [global] read-pend and write-pend vectors which, due to them having
> their own latches, take a snapshot of the read/write-pending state *at
> that specific time*, for *that specific FU*.
>

This is an interesting interpretation which I'll need to meditate on to
fully understand; however, this is the first time I've had any insight
into why the FU-FU matrix is necessary.  This has been a huge
stumbling block for me, as up until the example with ADD r7, r7, ... was
given, I'd not understood why it was needed in the first place.  Thank you
for framing it in these terms.

Let me see if I understand correctly:

* the value proposition of the FU-FU matrix is *not* for the benefit of the
issue logic, but for the individual FUs themselves.  If the global vectors
all show R7 to be clear, then ADD R7,R7,R4 will happily issue; once issued,
however, now there's a self-inflicted deadlock on R7, preventing the state
machine from issuing GO_READ since it's waiting for something (anything!)
to write to R7.  But since there's a write reservation on R7, the issue
logic will never permit a writer to R7 to be issued.  With the FU-FU
matrices, the FU's GO_READ logic only concerns itself with its own cached
write reservation vector, which in this case will *not* have R7 set (since
it's a copy of what existed *before* issue).

* the global reservation vectors are *column-wise summations* of the
individual read/write FU-FU matrices, giving the issue logic the summary of
what's happening as of instruction issue time.  Because of this, they are
good enough to determine when to issue to another FU and when not to.  As it
happens, this is about the only thing they're really good for.  All other
timing generation really depends on the FU-FU matrix.

* the individual FUs are only concerned with *their own row of the FU-FU
matrix* and are oblivious to all other rows, controlling timing and
arbitration to the common register file bus(es) based on that stored value
(a copy of the global vectors *as of when the instruction was issued*).

* Issue logic preventing issue when a (global!) write-after-write situation
is detected ensures the FU-FU matrix effectively represents a DAG,
preventing deadlock of the whole mechanism.  Put another way, it ensures
that for any bit in the FU-FU write reservation bits, one and only one row
with that bit set exists.
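
To make the points above concrete, here is a toy Python sketch of how I
currently picture it.  This is only my mental model under the framing in
this thread, not anyone's actual scoreboard code: every name in it
(Scoreboard, FU, wr_snapshot, go_read_naive, go_read_snapshot, write_back)
is hypothetical, register sets are plain integers used as bit vectors, and
I assume a snapshot bit is cleared when the FU holding that write
reservation writes back.

```python
# Toy model of the per-FU "snapshot" idea. All names are hypothetical.
# A register set is an int used as a bit vector: bit r set <=> register r.

def reg_bit(r):
    return 1 << r


class FU:
    def __init__(self, name):
        self.name = name
        self.busy = False
        self.dest = 0         # write reservation held by this FU
        self.src = 0          # registers this FU must read
        self.wr_snapshot = 0  # this FU's row: write-pending latched at issue


class Scoreboard:
    def __init__(self, names):
        self.fus = [FU(n) for n in names]

    def global_write_pending(self):
        # Column-wise OR over the write reservations of all busy FUs.
        pending = 0
        for fu in self.fus:
            if fu.busy:
                pending |= fu.dest
        return pending

    def can_issue(self, dest):
        # Issue is refused on a (global) write-after-write hazard.
        return not (self.global_write_pending() & reg_bit(dest))

    def issue(self, fu, dest, srcs):
        assert not fu.busy and self.can_issue(dest)
        # Latch the global state as it was *before* this instruction issued.
        fu.wr_snapshot = self.global_write_pending()
        fu.dest = reg_bit(dest)
        fu.src = 0
        for r in srcs:
            fu.src |= reg_bit(r)
        fu.busy = True

    def go_read_naive(self, fu):
        # Consults the live global vector, so the FU sees its own reservation.
        return not (fu.src & self.global_write_pending())

    def go_read_snapshot(self, fu):
        # Consults only this FU's own latched row.
        return not (fu.src & fu.wr_snapshot)

    def write_back(self, fu):
        # Assumption: completing a write clears that bit in every row.
        for other in self.fus:
            other.wr_snapshot &= ~fu.dest
        fu.busy = False
        fu.dest = 0
        fu.src = 0
```

Issuing ADD R7,R7,R4 to the first FU with issue(fu, 7, [7, 4]) leaves
go_read_naive false forever (the self-inflicted deadlock), while
go_read_snapshot is true because the latched row predates the FU's own
reservation on R7.  A later reader of R7 on another FU stays blocked until
write_back is called on the first one.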

Is this a correct understanding?

Thanks for your patience.

-- 
Samuel A. Falvo II

