[Libre-soc-dev] effect of more decode pipe stages on hardware requirements for execution resources for OoO processors

lkcl luke.leighton at gmail.com
Wed Feb 16 10:31:49 GMT 2022


On Wed, Feb 16, 2022 at 6:02 AM Jacob Lifshay <programmerjake at gmail.com> wrote:

> yup, that's obviously wrong imho, since those reservation stations aren't required by instructions in the decode pipe, so having more instructions in the decode pipe doesn't require more reservation stations.

it isn't.

> that's true, but only for the stuff after the fetch/decode pipe, everything in the fetch/decode pipe doesn't (or at least usually shouldn't) use any RSs or DMs.

it's totally relevant! you cannot focus on one small part of the
system then claim that its implementation has no effect on the whole
system!

> now it runs until it runs out of hardware registers in the rename stage (equivalent of running out of reservation stations), since I only gave it 32 registers. it puts the stalled instructions into a FIFO queue between the decode and rename stages (quirk of my program, i don't want to take the few hours to fix it to stall fetch/decode) -- if you like, you can mentally adjust what would happen if fetch/decode stalled instead of filling the queue.

no: the specific circumstances being tested *are* precisely to
log-jam the reservation stations.  i have repeated this four times in
about 45 minutes yesterday, and again today.

let us start again.

let us use mathematical notation

"For the infinite set of all possible instruction sequences, there
exists a sequence of 40 instructions such that each instruction has a
RaW-WaR hazard on the one before it (39 hazards in total, one between
each consecutive pair), such that 40 RSes *are* required to hold the
full chain"

let us call that Chain40

the *actual* instructions within Chain40 are completely irrelevant.
it is the fact that there *is* a chain that is the sole exclusive
critical fact.  please do not place or create barriers or argue with
the fact that such a chain exists, nor argue or advocate any
additional circumstances which make Chain40 a non-possibility.

now let us also create some additional groups:

"For the infinite set of all possible instruction sequences, there
exist 40 instruction sequences of length 1 (one), such that they have
no Hazards at all onto Chain40 *and* have no Hazards with each other"

let us call those NonChain1...NonChain40

so:

* there are 40 instructions linked into a chain by 39 hazards (one
between each consecutive pair), called Chain40
* there are 40 instructions with *no* Hazards either on each other or
with Chain40, called NonChain1...NonChain40
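
to make the two groups concrete, here is a minimal sketch of one way to
construct them and check the hazard counts. everything in it is an
assumption for illustration only: the Instr record, the helper names,
the unbounded register numbering, and modelling each chain link as a
plain RaW dependency. the actual instructions remain irrelevant; only
the dependency structure matters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: one destination, some sources."""
    name: str
    dest: int      # register number written
    srcs: tuple    # register numbers read


def raw_hazard(earlier, later):
    """True if `later` reads the register that `earlier` writes."""
    return earlier.dest in later.srcs


def any_hazard(a, b):
    """True if a and b have a RaW, WaR or WaW hazard in either order."""
    return (a.dest in b.srcs) or (b.dest in a.srcs) or (a.dest == b.dest)


def make_chain(n):
    # instruction i reads r(i) and writes r(i+1), so each instruction
    # depends on the result of its predecessor
    return [Instr(f"chain{i + 1}", dest=i + 1, srcs=(i,)) for i in range(n)]


def make_nonchain(n, base):
    # each instruction reads and writes registers that nothing else
    # touches (assumed register numbers, starting at `base`)
    return [Instr(f"nonchain{i + 1}", dest=base + 2 * i,
                  srcs=(base + 2 * i + 1,)) for i in range(n)]


chain40 = make_chain(40)
nonchain = make_nonchain(40, base=100)

# hazards between each consecutive pair in the chain
chain_links = sum(raw_hazard(a, b) for a, b in zip(chain40, chain40[1:]))

# hazards among the NonChain instructions themselves
nonchain_internal = sum(any_hazard(a, b)
                        for i, a in enumerate(nonchain)
                        for b in nonchain[i + 1:])

# hazards from any NonChain instruction onto any Chain40 instruction
nonchain_on_chain = sum(any_hazard(c, n) for c in chain40 for n in nonchain)

print(chain_links, nonchain_internal, nonchain_on_chain)
```

the program to be executed in the thought experiment is then simply
chain40 + nonchain, in that order.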

now let us define the hardware:

* let the pipeline depth be 2 for ALL instructions
* let the instructions to be executed be: Chain40 followed by
NonChain1...NonChain40
* let us assume 100-way multi-issue (please do not argue that this is
impractical at this point in time)

QUESTION: how many Reservation Stations are required to ensure that an
issue-stall does not occur?

now let us change one parameter:

* let the pipeline depth increase to 10

QUESTION: what effect does this have on the number of Reservation
Stations required?

l.


