[Libre-soc-dev] effect of more decode pipe stages on hardware requirements for execution resources for OoO processors

Jacob Lifshay programmerjake at gmail.com
Wed Feb 16 01:09:18 GMT 2022

On Tue, Feb 15, 2022, 16:41 Luke Kenneth Casson Leighton <lkcl at lkcl.net> wrote:

> On Wed, Feb 16, 2022 at 12:15 AM Jacob Lifshay <programmerjake at gmail.com>
> wrote:
> 1) the decode stage takes 8 stages and the execution only 1.
> 2) the decode stage is 1 clock and execution only 1.

ok, done, under the "8 decode stages, 8 wide" and "1 decode stage, 8 wide"


making it 8-wide exposes the loop-carried dependencies on ctr and the
address in r3, making the loop max out at 4 instructions per cycle despite
the larger fetch bandwidth.

as I stated earlier, and as you can see from inspecting the tables, 1 vs. 8
decode stages still doesn't affect it.
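to illustrate why decode depth drops out, here is a rough toy model in python. it is not the simulator linked in this thread, just a sketch: the function name, the 4-instruction loop body and the 1-cycle latency on the loop-carried update are all assumptions picked so the numbers land on the 4 instructions per cycle limit mentioned above.

```python
def cycles_to_finish(iterations, instrs_per_iter, fetch_width,
                     decode_stages, carried_latency=1):
    """Toy model (hypothetical, not power-cpu-sim): return the cycle at
    which the loop-carried value of the last iteration becomes available.

    Assumptions: instructions are fetched in order, fetch_width per cycle;
    each iteration has one loop-carried update (e.g. a ctr or address
    increment) that needs the previous iteration's result; everything
    else has unlimited execution resources.
    """
    ready = 0  # cycle when the previous iteration's carried value is ready
    for i in range(iterations):
        # position in fetch order of the last instruction of iteration i
        last_idx = (i + 1) * instrs_per_iter - 1
        fetch_cycle = last_idx // fetch_width
        # it can issue once it has left decode AND its input is ready
        issue = max(fetch_cycle + decode_stages, ready)
        ready = issue + carried_latency
    return ready


if __name__ == "__main__":
    n = 1000
    for width, stages in [(8, 1), (8, 8), (4, 1)]:
        total = cycles_to_finish(n, 4, width, stages)
        print(f"{width}-wide, {stages} decode stage(s): "
              f"{total} cycles, {n * 4 / total:.3f} instrs/cycle")
```

in this model the extra decode stages only add a fixed startup delay, so the steady-state rate is set by the loop-carried chain (one iteration per cycle under the assumed latency), and going from 4-wide to 8-wide fetch doesn't raise it.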

btw i'm generating the tables using https://ftp.libre-soc.org/power-cpu-sim/
in markdown output mode (add --help to the command line to see options) and
manually editing decode pipe stages into the tables, then reformatting with

