[libre-riscv-dev] gflops

Jacob Lifshay programmerjake at gmail.com
Sun Jul 28 14:02:30 BST 2019


On Sun, Jul 28, 2019, 05:55 Jacob Lifshay <programmerjake at gmail.com> wrote:

>> if we want to save area (which I think will probably not be necessary),
>
> if the 2-stages per pipeline stage ends up killing our clock frequency, we
> could go with 1 radix-16 stage per pipeline stage (8 or 9 stages) and maybe
> the below option to reduce the pipeline length (4 or 5 stages, but half the
> div pipe throughput for >= 32-bit)
>
>> we could shrink the div pipe stage count by doubling the number of times
>> fp32 and fp64 need to go through the pipeline to 2 and 4 times respectively:
>> fp16: 24flops/clock/core -- 76.8gflops at 800MHz
>> fp32: 10flops/clock/core -- 32gflops at 800MHz
>> fp64: 4.5flops/clock/core -- 14.4gflops at 800MHz
>
>
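for reference, the quoted gflops figures are what you get from
flops/clock/core x cores x clock (GHz). the core count is not stated in this
message; 4 cores is an assumption, inferred from 76.8 / (24 x 0.8). a minimal
sketch of that arithmetic (names are illustrative only):

```python
# Sketch: reproduce the quoted throughput figures.
CLOCK_GHZ = 0.8  # 800MHz, as stated in the quoted text
CORES = 4        # ASSUMPTION: not stated in this message; inferred from the numbers

# flops/clock/core values taken from the quoted text
FLOPS_PER_CLOCK_PER_CORE = {"fp16": 24.0, "fp32": 10.0, "fp64": 4.5}


def gflops(flops_per_clock_per_core, cores=CORES, clock_ghz=CLOCK_GHZ):
    """Peak gflops = flops/clock/core * number of cores * clock in GHz."""
    return flops_per_clock_per_core * cores * clock_ghz


for fmt, per_clock in FLOPS_PER_CLOCK_PER_CORE.items():
    print(f"{fmt}: {gflops(per_clock):.1f} gflops at 800MHz")
```

if the core count is different, only CORES needs to change; the per-clock
figures are independent of it.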
clarifying: separate option from above

> one potential option is to have the div pipe normally use 2 stages per
> pipeline stage but to have (boot-time configured or at least requires a
> pipeline flush to switch) muxes to insert pipeline registers between
> compute stages to allow much higher frequencies (maybe 2GHz? -- not low
> power mode). we would still have the same number of reservation stations,
> so the pipeline utilization wouldn't ever reach 100%, but it seems like a
> very simple addition that would eliminate the main culprit for clock rate
> limitations.
>
> Jacob Lifshay
>
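a rough way to see the utilization point above: if each reservation station
can have at most one op in flight, the busy fraction of the pipeline is
bounded by stations / depth. this is a toy model, not a description of the
actual design, and the station and stage counts below are hypothetical,
chosen only for illustration:

```python
def max_utilization(reservation_stations, pipeline_depth):
    """Upper bound on the fraction of pipeline slots that can be busy,
    ASSUMING each reservation station has at most one op in flight."""
    return min(1.0, reservation_stations / pipeline_depth)


# HYPOTHETICAL numbers, for illustration only.
stations = 4
normal_depth = 4                # 2 compute stages per pipeline stage
split_depth = 2 * normal_depth  # a register inserted between each pair of compute stages

print(max_utilization(stations, normal_depth))  # bound with the normal configuration
print(max_utilization(stations, split_depth))   # bound with the extra registers muxed in
```

under this model the bound halves when the depth doubles, so whether the
high-frequency mode is a net win depends on how much the clock actually rises.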

