[libre-riscv-dev] pipeline stages controlling delays

Fri Apr 5 09:01:53 BST 2019

On Fri, Apr 5, 2019 at 7:01 AM Jacob Lifshay <programmerjake at gmail.com> wrote:
>
> On Thu, Apr 4, 2019, 22:30 Luke Kenneth Casson Leighton <lkcl at lkcl.net>
> wrote:
>
> > On Fri, Apr 5, 2019 at 6:06 AM Jacob Lifshay <programmerjake at gmail.com>
> > wrote:
> > > that's why I had put the control signals as members of Stage.
> >
> > ... which isn't needed in the majority of cases, hence why i separated
> > them.
>
> which is why I made CombStage, to handle all the boilerplate code needed
> for the most common cases.
>
> > plus, if they're part of the Stage, the temptation will be to
> > override the *entirety* of the logic (or, worse *require* overriding
> > all of the logic).
> >
> which is entirely feasible in the code I wrote because of separation of
> concerns between RegStage and CombStage, so every Stage instance has a
> uniform ready/valid interface, whereas BufferedPipeline and
> UnbufferedPipeline make Stage have an interface that changes semantics
> depending on how it's used.

 that's a mistaken impression: in the pipeline API, a Stage is
entirely independent of how it's used, and there are five ways:

 * StageChain
 * BufferedPipeline
 * UnbufferedPipeline
 * CombMultiInPipe
 * CombMultiOutPipe

as the Stage API is entirely independent of the way it's used, there
are *additional* options here as well which may be added at some
point:

 * FSM (state machine)
 * John Dawson style STB/ACK
 * Global Valid
 * "Travelling CE"

the latter two are described here
https://zipcpu.com/blog/2017/08/14/strategies-for-pipelining.html

*none* of those require the Stage to know *anything* about what it is
being deployed in.  at all!

so it's a mistake to believe that the interface to Stage has to change
depending how it's used: the stage doesn't change *at all*.  all it is
responsible for is "creating the next bit of data [in the specified
output format] from the given input [in the specified input format]".

that's its sole exclusive responsibility.

(or, it was: now it is also responsible for dynamically indicating
whether it is ready to receive data, and whether its output is valid.
however, again, that is related to the *data*, which has nothing at
all to do with the *transfer or handling* of that data).
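
 (an illustrative sketch of the separation described above, in plain
Python rather than any HDL library.  AddOne, Double, chain and
handshake_transfer are hypothetical names invented for this example,
not the project's actual classes: the point is only that the same
Stage objects are used unchanged by two different transfer schemes.)

```python
# Illustrative sketch only: plain Python, no HDL library involved.
# All names here are hypothetical and chosen for the example.

class AddOne:
    """A 'Stage': creates output data from input data, nothing more."""
    def process(self, i):
        return i + 1


class Double:
    """A second 'Stage' with the same minimal interface."""
    def process(self, i):
        return i * 2


def chain(stages):
    """Combinatorial-style use: feed each stage's output into the next."""
    def run(i):
        for s in stages:
            i = s.process(i)
        return i
    return run


def handshake_transfer(stage, inputs, ready_pattern):
    """Ready/valid-style use: one item moves only on cycles where the
    consumer is ready.  The stage never sees the 'ready' flag."""
    pending = list(inputs)
    out = []
    for ready in ready_pattern:
        if ready and pending:
            out.append(stage.process(pending.pop(0)))
    return out
```

 neither AddOne nor Double contains any knowledge of which of the two
wrappers it ends up inside.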

> >  what i believe all that is needed is just to create a surgical break
> > in the p_o_ready and n_o_valid signals, and provide a couple of
> > functions (in the stage instance) to connect the pairs of signals
> > together.
> >
> Note that the BUF gates that yosys shows between wires should be optimized
> out by the synthesis toolchain, so there is no penalty (except increasing
> compile time by a little) to having long chains of signals that are wired
> directly. That's part of why there is no problem routing ready/valid
> through every stage instance.

 the BUF gates get optimised out, however the actual chains of
ready/valid signals do not, resulting in a gate cascade.  in
particular, when a stage contains pause/stall logic, that gets
inserted into the chain as well.
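
 (a rough way to picture the cascade, as a plain-Python toy model
rather than real synthesis output: ready_chain and the "stalls" list
are hypothetical, and "gates" is just a count of the AND terms that
could not be removed.  a stage with no stall logic is modelled as None
and contributes nothing; every stage with a stall term adds one more
gate to the path.)

```python
# Toy model, not synthesis output: hypothetical names throughout.

def ready_chain(n_i_ready, stalls):
    """Propagate 'ready' backwards from the last stage to the first.

    stalls: one entry per stage, first stage first.  None means the
    stage is a pure pass-through (ready_out = ready_in); a bool means
    the stage has stall logic with that current value.
    Returns (ready seen by the producer, number of gates in the path).
    """
    ready = n_i_ready
    gates = 0
    for stall in reversed(stalls):
        if stall is None:
            continue            # plain wire: nothing left after optimisation
        ready = ready and not stall
        gates += 1              # one AND term per stage with stall logic
    return ready, gates
```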

 now, CombStage, _yes_, if several of those are chained together, as
long as there is no pause/stall logic involved, the ready_out=ready_in
and valid_out=valid_in will be optimised out...

 ... except look at StageChain.  there *is* no ready/valid chain to
create... because Stages *do not know about the signalling logic* and
they do not have to.

 except... *sigh*, now with the addition of ready/valid logic
signalling by a Stage, that has to be thought through.  raising an
exception (preventing StageChain from accepting a Stage with
ready/valid logic) for now is the simplest option.
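
 (a minimal sketch of that "raise an exception" option, in plain
Python.  stage_chain, PlainStage, StallingStage and the attribute-based
check are all assumptions made for illustration: the real StageChain
may detect and report this quite differently.)

```python
# Hypothetical sketch: not the project's actual StageChain.

class PlainStage:
    """Data-only stage: no ready/valid indicators of its own."""
    def process(self, i):
        return i + 1


class StallingStage:
    """Stage that also supplies its own ready/valid indicators."""
    def process(self, i):
        return i + 1

    def p_o_ready(self):
        return True

    def n_o_valid(self):
        return True


def has_dynamic_signalling(stage):
    """True if the stage defines either indicator function."""
    return any(hasattr(stage, name) for name in ("p_o_ready", "n_o_valid"))


def stage_chain(stages):
    """Chain data-only stages; refuse stages with ready/valid logic."""
    for s in stages:
        if has_dynamic_signalling(s):
            raise TypeError(
                "%s has ready/valid logic and cannot be chained"
                % type(s).__name__)

    def run(i):
        for s in stages:
            i = s.process(i)
        return i
    return run
```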

> >  the Stage doesn't need to know, and definitely doesn't need access
> > to, the p_i_valid or n_i_ready for example.  if those are part of the
> > Stage, the Stage *will* have access to them, and may interfere
> > unnecessarily with their operation.
>
>
> > or, the end-user might *think*
> > they have to interfere with their operation, because it's part of the
> > "exposed API".
> >
> that's why the documentation should/will explicitly state that CombStage
> handles prev/next ready/valid.
> If a stage does something special with ready/valid, then it shouldn't be a
> CombStage.

 jacob, it's just not as flexible an API [and not actually *having*
the stuff there is infinitely better than writing documentation (which
people may or may not read) saying "don't use this"].

 bear in mind: what you've written will need a complete redesign to
allow the dozen different types of data format options currently
supported by the pipeline API, and a yet further redesign to allow for
use in an FSM, Global CE, travelling CE or STB/ACK handling
arrangement.

 by the time both of those near-total redesigns are done, the end
result will be a direct functional equivalent of the *existing
proposed pipeline API*.

> >  hmm... given that the pairs of signals need to be treated identically
> > in all cases (i think),  it *might* even be possible to just return
> > code-snippets (similar to how process() returns something that is eq'd
> > into the register) and have ControlBase not even pass the pairs of
> > p_o_ready and n_o_valid signals into the Stage.
> >
> I think that may be an unnecessary complication.

 it's a topological functionally-equivalent code-morph (even before
it's been written, i know...).

i was going to say that it seems quite restrictive, to have a
Stage.n_o_valid() and Stage.p_o_ready() function, however thinking it
through (a little), i believe that the Global CE, travelling CE and
STB/ACK could all, also, make use of the same Stage valid/ready
functions [without modification].

 basically, "data being ready" and "data being valid" are indicators
that are related to *data*... *not* to "how data is transferred".

 as such, it's an appropriate OO design break/separation point.
having the handling *of* the data *and* the data *in* the same class
is a major limitation.

 "Stage" could be renamed to "DataHandler" or "DataManager" to make
that clearer.  i like "Stage" because it's... well... shorter.
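
 (to make the break point concrete, a plain-Python sketch under stated
assumptions: the function names p_o_ready() and n_o_valid() are taken
from the discussion above, but BusyStage, ready_valid_ctrl and
single_enable_ctrl are hypothetical, and the second controller is a
deliberately simple toy, not a faithful model of Global CE or STB/ACK.
the stage answers questions about its *data*; each controller decides
how to combine those answers with its own transfer signals.)

```python
# Hypothetical sketch: boolean model only, no HDL library involved.

class BusyStage:
    """Data handler: its indicators depend only on its own data state."""
    def __init__(self):
        self.busy = False        # e.g. still working on earlier data
        self.result_ok = True    # e.g. output data is usable

    def process(self, i):
        return i * 2

    def p_o_ready(self):
        return not self.busy

    def n_o_valid(self):
        return self.result_ok


def ready_valid_ctrl(stage, p_i_valid, n_i_ready):
    """Ready/valid-style transfer: gate each signal with the stage's
    own indicator.  Returns (p_o_ready, n_o_valid)."""
    p_o_ready = n_i_ready and stage.p_o_ready()
    n_o_valid = p_i_valid and stage.n_o_valid()
    return p_o_ready, n_o_valid


def single_enable_ctrl(stage, enable):
    """A different toy transfer scheme: advance only when one shared
    enable is high and the stage's data indicators both agree."""
    return enable and stage.p_o_ready() and stage.n_o_valid()
```

 both controllers call the same two stage functions, and the stage is
not modified to suit either of them.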

> If a stage needs to handle
> ready/valid signals in a way that's different, it should just handle them
> directly. We could make a helper class FSMStage if that makes it easier.

 so that the stage which handles the data, and the stage that handles
the *stalling* of the data, are separate, and may be chained together
(combinatorially or otherwise)?

 if so: i thought about that, and realised that it doesn't help [note
it would need to be *three* "stages": one that handles the indicator
to previous ("middle is ready to receive"), the middle one that
handles the data, and one that handles the indicator to next
("middle's output is valid")].

 it doesn't help.... because the interfaces between 1st-and-middle and
middle-and-3rd are exactly and precisely what would be needed to
support a *single* stage instance that had dynamic ready/valid
capabilities.

or did i misunderstand?

l.