[libre-riscv-dev] div/mod algorithm written in python

Luke Kenneth Casson Leighton lkcl at lkcl.net
Sun Jul 21 13:00:55 BST 2019


---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Sun, Jul 21, 2019 at 12:06 PM Jacob Lifshay <programmerjake at gmail.com> wrote:

> > > the exponent operations would be (assuming inputs and outputs are
> > > biased and bias is positive):
> > > fdiv: nexponent - dexponent + bias (needs overflow/underflow handling)
>
>
> > from what i can gather, there's certain ranges that the mantissa has
> > to be placed into, and the result will come out "in the correct
> > range".
> >
> I mean the bias converting between the mathematical exponent and the
> unsigned integer stored in the exponent: 15 for f16, 127 for f32, 1023 for
> f64, and 16383 for f128.

 oh right - those are (ineptly named) FPNumBase.P127 / P128 / N126 / N127.

 for FP32 those are equal to: 127, 128, -126 and -127 respectively
 for FP16 they're calculated to come out at 15, 16, -14 and -15 respectively

 and so on.
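 a quick python sketch (the function name and return order are my own,
not the actual FPNumBase code) showing how those four constants fall
out of the exponent width alone:

```python
def fp_exponent_consts(e_bits):
    # bias = 2^(e_bits-1) - 1: 127 for FP32 (8 exponent bits),
    # 15 for FP16 (5 bits), 1023 for FP64 (11 bits), 16383 for FP128 (15 bits)
    bias = (1 << (e_bits - 1)) - 1
    # the (legacy, FP32-derived) names and their general values:
    P127 = bias        # +127 for FP32, +15 for FP16
    P128 = bias + 1    # +128 for FP32, +16 for FP16
    N126 = 1 - bias    # -126 for FP32, -14 for FP16
    N127 = -bias       # -127 for FP32, -15 for FP16
    return P127, P128, N126, N127
```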

 i appreciate the names are weird: it's a legacy hang-over from where
the code was originally handling FP32 *only*.  i created the
constants, then couldn't think of new names for them (not short ones,
anyway).

 so, now i understand what you mean.

 ok so if you're having to subtract the exponent bias, there's
something wrong (design-wise) [note: that's *completely* different
from checking that the exponent is within the *range* of the exponent
bias].

 none of the code - FPMUL, FPADD, FPCVT, or the FSM-based FPDIV -
does exponent *bias* addition/subtraction *at all*: the exponent
arithmetic is *entirely* "relative", and the actual computation
happens in the mantissa *only*.
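 to make "relative, mantissa-only" concrete, here's a hedged python
sketch of a bit-serial restoring divide, roughly the shape an
FSM-based FPDIV mantissa stage takes (names and the width handling
are illustrative assumptions, not the actual code).  note that no
exponent - and certainly no exponent bias - appears anywhere in it:

```python
def fsm_divide(n, d, width):
    # restoring division, one quotient bit per step, as a hardware
    # FSM would do it: shift a numerator bit into the partial
    # remainder, compare against the divisor, subtract if it fits.
    rem = 0
    quot = 0
    for i in reversed(range(width)):
        rem = (rem << 1) | ((n >> i) & 1)  # shift in next numerator bit
        quot <<= 1
        if rem >= d:                       # divisor fits: subtract,
            rem -= d                       # set this quotient bit
            quot |= 1
    return quot, rem
```

the exponent of the result is handled completely separately (by a
plain subtraction), which is the whole point.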

 *BEFORE* [mantissa-based] computation, the exponent is checked
(special-cases).  *AFTER* [mantissa-based] computation, the exponent
is checked again (rounding, and pre-packing overflow, to make sure
that, after rounding, the result hasn't hit Inf).

 however *at no time* is there *ever* a case where the mantissa is
shifted *by* the exponent bias.  in jon dawson's "divide" code, the
result exponent is computed through subtraction:

      divide_0:
      begin
        z_e <= a_e - b_e;

and in multiply, it's computed by addition.  the only case where
things might _look_ like there's an exponent "bias" is in FCVT, where
the width of the target INT is stuffed into the exponent field and the
INT treated directly as if it were the mantissa.  so for UINT32 to FP,
you shove "32" into the exponent of an FPNumBase object, shove the
UINT32 into the mantissa of the same, and throw it at FPMSBHigh.
shifting occurs to get the MSB to 1, the exponent is subtracted by the
amount of shifting, ta-daaa, job done.
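 the UINT32-to-FP trick above, sketched in python (the function name
and the plain (exponent, mantissa) tuple are my assumptions; the real
code works on FPNumBase objects via FPMSBHigh):

```python
def uint32_to_fp(x):
    # shove the integer width (32) into the exponent, the UINT32 into
    # the mantissa, then normalise: shift until the MSB (bit 31) is
    # set, decrementing the exponent by one per shift.
    # convention here: value represented = (m / 2**32) * 2**e
    assert 0 <= x < (1 << 32)
    if x == 0:
        return (0, 0)      # zero stays all-zero
    e = 32
    m = x
    while (m & (1 << 31)) == 0:
        m <<= 1            # shift MSB up towards bit 31...
        e -= 1             # ...subtracting the shift amount from e
    return (e, m)
```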

that however is a unique case.

basically what i'm saying is:

* a will have an exponent bias subtracted/added
* b will have the SAME exponent bias subtracted/added

therefore, since the same bias cancels out in the subtraction, it is
completely redundant to even *have* the exponent bias added (or
subtracted, or whatever).
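in python terms (both function names are my own), jacob's
biased-exponent formula and the plain unbiased subtraction agree
exactly once the bias is stripped back off - which is the sense in
which carrying the bias through the computation is redundant:

```python
def div_exponent_biased(a_e_stored, b_e_stored, bias):
    # jacob's formula: operates on the biased (stored) exponents,
    # so the bias has to be added back after the subtraction
    return a_e_stored - b_e_stored + bias

def div_exponent_unbiased(a_e, b_e):
    # jon dawson style: exponents already unbiased on unpack,
    # so the subtraction alone gives the result exponent
    return a_e - b_e
```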

the task is much, much simpler and more straightforward than it seems.

> I'm assuming you didn't mean that we needed a 2048-bit wide mantissa (for
> f64) :)

noo, Nooo, noOO :)

but pretty close.  in the multiply case the mantissa is currently an
absolutely insane 108 bits wide (2x 53 bits plus a few spare).  that's
absolutely got to be fixed: we can't possibly expect a single-stage
108-bit-wide multiply to complete in any reasonable timeframe, and the
gate count would be prohibitive.

l.
