[Libre-soc-dev] FPMUL32 rounding

Luke Kenneth Casson Leighton lkcl at lkcl.net
Tue Jun 8 13:21:32 BST 2021




On Fri, Jun 4, 2021 at 3:05 AM Luke Kenneth Casson Leighton <lkcl at lkcl.net>
wrote:

>
> On Fri, Jun 4, 2021 at 1:34 AM Jacob Lifshay <programmerjake at gmail.com>
> wrote:
>
>
>> I used the test case you have in that unit test, demonstrating that doing
>> the multiplication as f32 gives exactly the same answer as doing the
>> multiplication as f64 then rounding to f32 (this only works for some fp
>> ops):
>> See the fmuls_1 and fmuls_2 functions:
>> https://rust.godbolt.org/z/6asaf6Mq6 (note the dbgf! macro prints the
>> input then returns it unmodified)
>>
>> I also tested on a Power9 and it gives identical results.
>>
>
ok i tracked it down.  the "broken" code uses the Load/Store SINGLE/DOUBLE
conversion back-to-back.  sounds perfectly reasonable: it's an FP32 in the
space of an FP64, therefore truncate the result to FP32 then re-expand it
to FP64, right?

wrongggg... the result was calculated at FP64 precision, so it still has
bits below the FP32 mantissa, and this process of truncation *ignores*
those lower bits instead of rounding on them.
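
A minimal sketch of the difference, in Python rather than the ISA pseudocode.
The helper names (round_to_f32, truncate_to_f32) are made up for illustration,
and truncate_to_f32 only models "drop the low 29 mantissa bits" for normal
numbers; it is not the spec's DOUBLE2SINGLE routine.

```python
import struct

def round_to_f32(x):
    # Pack as IEEE 754 single (round-to-nearest-even), then widen back.
    return struct.unpack("<f", struct.pack("<f", x))[0]

def truncate_to_f32(x):
    # Assumption: model truncation by clearing the 29 FP64 mantissa bits
    # that do not fit in an FP32 mantissa (52 - 23 = 29). Normal numbers only.
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    bits &= ~((1 << 29) - 1)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

a = 1.75              # exactly representable in FP32
b = 1.0 + 2.0 ** -23  # 1 + one FP32 ULP, also exactly representable

product = a * b       # exact in FP64: 1.75 + 2^-23 + 0.75 * 2^-23

print(round_to_f32(product) == 1.75 + 2.0 ** -22)     # rounded up on the low bits
print(truncate_to_f32(product) == 1.75 + 2.0 ** -23)  # low bits silently dropped
```

The two results differ by one FP32 ULP, which is the kind of mismatch the
unit test was showing.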

from the spec, p152, v3.0B;

are then added or subtracted as appropriate, depending on the signs of
the operands, to form an intermediate sum.
*All 53 bits of the significand as well as all three guard bits (G, R, and
X) enter into the computation.*

that's for *fadds* - not fadd.

interestingly, fmuls does not make a similar statement.  however,
clearly, when calculating FP32xFP32 the resultant mantissa
is at... errr... 24+24+guardbits, and if those bits are IGNORED
then we get the rounding problem.
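
To make the bit-width argument concrete, here is a rough Python sketch of
rounding an exact 24x24-bit significand product back to 24 bits with
round-to-nearest-even. The function name and the (significand, exponent
adjustment) return shape are assumptions for illustration. It ignores signs,
exponent range, overflow and denormals, and it is not the pseudocode from the
spec or from the commit.

```python
def round_significand_product(ma, mb):
    # ma, mb: 24-bit significands with the hidden bit set,
    # i.e. 2**23 <= m < 2**24, representing m * 2**-23 in [1, 2).
    prod = ma * mb                   # exact, 47 or 48 bits wide
    shift = prod.bit_length() - 24   # number of bits below the 24 we keep
    kept = prod >> shift
    rem = prod & ((1 << shift) - 1)  # the bits that plain truncation ignores
    half = 1 << (shift - 1)
    # Round to nearest, ties to even, using ALL of the discarded bits.
    if rem > half or (rem == half and (kept & 1)):
        kept += 1
    exp_adj = shift - 23             # 0 if product in [1, 2), 1 if in [2, 4)
    if kept == 1 << 24:              # rounding carried out of the top bit
        kept >>= 1
        exp_adj += 1
    return kept, exp_adj

# 1.75 * (1 + 2^-23): truncation would give 0xE00001, rounding gives 0xE00002.
print(round_significand_product(0xE00000, 0x800001) == (0xE00002, 0))
```

Here rem plays the role that the guard bits play in hardware: if it is thrown
away before the rounding decision, the result is 0xE00001 instead of 0xE00002.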

i've just added this, which is the same pseudo-code for FRIN
(from Appendix A.1), it's currently incomplete:
https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=af091265bb3c5327479f414c1bff7b81da0c6cd2

and the problem has gone away.  Lauri, as long as you do not
put in any numbers that overflow (which i do not believe there
are in any of the MP3 tests), the current pseudocode should
be sufficient.

l.

