[Libre-soc-isa] [Bug 1184] Proposal for fixing XLEN: XLEN always is type-len, add FTYPE and FSTYPE for FP

Thu Oct 12 02:20:03 BST 2023

https://bugs.libre-soc.org/show_bug.cgi?id=1184

--- Comment #4 from Jacob Lifshay <programmerjake at gmail.com> ---
(In reply to Luke Kenneth Casson Leighton from comment #3)
> (In reply to Jacob Lifshay from comment #2)
> > another closely related issue, fp ops with different src and dest elwidth
> > end up double-rounding according to the current spec, which is bad.
> 
> good reason for programmers to avoid doing that by not using
> different widths, then, isn't it?
> 
> we are not here to "nanny" people [making hardware more complex
> in order to "protect" them from shooting themselves in the foot]

well, now that I think of it, we may be making the hardware more complex by
*not* avoiding double-rounding. e.g.:

sv.fadds/sw=f64/dw=f32 has to do:

convert f32-in-f64 sources to internal format
add sources
round result to f32 (as expensive as converting to f32 due to denormals)
convert f32 to internal format
round to f16 (as expensive as converting to f16)
convert f16 to f32

if we avoided double rounding, it would be:

convert f32-in-f64 sources to internal format
add sources
round result to f16 (as expensive as converting to f16)
convert f16 to f32

Note that when the inputs are the same type as the outputs, or a strictly
smaller type, there isn't a problem: the extra conversions on the inputs are
exact, so we can just convert straight to the internal format instead of
doing two conversions.
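
To make the double-rounding problem concrete, here is a minimal Python sketch. It is not taken from the spec or from any simulator. It uses the standard `struct` module's `f` (binary32) and `e` (binary16) formats as stand-ins for the hardware rounding steps, and the helper names `round_to`, `to_f32` and `to_f16` are made up for illustration. The input is chosen to sit just above the midpoint between two adjacent f16 values, so rounding through f32 first turns it into an exact tie:

```python
import struct

def round_to(fmt, x):
    """Round a Python float (an f64) to the given struct format and back."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]

def to_f32(x):
    return round_to("<f", x)  # IEEE 754 binary32

def to_f16(x):
    return round_to("<e", x)  # IEEE 754 binary16

# Adjacent f16 values near 1.0 are 1.0 and 1.0 + 2**-10.
# x sits just above their midpoint (1.0 + 2**-11).
x = 1.0 + 2.0**-11 + 2.0**-30

direct = to_f16(x)          # single rounding: f64 -> f16
double = to_f16(to_f32(x))  # double rounding: f64 -> f32 -> f16

print("direct:", direct)
print("double:", double)
```

The f32 step discards the `2**-30` term, leaving exactly the midpoint. The f16 step then resolves the tie to even and returns 1.0, whereas rounding once gives 1.0 + 2**-10.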

So, what I think we should do about it:
I think we should just define all FP operations where the output type is not
the same as the intermediate type, or where the input conversion is not
exact, as either undefined behavior or a trap. This leaves us free to define
better semantics later, as another ISA extension, without it being a breaking
change for SW.

e.g.:
* sv.fadd/sw=f32/dw=f64
  is defined since both the output and intermediate
  types are f64.
* sv.fadd/sw=f64/dw=f32
  is UB or trap since the output type (f32) isn't
  the intermediate type (f64).
* sv.ctfpr/sw=64/dw=f16
  is defined since the output defines the intermediate
  type since the input isn't FP.
* sv.fadds/sw=f32/dw=f64
  is defined since both the output and intermediate
  types are f64.
* sv.fadd/sw=f64/dw=f16
  is UB or trap since the output type (f16) isn't
  the intermediate type (f64).
* sv.fadd/sw=f16/dw=bf16
  is UB or trap since neither choice of intermediate type works:
  * f16 -- the output type doesn't match the intermediate type
  * bf16 -- the input conversion isn't exact (f16 has more mantissa bits)
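
The rule above can be sketched as a small Python predicate. This is only an illustration under stated assumptions: the `FORMATS` table, `converts_exactly` and `is_defined` are hypothetical names, the caller supplies the intermediate type explicitly (how it is derived from the instruction is not modelled here), and "exact" is approximated as "the source has no more exponent bits and no more mantissa bits than the target":

```python
# (exponent bits, stored mantissa bits) for each format; illustrative table.
FORMATS = {
    "f16": (5, 10),
    "bf16": (8, 7),
    "f32": (8, 23),
    "f64": (11, 52),
}

def converts_exactly(src, dst):
    """True if every src value is representable in dst (assumed criterion)."""
    src_exp, src_man = FORMATS[src]
    dst_exp, dst_man = FORMATS[dst]
    return src_exp <= dst_exp and src_man <= dst_man

def is_defined(src, intermediate, dst):
    """Apply the proposed rule; src=None stands for a non-FP input."""
    if dst != intermediate:
        return False  # output would need a second rounding step
    if src is None:
        return True   # non-FP input: the output defines the intermediate type
    return converts_exactly(src, intermediate)

# The examples from the list above, with the intermediate type given by hand.
print(is_defined("f32", "f64", "f64"))    # sv.fadd/sw=f32/dw=f64
print(is_defined("f64", "f64", "f32"))    # sv.fadd/sw=f64/dw=f32
print(is_defined(None, "f16", "f16"))     # sv.ctfpr/sw=64/dw=f16
print(is_defined("f16", "f16", "bf16"))   # sv.fadd/sw=f16/dw=bf16, f16 intermediate
print(is_defined("f16", "bf16", "bf16"))  # sv.fadd/sw=f16/dw=bf16, bf16 intermediate
```

Under these assumptions the f32-to-f64 and integer-to-f16 cases come out as defined, and the rest as UB or trap, matching the list.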
