[libre-riscv-dev] [isa-dev] FP reciprocal sqrt extension proposal

Guy Lemieux glemieux at vectorblox.com
Thu Jul 11 21:42:15 BST 2019


1/sqrt(a) is a single-operand instruction.

Might there be more performance value in making it dual-operand, to make
better use of the available read ports? E.g.:

a/sqrt(b)
  or
1/sqrt(a+b)

Both are common forms of usage. I suppose these could be formed by
chaining, but in that case there’s little need for rsqrt when you already
have both div and sqrt.
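
To make the chaining alternative concrete, here is a minimal Python sketch of the two dual-operand forms built from separate sqrt and divide steps, plus a comparison of chained 1/sqrt(x) against a high-precision reference. All function names are hypothetical. The reference is computed with the decimal module at 60 digits and rounded to double once, which is assumed (not proven) to match a correctly rounded result for these inputs.

```python
import math
from decimal import Decimal, localcontext


def rsqrt_chained(x: float) -> float:
    """1/sqrt(x) as two separately rounded steps (sqrt, then divide)."""
    return 1.0 / math.sqrt(x)


def rsqrt_reference(x: float) -> float:
    """Reference 1/sqrt(x): 60 decimal digits, then one rounding to double.

    Assumption: 60 digits is enough that the final rounding matches a
    correctly rounded result for the inputs tried here.
    """
    with localcontext() as ctx:
        ctx.prec = 60
        return float(Decimal(1) / Decimal(x).sqrt())


def a_over_sqrt_b(a: float, b: float) -> float:
    """The a/sqrt(b) form, chained from sqrt and divide."""
    return a / math.sqrt(b)


def rsqrt_of_sum(a: float, b: float) -> float:
    """The 1/sqrt(a+b) form, chained from add, sqrt and divide."""
    return 1.0 / math.sqrt(a + b)


def normalize(x: float, y: float) -> tuple[float, float]:
    """Normalize a 2D vector using a/sqrt(b) with b = x*x + y*y."""
    b = x * x + y * y
    return a_over_sqrt_b(x, b), a_over_sqrt_b(y, b)


# Count sample inputs where the chained result differs from the reference.
samples = [1.0 + k / 1024.0 for k in range(1, 1024)]
mismatches = sum(rsqrt_chained(x) != rsqrt_reference(x) for x in samples)
print(f"{mismatches} of {len(samples)} chained results differ from the reference")
```

The chained form rounds twice (once after sqrt, once after the divide), so it is not guaranteed to agree bit-for-bit with a single correctly rounded rsqrt.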

guy


On Thu, Jul 11, 2019 at 10:25 AM Bill Huffman <huffman at cadence.com> wrote:

> Avoiding the need for both a square root and a divide to get this is a
> worthwhile goal. I do wonder if it's better to have a slightly looser
> accuracy requirement. The instruction can be considerably faster if
> it's required to be within 1 ulp than if it's required to be correctly
> rounded. On the other hand, that means different implementations can get
> different answers, which is not so good.
>
>        Bill
>
> On 7/11/19 3:45 AM, Jacob Lifshay wrote:
> >
> >
> > I propose a Zfrsqrt extension that consists of the frsqrt.s, frsqrt.d,
> > frsqrt.q, and frsqrt.h instructions, where the 32-bit, 64-bit, and
> > 128-bit versions require the F, D, and Q extensions respectively, and
> > the 16-bit version requires the corresponding half-precision extension.
> > If only the F and Zfrsqrt extensions are supported, then only the
> > frsqrt.s instruction is supported.
> > If only the F, D, and Zfrsqrt extensions are supported, then only the
> > frsqrt.s and the frsqrt.d instructions are supported. Likewise for
> > frsqrt.q and frsqrt.h requiring the corresponding extensions enabled.
> >
> > The operation implemented by frsqrt.* is a correctly-rounded
> > implementation of the IEEE 754-2008 rSqrt operation, with all the
> > usual FP rounding modes supported.
> >
> > For the encoding, I think something similar to both the fsqrt.* and
> > fdiv.* encodings is a good idea, since frsqrt is similar to both fdiv
> > and fsqrt. Therefore, as an initial proposal, I suggest a funct7 value
> > of 0111100 with the rest of the instruction identical to fsqrt, since,
> > as far as I'm aware, that doesn't conflict with anything currently.
> >
> > We (libre-riscv.org) are currently planning on implementing frsqrt in
> > our libre GPU, since frsqrt is a common operation in 3D graphics (used
> > for vector normalization, among other things).
> >
> > Comments, modifications, etc. welcome.
> >
> > Jacob Lifshay
> >
>
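
For reference, the encoding suggested in the quoted proposal can be sketched as follows. This assumes the standard R-type field layout and the existing fsqrt.s encoding (funct7 0101100, rs2 field zero, OP-FP major opcode 1010011); the frsqrt.s funct7 is only the value proposed in this thread, not a ratified encoding, and the function names are hypothetical.

```python
OP_FP = 0b1010011            # major opcode used by fsqrt.s and fdiv.s
FUNCT7_FSQRT_S = 0b0101100   # existing fsqrt.s
FUNCT7_FRSQRT_S = 0b0111100  # proposed in this thread; not ratified
RM_DYNAMIC = 0b111           # rounding mode taken from the frm CSR


def encode_r_type(funct7: int, rs2: int, rs1: int, funct3: int, rd: int, opcode: int) -> int:
    """Pack the R-type fields into a 32-bit instruction word."""
    assert 0 <= funct7 < 128 and 0 <= funct3 < 8 and 0 <= opcode < 128
    assert all(0 <= r < 32 for r in (rs2, rs1, rd))
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def encode_fsqrt_s(rd: int, rs1: int, rm: int = RM_DYNAMIC) -> int:
    """Existing fsqrt.s: the rs2 field is zero, funct3 holds the rounding mode."""
    return encode_r_type(FUNCT7_FSQRT_S, 0, rs1, rm, rd, OP_FP)


def encode_frsqrt_s(rd: int, rs1: int, rm: int = RM_DYNAMIC) -> int:
    """Proposed frsqrt.s: identical to fsqrt.s except for funct7."""
    return encode_r_type(FUNCT7_FRSQRT_S, 0, rs1, rm, rd, OP_FP)


# f10 is fa0 in the standard calling convention.
print(hex(encode_fsqrt_s(10, 10)))
print(hex(encode_frsqrt_s(10, 10)))
```

Under these assumptions the two instruction words differ in a single bit of funct7, which is what makes the proposed encoding "the rest identical to fsqrt".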

