[Libre-soc-bugs] [Bug 558] gcc SV intrinsics concept

bugzilla-daemon at libre-soc.org bugzilla-daemon at libre-soc.org
Mon Dec 28 20:41:28 GMT 2020


https://bugs.libre-soc.org/show_bug.cgi?id=558

Jacob Lifshay <programmerjake at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |programmerjake at gmail.com

--- Comment #2 from Jacob Lifshay <programmerjake at gmail.com> ---
I think the intrinsics should be designed differently and share their syntax
with the clang version:

I designed the following based on the LLVM IR llvm.vp.* intrinsics, which will
most likely be used for some of SV in LLVM:
https://llvm.org/docs/LangRef.html#vector-predication-intrinsics

Add vector types using whatever __attribute__ magic you prefer (can reuse the
vector_size attribute if preferred):
template<typename Base, std::size_t MAXVL, std::size_t SUBVL>
using Vec = Base[MAXVL][SUBVL] __attribute__((magic...));

Base is limited to [u]int8/16/32/64_t, pointers, __fp16, float, double, and
__bf16.

Through compiler magic (or being instead defined as a struct), Vec acts like a
struct in that you can assign it, pass it by value, it doesn't decay to pointer
types in function arguments, etc. It has the same in-memory layout as the above
array with no padding and alignof(Base) alignment.

Now you can create a max-4 element vector with floatx3 subvectors by writing
Vec<float, 4, 3>
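
As a rough illustration of the "defined as a struct" option, here is a minimal
sketch in portable C++. The struct, the member name elems, the alias Float3x4
and the helper functions are all hypothetical names made up for this example;
a real implementation would use the compiler attribute instead. The
static_asserts only check the layout properties described above on the host
compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for the proposed type: a struct wrapping the
// [MAXVL][SUBVL] array, so it can be assigned and passed by value.
template <typename Base, std::size_t MAXVL, std::size_t SUBVL>
struct Vec {
    Base elems[MAXVL][SUBVL];
};

using Float3x4 = Vec<float, 4, 3>;  // at most 4 elements, each a floatx3

// Same in-memory layout as float[4][3]: no padding, alignof(float) alignment.
static_assert(sizeof(Float3x4) == sizeof(float[4][3]), "unexpected padding");
static_assert(alignof(Float3x4) == alignof(float), "unexpected alignment");
static_assert(std::is_trivially_copyable<Float3x4>::value,
              "must copy like a struct");

// Takes the vector by value, so the caller's copy is left untouched.
inline float bump_first(Float3x4 v) {
    v.elems[0][0] += 1.0f;
    return v.elems[0][0];
}

inline void example_vec() {
    Float3x4 v{};                   // zero-initialized
    v.elems[0][0] = 2.0f;
    Float3x4 w = v;                 // plain assignment copies all 12 floats
    assert(bump_first(w) == 3.0f);  // callee modified its own copy
    assert(w.elems[0][0] == 2.0f);  // no decay to pointer: w is unchanged
}
```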

As with the existing vector_size attribute, you can just apply the attribute
to an array type to turn it into an SV vector type; it doesn't need to be a
template or match the above definition.

E.g. for C, you could write:
typedef float floatx3xmax10[10][3] __attribute__((magic...));

and now floatx3xmax10 is an SV vector with maxvl=10 and subvl=3 and an element
type of float.

There are also typedefs:
typedef size_t vl_t;
typedef uint64_t mask_t;

It is legal to convert between size_t and vl_t; setvl instructions will be
inserted by __sv_add and friends if needed.

All operations other than assignment, parameter passing, and indexing are taken
care of by built-in functions:

// returns the computed VL; maxvl must be a compile-time constant
vl_t __sv_setvl(size_t vl, size_t maxvl);

All computation functions take vl as a parameter. It is Undefined Behavior if
vl > MAXVL; vl == 0 is legal. All computed vectors have uninitialized contents
for elements at index >= vl unless otherwise specified.
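
To make the vl rules concrete, here is a small scalar model of a strip-mined
loop. It assumes that the computed VL is simply the requested length clamped
to MAXVL, which is my reading of setvl and not something spelled out above.
sv_setvl_ref and count_chunks are hypothetical names, and maxvl is modelled as
a template parameter since it has to be a compile-time constant anyway.

```cpp
#include <cassert>
#include <cstddef>

using vl_t = std::size_t;

// Assumed semantics: the computed VL is the request clamped to MAXVL.
template <std::size_t MAXVL>
constexpr vl_t sv_setvl_ref(std::size_t requested) {
    return requested < MAXVL ? requested : MAXVL;
}

// Strip-mined loop: walk n elements in chunks of at most MAXVL.
// Returns the number of chunks so the behaviour can be checked.
template <std::size_t MAXVL>
std::size_t count_chunks(std::size_t n) {
    std::size_t chunks = 0;
    while (n > 0) {
        vl_t vl = sv_setvl_ref<MAXVL>(n);  // never exceeds MAXVL, so no UB
        // ... vector work on vl elements would go here ...
        n -= vl;
        ++chunks;
    }
    return chunks;
}

inline void example_setvl() {
    assert(sv_setvl_ref<8>(3) == 3);    // short request is kept as-is
    assert(sv_setvl_ref<8>(20) == 8);   // long request is clamped
    assert(sv_setvl_ref<8>(0) == 0);    // vl == 0 is legal
    assert(count_chunks<8>(20) == 3);   // chunks of 8 + 8 + 4
}
```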

Vec<Base, MAXVL, SUBVL> __sv_add(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
Vec<Base, MAXVL, SUBVL> __sv_sub(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
Vec<Base, MAXVL, SUBVL> __sv_mul(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
Vec<Base, MAXVL, SUBVL> __sv_muladd(Vec<Base, MAXVL, SUBVL> factor1, Vec<Base,
MAXVL, SUBVL> factor2, Vec<Base, MAXVL, SUBVL> term, vl_t vl, mask_t mask);
Vec<Base, MAXVL, SUBVL> __sv_div(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
Vec<Base, MAXVL, SUBVL> __sv_mod(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
Vec<Base, MAXVL, SUBVL> __sv_and(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
Vec<Base, MAXVL, SUBVL> __sv_or(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
Vec<Base, MAXVL, SUBVL> __sv_xor(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
Vec<Base, MAXVL, SUBVL> __sv_shl(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
Vec<Base, MAXVL, SUBVL> __sv_shr(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
...fill more alu ops in
Vec<Base, MAXVL, SUBVL> __sv_saturating_add(Vec<Base, MAXVL, SUBVL> lhs,
Vec<Base, MAXVL, SUBVL> rhs, vl_t vl, mask_t mask);
...fill more alu ops in
mask_t __sv_compare_eq(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL, SUBVL>
rhs, vl_t vl, mask_t mask);
mask_t __sv_compare_ne(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL, SUBVL>
rhs, vl_t vl, mask_t mask);
mask_t __sv_compare_gt(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL, SUBVL>
rhs, vl_t vl, mask_t mask);
mask_t __sv_compare_lt(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL, SUBVL>
rhs, vl_t vl, mask_t mask);
mask_t __sv_compare_ge(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL, SUBVL>
rhs, vl_t vl, mask_t mask);
mask_t __sv_compare_le(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL, SUBVL>
rhs, vl_t vl, mask_t mask);
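
For reference, here is a scalar model of how I'd expect the masked ALU ops and
compares to behave, using add and compare_lt as examples. Everything in it is
hypothetical: the _ref names, the struct version of Vec, and in particular the
choice to zero-fill inactive elements (the real intrinsics would leave them
uninitialized) and to restrict the compare to SUBVL == 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using vl_t = std::size_t;
using mask_t = std::uint64_t;

// Hypothetical struct stand-in for the proposed vector type.
template <typename Base, std::size_t MAXVL, std::size_t SUBVL>
struct Vec {
    Base elems[MAXVL][SUBVL];
};

// Scalar model of a masked add. Inactive elements (index >= vl or mask bit
// clear) are zero-filled here purely so the model is deterministic.
template <typename Base, std::size_t MAXVL, std::size_t SUBVL>
Vec<Base, MAXVL, SUBVL> sv_add_ref(Vec<Base, MAXVL, SUBVL> lhs,
                                   Vec<Base, MAXVL, SUBVL> rhs,
                                   vl_t vl, mask_t mask) {
    static_assert(MAXVL <= 64, "mask_t only has 64 bits");
    Vec<Base, MAXVL, SUBVL> out{};
    for (std::size_t i = 0; i < vl && i < MAXVL; ++i) {
        if (!((mask >> i) & 1u)) continue;  // one mask bit per element
        for (std::size_t j = 0; j < SUBVL; ++j)
            out.elems[i][j] = static_cast<Base>(lhs.elems[i][j] + rhs.elems[i][j]);
    }
    return out;
}

// Scalar model of a masked less-than compare for SUBVL == 1.
// Result bit i is set only for active elements where lhs < rhs.
template <typename Base, std::size_t MAXVL>
mask_t sv_compare_lt_ref(Vec<Base, MAXVL, 1> lhs, Vec<Base, MAXVL, 1> rhs,
                         vl_t vl, mask_t mask) {
    static_assert(MAXVL <= 64, "mask_t only has 64 bits");
    mask_t out = 0;
    for (std::size_t i = 0; i < vl && i < MAXVL; ++i)
        if (((mask >> i) & 1u) && lhs.elems[i][0] < rhs.elems[i][0])
            out |= mask_t{1} << i;
    return out;
}

inline void example_add_compare() {
    Vec<std::int32_t, 4, 1> a{{{1}, {2}, {3}, {4}}};
    Vec<std::int32_t, 4, 1> b{{{10}, {1}, {30}, {40}}};
    auto s = sv_add_ref(a, b, 3, 0x5);  // vl = 3, elements 0 and 2 active
    assert(s.elems[0][0] == 11 && s.elems[2][0] == 33);
    assert(s.elems[1][0] == 0 && s.elems[3][0] == 0);  // model's zero-fill
    assert(sv_compare_lt_ref(a, b, 4, ~mask_t{0}) == 0xD);  // bits 0, 2, 3
}
```

The compare result can be fed straight back in as the mask of a following
operation, which is the main reason for returning mask_t rather than a vector.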

// floating-point compares with opposite results on NaNs
mask_t __sv_compare_eq_unordered(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
mask_t __sv_compare_ne_ordered(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
mask_t __sv_compare_gt_unordered(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
mask_t __sv_compare_lt_unordered(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
mask_t __sv_compare_ge_unordered(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
mask_t __sv_compare_le_unordered(Vec<Base, MAXVL, SUBVL> lhs, Vec<Base, MAXVL,
SUBVL> rhs, vl_t vl, mask_t mask);
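
To spell out what "opposite results on NaNs" means per element, here is a
sketch of the scalar predicates. It assumes, going by the names, that the
plain eq/gt/lt/ge/le compares are ordered (false if either operand is NaN) and
the plain ne compare is unordered (true if either operand is NaN), the same
split as LLVM's fcmp predicates. The function names are made up for the
example.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Ordered equality: false if either operand is NaN (what == already does).
inline bool eq_ordered(double a, double b) { return a == b; }

// Unordered equality: additionally true if either operand is NaN.
inline bool eq_unordered(double a, double b) {
    return std::isnan(a) || std::isnan(b) || a == b;
}

// Unordered inequality: true if either operand is NaN (what != already does).
inline bool ne_unordered(double a, double b) { return a != b; }

// Ordered inequality: false if either operand is NaN.
inline bool ne_ordered(double a, double b) {
    return !std::isnan(a) && !std::isnan(b) && a != b;
}

// Unordered less-than: true if a < b or either operand is NaN.
inline bool lt_unordered(double a, double b) { return !(a >= b); }

inline void example_nan_compares() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(!eq_ordered(nan, 1.0) && eq_unordered(nan, 1.0));
    assert(ne_unordered(nan, 1.0) && !ne_ordered(nan, 1.0));
    assert(lt_unordered(nan, 1.0) && lt_unordered(1.0, 2.0));
    assert(!lt_unordered(2.0, 1.0));
}
```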

// __sv_merge_tail returns a vector with elements at index < vl whose mask bit
// is set copied from head, and elements at index >= vl and/or whose mask bit
// is clear copied from tail. This is done by copying/generating head to the
// registers holding tail. This will usually compile to zero additional
// instructions because it can be merged with the instruction computing head.
Vec<Base, MAXVL, SUBVL> __sv_merge_tail(Vec<Base, MAXVL, SUBVL> head, Vec<Base,
MAXVL, SUBVL> tail, vl_t vl, mask_t mask);
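
A scalar model of the merge, as a sketch only: sv_merge_tail_ref and the
struct version of Vec are hypothetical stand-ins, and the model clamps vl to
MAXVL instead of treating vl > MAXVL as Undefined Behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using vl_t = std::size_t;
using mask_t = std::uint64_t;

// Hypothetical struct stand-in for the proposed vector type.
template <typename Base, std::size_t MAXVL, std::size_t SUBVL>
struct Vec {
    Base elems[MAXVL][SUBVL];
};

// Start from tail, then overwrite the active elements (index < vl and mask
// bit set) with the corresponding elements of head.
template <typename Base, std::size_t MAXVL, std::size_t SUBVL>
Vec<Base, MAXVL, SUBVL> sv_merge_tail_ref(Vec<Base, MAXVL, SUBVL> head,
                                          Vec<Base, MAXVL, SUBVL> tail,
                                          vl_t vl, mask_t mask) {
    static_assert(MAXVL <= 64, "mask_t only has 64 bits");
    Vec<Base, MAXVL, SUBVL> out = tail;
    for (std::size_t i = 0; i < vl && i < MAXVL; ++i) {
        if (!((mask >> i) & 1u)) continue;
        for (std::size_t j = 0; j < SUBVL; ++j)
            out.elems[i][j] = head.elems[i][j];
    }
    return out;
}

inline void example_merge_tail() {
    Vec<std::int32_t, 4, 1> head{{{1}, {2}, {3}, {4}}};
    Vec<std::int32_t, 4, 1> tail{{{9}, {9}, {9}, {9}}};
    auto r = sv_merge_tail_ref(head, tail, 3, 0x5);  // elements 0 and 2 active
    assert(r.elems[0][0] == 1 && r.elems[2][0] == 3);  // taken from head
    assert(r.elems[1][0] == 9 && r.elems[3][0] == 9);  // taken from tail
}
```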

// this can usually be merged with following instructions by using scalar
// instruction arguments
Vec<Base, MAXVL, SUBVL> __sv_splat(Vec<Base, 1, SUBVL> subvec, vl_t vl, mask_t
mask);
Vec<Base, MAXVL, 1> __sv_splat(Base element, vl_t vl, mask_t mask);
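
One thing to note when modelling the scalar splat in plain C++: MAXVL does not
appear in any argument, so it cannot be deduced and has to be supplied
explicitly. The sketch below uses hypothetical names, and zero-fills inactive
elements purely to keep the model deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using vl_t = std::size_t;
using mask_t = std::uint64_t;

// Hypothetical struct stand-in for the proposed vector type.
template <typename Base, std::size_t MAXVL, std::size_t SUBVL>
struct Vec {
    Base elems[MAXVL][SUBVL];
};

// Scalar model of the element splat. MAXVL comes first so the caller can
// give it explicitly while Base is still deduced from the argument.
template <std::size_t MAXVL, typename Base>
Vec<Base, MAXVL, 1> sv_splat_ref(Base element, vl_t vl, mask_t mask) {
    static_assert(MAXVL <= 64, "mask_t only has 64 bits");
    Vec<Base, MAXVL, 1> out{};  // inactive elements zero-filled (model choice)
    for (std::size_t i = 0; i < vl && i < MAXVL; ++i)
        if ((mask >> i) & 1u) out.elems[i][0] = element;
    return out;
}

inline void example_splat() {
    auto v = sv_splat_ref<4>(std::int32_t{7}, 3, 0x7);  // vl = 3 of MAXVL = 4
    assert(v.elems[0][0] == 7 && v.elems[1][0] == 7 && v.elems[2][0] == 7);
    assert(v.elems[3][0] == 0);  // index >= vl: only defined in this model
}
```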

// can usually be merged with other instructions
Vec<Base, MAXVL, SUBVL> __sv_twin_pred(Vec<Base, MAXVL, SUBVL> src, vl_t vl,
mask_t src_mask, mask_t dest_mask);
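
The twin-predication semantics I have in mind, as a scalar sketch: the source
and destination indices advance independently, each skipping elements whose
mask bit is clear, so the n-th active source element lands in the n-th active
destination element. That pairing rule is my assumption about how SV twin
predication would be exposed here, and all names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using vl_t = std::size_t;
using mask_t = std::uint64_t;

// Hypothetical struct stand-in for the proposed vector type.
template <typename Base, std::size_t MAXVL, std::size_t SUBVL>
struct Vec {
    Base elems[MAXVL][SUBVL];
};

template <typename Base, std::size_t MAXVL, std::size_t SUBVL>
Vec<Base, MAXVL, SUBVL> sv_twin_pred_ref(Vec<Base, MAXVL, SUBVL> src, vl_t vl,
                                         mask_t src_mask, mask_t dest_mask) {
    static_assert(MAXVL <= 64, "mask_t only has 64 bits");
    const std::size_t limit = vl < MAXVL ? vl : MAXVL;  // model clamps vl
    Vec<Base, MAXVL, SUBVL> out{};  // untouched elements zero-filled
    std::size_t s = 0, d = 0;
    for (;;) {
        // Skip masked-out elements on each side independently.
        while (s < limit && !((src_mask >> s) & 1u)) ++s;
        while (d < limit && !((dest_mask >> d) & 1u)) ++d;
        if (s >= limit || d >= limit) break;  // either side ran out
        for (std::size_t j = 0; j < SUBVL; ++j)
            out.elems[d][j] = src.elems[s][j];
        ++s;
        ++d;
    }
    return out;
}

inline void example_twin_pred() {
    Vec<std::int32_t, 4, 1> src{{{10}, {20}, {30}, {40}}};
    // Source elements 1 and 3 are packed into destination elements 0 and 1.
    auto r = sv_twin_pred_ref(src, 4, 0xA, 0x3);
    assert(r.elems[0][0] == 20 && r.elems[1][0] == 40);
    assert(r.elems[2][0] == 0 && r.elems[3][0] == 0);
}
```

With src_mask all ones this behaves like an expand, and with dest_mask all
ones like a compress.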

// swizzle0-3 are compile-time constants
Vec<Base, MAXVL, 1> __sv_swizzle(Vec<Base, MAXVL, SRC_SUBVL> src, vl_t vl,
mask_t mask, int swizzle0);
Vec<Base, MAXVL, 2> __sv_swizzle(Vec<Base, MAXVL, SRC_SUBVL> src, vl_t vl,
mask_t mask, int swizzle0, int swizzle1);
Vec<Base, MAXVL, 3> __sv_swizzle(Vec<Base, MAXVL, SRC_SUBVL> src, vl_t vl,
mask_t mask, int swizzle0, int swizzle1, int swizzle2);
Vec<Base, MAXVL, 4> __sv_swizzle(Vec<Base, MAXVL, SRC_SUBVL> src, vl_t vl,
mask_t mask, int swizzle0, int swizzle1, int swizzle2, int swizzle3);
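
A scalar model of the swizzle, again only a sketch with made-up names. Since
the swizzle indices must be compile-time constants, the model takes them as
template arguments, and the result SUBVL is the number of indices given. It
only handles plain lane indices and assumes each one is less than SRC_SUBVL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using vl_t = std::size_t;
using mask_t = std::uint64_t;

// Hypothetical struct stand-in for the proposed vector type.
template <typename Base, std::size_t MAXVL, std::size_t SUBVL>
struct Vec {
    Base elems[MAXVL][SUBVL];
};

// Sw... are the source lane indices, one per lane of the result subvector.
template <int... Sw, typename Base, std::size_t MAXVL, std::size_t SRC_SUBVL>
Vec<Base, MAXVL, sizeof...(Sw)> sv_swizzle_ref(Vec<Base, MAXVL, SRC_SUBVL> src,
                                               vl_t vl, mask_t mask) {
    static_assert(sizeof...(Sw) >= 1 && sizeof...(Sw) <= 4,
                  "result SUBVL must be 1 to 4");
    static_assert(MAXVL <= 64, "mask_t only has 64 bits");
    const int sel[] = {Sw...};  // assumed: 0 <= sel[j] < SRC_SUBVL
    Vec<Base, MAXVL, sizeof...(Sw)> out{};  // inactive elements zero-filled
    for (std::size_t i = 0; i < vl && i < MAXVL; ++i) {
        if (!((mask >> i) & 1u)) continue;
        for (std::size_t j = 0; j < sizeof...(Sw); ++j)
            out.elems[i][j] = src.elems[i][sel[j]];
    }
    return out;
}

inline void example_swizzle() {
    Vec<float, 2, 3> v{{{1.0f, 2.0f, 3.0f}, {4.0f, 5.0f, 6.0f}}};
    // Pick lanes (z, x) from each xyz subvector: SUBVL 3 in, SUBVL 2 out.
    Vec<float, 2, 2> r = sv_swizzle_ref<2, 0>(v, 2, 0x3);
    assert(r.elems[0][0] == 3.0f && r.elems[0][1] == 1.0f);
    assert(r.elems[1][0] == 6.0f && r.elems[1][1] == 4.0f);
}
```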

todo: add load/stores/mv.x/etc.
