[Libre-soc-dev] mv.zip and mv.unzip (vector pack and unpack)
lkcl
luke.leighton at gmail.com
Sat Jun 11 10:15:56 BST 2022
a second idea, which also cuts down the number of additional instructions that need to be proposed: an MVRM format. it sacrifices EXTRA3 in favour of EXTRA2, freeing up 2 bits that can then be dedicated to a src SUBVL.
# RM Mode Concept:
MVRM-2P-2S1D:
| Field Name | Field bits | Description |
|------------|------------|----------------------------|
| Rdest_EXTRA2 | `10:11` | extends Rdest (R\*\_EXTRA2 Encoding) |
| Rsrc_EXTRA2 | `12:13` | extends Rsrc (R\*\_EXTRA2 Encoding) |
| src_SUBVL | `14:15` | SUBVL for Source |
| MASK_SRC | `16:18` | Execution Mask for Source |
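
as a rough illustration of the table above, here is a minimal sketch of pulling those fields out of a 24-bit RM value. it assumes MSB0 bit numbering (bit 0 is the most significant of the 24), and the names `RM_BITS`, `FIELDS`, `extract` and `decode_mvrm` are hypothetical helpers for this example only, not part of any existing simulator or spec.

```python
# Sketch only: assumes the 24-bit SVP64 prefix is numbered MSB0,
# i.e. bit 0 is the most significant bit and bit 23 the least.
RM_BITS = 24

# (start, end) bit positions, inclusive, taken from the table above.
FIELDS = {
    "Rdest_EXTRA2": (10, 11),
    "Rsrc_EXTRA2": (12, 13),
    "src_SUBVL": (14, 15),
    "MASK_SRC": (16, 18),
}


def extract(rm, start, end):
    """Return the MSB0-numbered bit range start..end of rm as an integer."""
    width = end - start + 1
    shift = RM_BITS - 1 - end  # distance of the field's last bit from the LSB
    return (rm >> shift) & ((1 << width) - 1)


def decode_mvrm(rm):
    """Decode the proposed MVRM-2P-2S1D fields from a 24-bit RM value."""
    return {name: extract(rm, s, e) for name, (s, e) in FIELDS.items()}


if __name__ == "__main__":
    # Rdest_EXTRA2=0b10, Rsrc_EXTRA2=0b01, src_SUBVL=0b11, MASK_SRC=0b101
    rm = (0b10 << 12) | (0b01 << 10) | (0b11 << 8) | (0b101 << 5)
    print(decode_mvrm(rm))
```

the remaining bits of the prefix are ignored here, since only the four fields listed in the table are under discussion.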
the reason to have "sv.mv RT.vecN RA.vecN" would be to cover, say, putting three 8-bit RGB values into a single 32-bit destination. this would be achievable with elwidth overrides alone if it wasn't for vec3. predication will not help there but Swizzle definitely would, i think.
however another possible interpretation would be to treat it as pack/unpack. "sv.mv/ew=8 RT.v, RA.vec3" would mean that all Rs get put into RT sequentially first, followed by all Gs and then all Bs.
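
a minimal sketch of that pack/unpack reading, modelling the registers as a flat python list of elements. the function names `mv_pack` and `mv_unpack` are made up for this example, and predication, elwidth and register-file layout are deliberately left out.

```python
def mv_pack(src, vl, subvl):
    """Read vl sub-vectors of subvl elements each, with SUBVL as the OUTER
    loop on the source: all element 0s first, then all element 1s, etc."""
    dest = []
    for s in range(subvl):      # outer loop: which sub-element (R, G, B)
        for i in range(vl):     # inner loop: which sub-vector (pixel)
            dest.append(src[i * subvl + s])
    return dest


def mv_unpack(src, vl, subvl):
    """Inverse of mv_pack: turn planar data back into interleaved sub-vectors."""
    dest = [None] * (vl * subvl)
    k = 0
    for s in range(subvl):
        for i in range(vl):
            dest[i * subvl + s] = src[k]
            k += 1
    return dest


if __name__ == "__main__":
    pixels = ["R0", "G0", "B0", "R1", "G1", "B1"]  # two vec3 pixels
    planar = mv_pack(pixels, vl=2, subvl=3)
    print(planar)                      # ['R0', 'R1', 'G0', 'G1', 'B0', 'B1']
    print(mv_unpack(planar, 2, 3))     # back to the interleaved order
```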
that interpretation would totally blow up the normal expectations for how SUBVL is supposed to work (as an inner loop), because it would become an outer loop for RA.
could SUBVL be considered conceptually an outer loop "in general", with only the fact that the src and dest SUBVL are normally equal hiding that? i have a sneaking suspicion that would work.
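
one way to sanity-check that suspicion, as a sketch only: for a plain element-wise copy where src and dest use the same SUBVL and the same element index, the order in which elements are visited does not change the result. the function names are hypothetical, and the sketch assumes non-overlapping source and destination and ignores predication and anything else that is order-sensitive.

```python
def copy_subvl_inner(src, vl, subvl):
    """Conventional view: SUBVL is the inner loop."""
    dest = [None] * (vl * subvl)
    for i in range(vl):
        for s in range(subvl):
            dest[i * subvl + s] = src[i * subvl + s]
    return dest


def copy_subvl_outer(src, vl, subvl):
    """Alternative view: SUBVL is the outer loop, same index on both sides."""
    dest = [None] * (vl * subvl)
    for s in range(subvl):
        for i in range(vl):
            dest[i * subvl + s] = src[i * subvl + s]
    return dest


if __name__ == "__main__":
    data = list(range(12))  # four vec3 sub-vectors
    # Both loop orders produce an identical copy when src and dest agree.
    print(copy_subvl_inner(data, 4, 3) == copy_subvl_outer(data, 4, 3))
```

the two only become distinguishable once the source and destination index the elements differently, which is exactly the case a separate src SUBVL would introduce.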
thoughts appreciated, because i think it is quite crucial to keep the instruction count down on mvs: abstracting out mv.zip/unzip as a REMAP costs an extra two 32-bit instructions (per mv), whereas an MVRM Format could be built in to the SVP64 24-bit prefix and cover at least the basics, vec234-vec234.
l.