[Libre-soc-isa] [Bug 567] New: Allow transparent scalar loads and stores to/from registers allocated as vectors
bugzilla-daemon at libre-soc.org
Tue Jan 5 14:17:06 GMT 2021
https://bugs.libre-soc.org/show_bug.cgi?id=567
Bug ID: 567
Summary: Allow transparent scalar loads and stores to/from
registers allocated as vectors
Product: Libre-SOC's first SoC
Version: unspecified
Hardware: PC
OS: Linux
Status: DEFERRED
Severity: enhancement
Priority: ---
Component: Specification
Assignee: cestrauss at gmail.com
Reporter: cestrauss at gmail.com
CC: libre-soc-isa at lists.libre-soc.org
NLnet milestone: ---
First, note that I'm preemptively marking this as deferred, as I understand
that we have a kind of feature-freeze right now.
Also, apologies to Alexander: this is actually his proposal, as I understand
it. It is orthogonal to Jacob's proposal, which has to do with bitcast.
So, as I see from the NEON article, a 64-bit register can be partitioned into a
certain number of "lanes" of varying widths. For instance:
8 x 8-bit lanes: L[0] L[1] L[2] L[3] L[4] L[5] L[6] L[7]
4 x 16-bit lanes: L[0] L[1] L[2] L[3]
2 x 32-bit lanes: L[0] L[1]
1 x 64-bit lane: L[0]
There must be a mapping, of each lane, to a range of bits within the register.
We can choose arbitrarily, as long as we are consistent. One choice is:
8-bit:
L[0] -> bits 0 to 7
L[1] -> bits 8 to 15
L[2] -> bits 16 to 23
L[3] -> bits 24 to 31
L[4] -> bits 32 to 39
L[5] -> bits 40 to 47
L[6] -> bits 48 to 55
L[7] -> bits 56 to 63
16-bit:
L[0] -> bits 0 to 15
L[1] -> bits 16 to 31
L[2] -> bits 32 to 47
L[3] -> bits 48 to 63
32-bit:
L[0] -> bits 0 to 31
L[1] -> bits 32 to 63
64-bit:
L[0] -> bits 0 to 63
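As a minimal sketch of this first choice (lane 0 at the lowest bit numbers), the bit range of a lane can be computed from the lane width and lane index. The function name `lane_bits_ascending` and the constant `REG_WIDTH` are hypothetical names for illustration, not anything from the spec:

```python
REG_WIDTH = 64  # register width in bits, as in the example above

def lane_bits_ascending(lane_width, lane):
    """Return (first_bit, last_bit) of a lane when L[0] starts at bit 0."""
    if REG_WIDTH % lane_width != 0:
        raise ValueError("lane width must divide the register width")
    lanes = REG_WIDTH // lane_width
    if not 0 <= lane < lanes:
        raise ValueError("lane index out of range")
    first = lane * lane_width
    return first, first + lane_width - 1

# Each lane covers lane_width consecutive bits:
for width in (8, 16, 32, 64):
    for lane in range(REG_WIDTH // width):
        print(width, lane, lane_bits_ascending(width, lane))
```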
Another choice is:
8-bit:
L[0] -> bits 56 to 63
L[1] -> bits 48 to 55
L[2] -> bits 40 to 47
L[3] -> bits 32 to 39
L[4] -> bits 24 to 31
L[5] -> bits 16 to 23
L[6] -> bits 8 to 15
L[7] -> bits 0 to 7
16-bit:
L[0] -> bits 48 to 63
L[1] -> bits 32 to 47
L[2] -> bits 16 to 31
L[3] -> bits 0 to 15
32-bit:
L[0] -> bits 32 to 63
L[1] -> bits 0 to 31
64-bit:
L[0] -> bits 0 to 63
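The second choice is the same allocation with the lane numbering reversed. A small sketch, again with a hypothetical function name (`lane_bits_descending`) chosen only for illustration:

```python
REG_WIDTH = 64  # register width in bits

def lane_bits_descending(lane_width, lane):
    """Return (first_bit, last_bit) of a lane when L[0] sits at the top bits."""
    if REG_WIDTH % lane_width != 0:
        raise ValueError("lane width must divide the register width")
    lanes = REG_WIDTH // lane_width
    if not 0 <= lane < lanes:
        raise ValueError("lane index out of range")
    # Lane 0 takes the highest slot, the last lane takes the lowest slot.
    first = (lanes - 1 - lane) * lane_width
    return first, first + lane_width - 1

for width in (8, 16, 32, 64):
    for lane in range(REG_WIDTH // width):
        print(width, lane, lane_bits_descending(width, lane))
```

In both choices the set of bit ranges for a given width is identical; only the lane label attached to each range differs.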
Notice that these are just bit allocations. There isn't any endianness involved
so far. The choice only affects the labeling of the "lane write enable" wires
for writing, and the "lane valid" wires for reading.
What Alex is proposing, I think, is dynamically switching between the mappings,
so that when using a 64-bit scalar load instruction, L[0]=V[0], L[1]=V[1], and
so on, irrespective of memory endianness. Correct?
I think this relabeling can be done with a single crossbar on each read port of
just the 8-bit predicate mask register, no need to shuffle the actual 64-bit
register contents. The partitioned ALUs do not care about which lane number is
assigned to each partition number, as long as the predicate mask is correct.
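To make the relabeling idea concrete, here is a rough software model of what such a crossbar might compute. It assumes (my reading, not something the spec states) that predicate bit i refers to lane number i, and that the partitioned ALU takes one enable bit per byte of the 64-bit register. The function name `partition_enables` and its parameters are hypothetical:

```python
def partition_enables(lane_mask, lane_width, descending=False):
    """Map a per-lane predicate mask to an 8-bit per-byte enable mask.

    lane_mask:  bit i set means lane L[i] is active (assumed convention).
    lane_width: 8, 16, 32 or 64 bits.
    descending: False -> L[0] at the lowest bytes, True -> L[0] at the highest.
    """
    if lane_width not in (8, 16, 32, 64):
        raise ValueError("unsupported lane width")
    bytes_per_lane = lane_width // 8
    lanes = 64 // lane_width
    lane_bytes = (1 << bytes_per_lane) - 1  # enables for one whole lane
    enables = 0
    for lane in range(lanes):
        if not (lane_mask >> lane) & 1:
            continue
        # Only the slot chosen for this lane changes; register data is untouched.
        slot = lanes - 1 - lane if descending else lane
        enables |= lane_bytes << (slot * bytes_per_lane)
    return enables

# 16-bit lanes, only L[0] active:
print(bin(partition_enables(0b0001, 16)))                   # 0b11
print(bin(partition_enables(0b0001, 16, descending=True)))  # 0b11000000
```

Switching mappings only changes which byte enables a given predicate bit drives, which is why the 64-bit register contents would not need to be shuffled in this model.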
Also, this was only for 64-bit loads and stores. I wonder how 32-bit, 16-bit and
8-bit (or even 24-bit, 40-bit, 48-bit, 56-bit) scalar loads/stores would work.