[Libre-soc-bugs] [Bug 1155] O(n^2) multiplication REMAP mode(s)

bugzilla-daemon at libre-soc.org
Sat Dec 30 17:19:53 GMT 2023


https://bugs.libre-soc.org/show_bug.cgi?id=1155

--- Comment #39 from Luke Kenneth Casson Leighton <lkcl at lkcl.net> ---
(In reply to Jacob Lifshay from comment #35)
> some additional thoughts on the multiply algorithm from comment #10:
> 
> we'll want to have an additional instruction afterward that clears the
> temporary values, since this allows the hardware to not have to compute
> them, which may be much more complex than just computing the product
> directly, especially if the hardware uses a different multiplication
> algorithm internally.

(see caveat first...) ok so again a gentle but firm reminder: we are
effectively designing the hardware-equivalent of a Universal Software API.

how and what decisions hardware implementors make is *not our problem*
as designers of that *Universal* Software API.

illustration: if software developers start measuring SV "performance"
on a processor-by-processor basis, even across processors from the same
manufacturer, and customise assembler based on the differences, just like
they are forced to do with SIMD ISAs, then we have CATASTROPHICALLY failed,
as ISA Designers, to do our goddamn job correctly.

i therefore remind you *once again* not to make ANY assumptions at
the hardware implementation level that could impact or force a particular
decision at the SOFTWARE level.

now the caveat (which is a reminder of what i repeated before, many times):

we *do* actually have to think through, from the implementor's perspective,
if the Software API (aka SV) is actually viable. but - butbutbut - not
from just ONE implementation but ***ALL*** possible implementations.

in the past i stopped you from putting assumptions about hardware into
a presentation, and am slightly peeved to have to remind you again,
but please *for goodness sake* under *no circumstances* are you to
write a presentation or engage with anyone on the basis of assuming,
or let someone assume, that there shall and will and will ONLY BE just
one and exclusively one hardware implementation...

... or, much much worse, that ALL hardware implementations are FORCED
to have, for example, a JIT macro-op Fusion Hardware Engine.

you are effectively right now telling Embedded Implementors, "you
MUST HAVE a Macro-Op Fusion Engine.  you MUST direct this complex
pattern of SIX INSTRUCTIONS at a dedicated back-end. you have NO CHOICE
in this matter".

yeh?

we have to be so so careful here. i know you said "if", but the
instructions you mention are more of a "Programmer's Note" that
*some* implementations *may* have optimised Macro-Op Fusion.

they will be extremely high-end ones.  spotting and matching a
sequence of six to seven instructions is pretty extreme.
