[Libre-soc-dev] [OpenPOWER-HDL-Cores] mulhw / mulhwu in microwatt updates SO
paulus at ozlabs.org
Sat Aug 29 01:05:58 BST 2020
On Fri, Aug 28, 2020 at 09:28:53AM +0100, Luke Kenneth Casson Leighton wrote:
> On Friday, August 28, 2020, Paul Mackerras <paulus at ozlabs.org> wrote:
> > On Thu, Aug 27, 2020 at 04:39:21PM +0100, Luke Kenneth Casson Leighton
> > wrote:
> > > in section v3.0B p73 it says that mulhw, mulhw. and hwu do not have an
> > > overflow version. CR0 is also intended to be undefined in bits 0:2 in
> > > 64-bit mode. nothing is mentioned about SO however we might
> > > reasonably assume that because there is no OE version of mulhw/mulhwu,
> > > it must be ignored.
> > What's the "it" in that last sentence?
> OE, sorry.
OK, yes, then it must be ignored as you say.
> > The ISA shows bit 11 (IBM bit 21) of mulhw and mulhwu as a reserved
> > field, whereas that bit is the OE bit for mullw and other
> > instructions.
> > So according to ISA Book 1 section 1.3.3, which says
> > "Reserved fields in instructions are ignored by the processor", we
> > should ignore the OE field for mulhw/mulhwu, and we do (and that is in
> > fact what POWER9 does also).
> ah. ok. that's the thing: if my reading is correct, microwatt doesn't
> ignore it, i checked the source, there's no special-case to ignore OE for
> OP_MULH32, and the decode1.vhdl tree, by having case patterns that run
> OP_MULH32 regardless of how OE is set, leaves an execution path where
> decode2.vhdl (decode_oe) and execute1.vhdl follows the standard generic
> behaviour for OE.
Lines 357--359 of decode2.vhdl:
if not (d_in.decode.insn_type = OP_MUL_H32 or d_in.decode.insn_type = OP_MUL_H64) then
v.e.oe := decode_oe(d_in.decode.rc, d_in.insn);
It's a bit ugly I know, but it didn't seem worth adding another column
to decode_rom_t for it...
> i _did_ have a special case in the libresoc equivalent of decode2.vhdl to
> ignore OE (set it to zero for OP_MULH32 regardless of whether the bit is
> set) however this was what got the "wrong" behaviour when it came to
> creating CR0.
That seems strange...
> if you examine running core_tb on 1.bin with the patch i sent you can see
> that CR is written as if OE is respected (i.e. as if there was a mulhwo)
> and that XER is also updated as if mulhwo was being run.
Excerpts from the log of running 1.bin under core_tb:
register_file.vhdl:63:13:@3425ns:(report note): Reading GPR 01 000000000000003F
register_file.vhdl:66:13:@3425ns:(report note): Reading GPR 05 FFFFFFFFFFFFFFFF
cr_file.vhdl:68:17:@3435ns:(report note): Writing XERC SO 1 CA 1 32 1 OV 1 32 1
execute1.vhdl:591:12:@3435ns:(report note): execute nia 000000000001035C
cr_file.vhdl:64:17:@3445ns:(report note): Writing 50000000 to CR mask 80 500F000F
cr_file.vhdl:68:17:@3445ns:(report note): Writing XERC SO 1 CA 0 32 0 OV 1 32 1
register_file.vhdl:48:20:@3445ns:(report note): Writing GPR 04 0000000000000001
cr_file.vhdl:64:17:@3495ns:(report note): Writing 90000000 to CR mask 80 900F000F
register_file.vhdl:48:20:@3495ns:(report note): Writing GPR 1E FFFFFFFFFFFFFFFF
We're looking at the mulhw. at 1035c (opcode 7fc12c97), right?
Note that the write to XERC at 3445ns is the result of the previous
instruction at 10358, which is "sraw.", which writes XER[CA]. The
mulhw. at 1035c writes back at 3495ns, and we see it writing to CR0
and GPR 1E (r30) but not XER. It writes "5" i.e. binary 0101 to CR0,
so CR0[SO] is being set (correctly, since XER[SO] was set when the
> later i will find the relevant sections of the logs (i am not at my laptop
> at the moment)
> i assumed this was intentional where from your response i see that this may
> instead be a possible bug in microwatt.
I don't see a bug here. XER is not written by 1035c, and CR0[SO] is
written with the value XER[SO] had when 1035c started.
> > > $ powerpc64le-linux-gnu-objdump -D -b binary -m powerpc:common64 -z
> > > /home/lkcl/src/libresoc/microwatt/tests/1.bin > 1.dump
> > Yes, objdump refuses to interpret instructions with non-zero values in
> > reserved fields.
> ahh... this makes sense in the context of the program being randomly
> generated. i did wonder.
> > > manual decode of 0x7fc12c97:
> > >
> > > major 31
> > > minor 75
> > > Oe 1
> > > rt 30
> > > ra 1
> > > rb 5
> > > Rc 1
> > >
> > > it's "mulhw." but actually it's not, it's "mulhwo." because both OE=1
> > > *and* Rc=1 which is an instruction that doesn't exist in v3.0B. (so
> > > why is it in microwatt unit test 1.bin, and where did that unit test
> > > come from?)
> > I believe it was produced by Anton's simple_random program.
> ahh a very good way to explore odd cases, i like it.
> > > put simply: if we "obey" the spec, XER gets corrupted. microwatt is
> > How does XER get corrupted? mulhw/mulhwu with bit 11 set to 1 should
> > behave the same as with bit 11 set to 0, i.e., it won't update XER[SO]
> > or XER[OV] -- and that is how both microwatt and POWER9 behave, and it
> > is in conformance with Book I section 1.3.3.
> should, but my excruciatingly close line by line examination trying to get
> dead accurate identical behaviour would suggest otherwise.
> (if it wasn't for the comparative analysis i would never have noticed).
> i will double-check.
> > > also not "obeying" the spec, and this is cause for some concern /
> > > alarm, because, strictly speaking, we are violating the OpenPOWER EULA
> > > by not following - to the letter - the v3.0B specification.
> > The compliance requirements in the licence are considerably more
> > relaxed for soft cores than for hard cores, fortunately, otherwise
> > open-source development of a soft core wouldn't be possible.
> wheww :)
> and here it seems that the spec is correct.
> we should confirm this by running 0x7fc12c97 on IBM POWER9 hardware (LE).
I did do some experiments on POWER9 in the past with setting reserved
fields in multiply instructions, but I don't remember whether I tested
this precise instruction word.
> > > some... clarity on this would be greatly appreciated. are we
> > > permitted to violate the v3.0B spec (and the OpenPOWER EULA) on the
> > > basis that if we don't we will end up fragmenting interoperability
> > > because IBM's POWER9 implementation does something *different* from
> > > what the v3.0B spec says?
> > You haven't convinced me that POWER9 does something different from
> > v3.0B yet...
> no, i see it may be a possible microwatt bug instead.
Bugs are always possible, of course, but in this case I think we get
More information about the Libre-soc-dev