[Libre-soc-bugs] [Bug 784] Implement cl* instructions for carry-less operations

bugzilla-daemon at libre-soc.org
Thu May 5 09:06:26 BST 2022


https://bugs.libre-soc.org/show_bug.cgi?id=784

--- Comment #38 from Jacob Lifshay <programmerjake at gmail.com> ---
I implemented a more efficient algorithm that does a dynamic shift at the
beginning and end rather than an EqualLeadingZeroCount every step. This reduces
the latency of each step to 1 layer of xor gates (assuming yosys optimizes the
step counter to just 1 add per clock), rather than a full 64-bit adder per
step, which makes 8 or maybe 16 steps per clock cycle reasonable. In the FSM, I
split the shifts out into their own clock cycles at the beginning and end
(they overlap with reading inputs and writing outputs), so the xor gate layers
have a full clock cycle to propagate, which leaves room to increase the number
of xor gates.
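
For readers who want to see the idea in software, here is a minimal Python
sketch of carry-less division with one dynamic shift at the start and one at
the end. It is not the nmigen-gf HDL: the names clmul and cldivrem_normalized,
the width parameter, and raising on a zero divisor are all assumptions made
for this illustration. The point is that, once the divisor is normalized, each
step tests a fixed bit position and needs only an xor.

```python
from typing import Tuple


def clmul(a: int, b: int) -> int:
    """Carry-less multiply: polynomial product over GF(2)."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out


def cldivrem_normalized(n: int, d: int, width: int = 64) -> Tuple[int, int]:
    """Carry-less divide n by d, returning (quotient, remainder).

    Hypothetical sketch: shift once before the loop and once after it,
    so the loop body never has to count leading zeros.
    """
    if d == 0:
        # Assumption for this sketch only; the real instruction may
        # define a result for a zero divisor.
        raise ZeroDivisionError("carry-less division by zero")
    assert 0 <= n < (1 << width) and d < (1 << width)

    # Dynamic shift at the beginning: put d's leading 1 at bit width-1.
    shift = width - d.bit_length()
    d_norm = d << shift
    rem = n << shift  # scale n the same way so the quotient is unchanged

    q = 0
    for i in reversed(range(width)):
        # The bit to test is at a position fixed by the step number alone.
        if (rem >> (i + width - 1)) & 1:
            rem ^= d_norm << i
            q |= 1 << i

    # Dynamic shift at the end: undo the scaling of the remainder.
    return q, rem >> shift


# (x^3 + x + 1) / (x + 1) = x^2 + x, remainder 1
print(cldivrem_normalized(0b1011, 0b11, width=4))  # (6, 1)
```

The result can be checked with clmul(q, d) ^ r == n, with r having fewer
significant bits than d.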

I compared the two by running:
python src/nmigen_gf/hdl/test/test_cldivrem.py -k test_64_step_8
yosys <<'EOF'
read_rtlil sim_test_out/__main__.TestCLDivRemFSM.test_64_step_8/0.il
flatten
synth
;;;
stat
EOF


Old algorithm:
https://git.libre-soc.org/?p=nmigen-gf.git;a=commit;h=2b87659b26e3063103274eb21a149a92b664a51a

modified to add a 64-bit, 8-steps-per-clock-cycle test to TestCLDivRemFSM:
    def test_64_step_8(self):
        self.tst(CLDivRemShape(width=64, n_width=64),
                 full=False,
                 steps_per_clock=8)

   Number of cells:              10743
     $_ANDNOT_                    1960
     $_AND_                        187
     $_MUX_                       3185
     $_NAND_                       457
     $_NOR_                        903
     $_NOT_                        458
     $_ORNOT_                      416
     $_OR_                        1824
     $_SDFF_PP0_                   320
     $_SDFF_PP1_                     1
     $_XNOR_                       469
     $_XOR_                        563

New algorithm:
https://git.libre-soc.org/?p=nmigen-gf.git;a=commit;h=e59b7ccf6c066fc0ecac6410e9b6447d5af77533

   Number of cells:               5046
     $_ANDNOT_                     302
     $_AND_                         67
     $_MUX_                       3283
     $_NAND_                        12
     $_NOR_                         77
     $_NOT_                        323
     $_ORNOT_                       61
     $_OR_                          58
     $_SDFFE_PP0P_                  72
     $_SDFF_PP0_                   197
     $_SDFF_PP1_                     1
     $_XNOR_                       149
     $_XOR_                        444
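
To put the two stat outputs side by side, a small Python sketch such as the
following can total the cells and separate flip-flops from combinational
logic. The counts are copied from the listings above; treating every cell
whose name contains "DFF" as a flip-flop is an assumption of this sketch.

```python
# Cell counts copied from the two yosys `stat` listings.
old = {
    "$_ANDNOT_": 1960, "$_AND_": 187, "$_MUX_": 3185, "$_NAND_": 457,
    "$_NOR_": 903, "$_NOT_": 458, "$_ORNOT_": 416, "$_OR_": 1824,
    "$_SDFF_PP0_": 320, "$_SDFF_PP1_": 1, "$_XNOR_": 469, "$_XOR_": 563,
}
new = {
    "$_ANDNOT_": 302, "$_AND_": 67, "$_MUX_": 3283, "$_NAND_": 12,
    "$_NOR_": 77, "$_NOT_": 323, "$_ORNOT_": 61, "$_OR_": 58,
    "$_SDFFE_PP0P_": 72, "$_SDFF_PP0_": 197, "$_SDFF_PP1_": 1,
    "$_XNOR_": 149, "$_XOR_": 444,
}


def flip_flops(cells):
    """Sum the cells whose type name marks them as a D flip-flop."""
    return sum(count for name, count in cells.items() if "DFF" in name)


old_total, new_total = sum(old.values()), sum(new.values())
old_ffs, new_ffs = flip_flops(old), flip_flops(new)

print(f"total cells:   {old_total} -> {new_total} "
      f"({100 * (1 - new_total / old_total):.1f}% fewer)")
print(f"flip-flops:    {old_ffs} -> {new_ffs}")
print(f"combinational: {old_total - old_ffs} -> {new_total - new_ffs}")
```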

I don't know of a fast and easy way to get yosys to output latency numbers, so
I'm not testing that.
