[Libre-soc-dev] NGI POINTER gigabit ethernet router ASIC roadmap

lkcl luke.leighton at gmail.com
Mon Nov 1 11:11:20 GMT 2021

(Staf, summary, i know you are busy with deadlines: no real change here for you)

there is not much time to get a lot done and not enough full time committed people to do it.  therefore some tough decisions need to be made to limit what goes into the ASIC.

we have 5-6 months to get the entire HDL ready and simulated for the gigabit router ASIC.  layout will have to start around month 4-5 into that, with preliminary HDL.  total available time including the MPW: ELEVEN months.

that is all.

bare minimum features:

* bitmanipulation ALUs
* reasonable performance (i.e. not 1 instruction per 10 cycles as we have with TestIssuer)
* RGMII ethernet interfaces (5 for preference)
* UTMI / ULPI interfaces (2 for preference)
* UART (16550 for preference)
* GPIO (plain and EINT)
* DMA Engine

everything else is unimportant.

* there is not enough time to do out-of-order (it was 5 months just doing the 6600 experiment), and an FSM is too slow. therefore we do an in-order core, making sure to plan ahead for OoO (use ReservationStations, use MultiCompUnits)

* SVP64 again has to be cut. SimdSignal is not ready in time, elwidth overrides are not ready in time (simulator or HDL) and an Issuer for SVP64 is particularly complex: that was learned from TestIssuer. it also greatly increases the number of regfile ports needed.

* the FP unit again has to be cut.  it is not important for a router ASIC.  too much adaptation / completion is needed to make it Power ISA 3.0 compliant

Staf: if you can still do regfiles with byte-level write-enable this will save time when we get to SVP64 (which if we get the other EU funding will be sooner rather than later)
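to illustrate what "byte-level write-enable" means, here is a minimal behavioural sketch in plain python. it is not HDL and not the Libre-SOC regfile: the class name, the 32 x 64-bit sizing and the method names are all assumptions made up for this example. the point is only that each bit of byte_en guards one 8-bit lane of the register.

```python
class ByteEnableRegFile:
    """behavioural model only (hypothetical): registers with one
    write-enable bit per byte lane."""

    def __init__(self, nregs=32, width_bytes=8):
        self.width_bytes = width_bytes
        self.regs = [0] * nregs

    def write(self, idx, data, byte_en):
        # build a mask with 0xff in every byte lane whose enable bit is set
        mask = 0
        for lane in range(self.width_bytes):
            if (byte_en >> lane) & 1:
                mask |= 0xFF << (8 * lane)
        # lanes that are not enabled keep their old contents
        self.regs[idx] = (self.regs[idx] & ~mask) | (data & mask)

    def read(self, idx):
        return self.regs[idx]


rf = ByteEnableRegFile()
rf.write(3, 0x1122334455667788, 0xFF)        # all 8 lanes enabled
rf.write(3, 0xAAAAAAAAAAAAAAAA, 0b00000101)  # only lanes 0 and 2 enabled
```

after the second write only bytes 0 and 2 of register 3 have changed, which is the kind of partial update that narrower element widths would need without a read-modify-write cycle.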

Jean-Paul: when we did DFFs/SFFs for regfiles they were distributed extremely optimally right the way through the layout.  the actual ALUs were distributed as well, and it was both very cool and well routed.

but, for fixed size (compact) regfiles that are basically SRAM blocks, this is a very different scenario, and having those SRAMs (regfiles) up in the top row is going to cause huge routing congestion.  something to think about.

Dimitry: if you can do the bitmanip instructions in the Simulator (convert verilog reference code to Power ISA pseudocode) that would be very helpful.  this is under a new (approved) NLnet grant, we can add you to Libre-SOC team, i will take care of 3mdeb.

Jacob: if you can keep an eye out and coordinate with Dimitry on the pseudocode / simulator, it would be good if you could do the HDL for the bitmanip instructions.  you will need to work together on shared unit tests.

note: *please do not compromise performance* or waste any extra design time which we do not have by attempting to force the instructions to be constant time.  constant time for the bitmanip instructions is NOT on the requirements, i have made this very clear in the NLnet Grant. there is NO user GUI or console interaction, it is a NETWORKED device not a Graphical Interface device, and therefore a timer may be added to delay network packets (part of the DMA Engine) which achieves the same end result.
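a rough sketch of the timer idea, in plain python with cycle counts as integers. the function name and the "budget" parameter are hypothetical, nothing here is taken from the DMA Engine design: it only shows that if a packet is held until a fixed deadline after arrival, the time it leaves no longer depends on how long the (variable-time) instructions took.

```python
def release_cycle(arrival, work_cycles, budget):
    """cycle at which a packet may leave (hypothetical model).

    arrival:     cycle the packet arrived
    work_cycles: data-dependent processing time, varies per packet
    budget:      fixed delay chosen to cover the expected worst case
    """
    done = arrival + work_cycles
    deadline = arrival + budget
    # hold the packet until the deadline; if processing overran the
    # budget the packet leaves late and the timing is visible again
    return max(done, deadline)


fast = release_cycle(arrival=100, work_cycles=7, budget=50)
slow = release_cycle(arrival=100, work_cycles=31, budget=50)
over = release_cycle(arrival=100, work_cycles=60, budget=50)
```

here fast and slow both come out as cycle 150, so the difference in processing time is hidden. the third case shows the limit of the approach: the budget has to cover the worst case.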

Andrey: we need a pinmux and peripheral autogenerator.  it was planned 2+ years ago and the frontend is already done, it generates CSV/JSON files. litex is NOT going to be used, Florent has had 3 strikes (3 opportunities).  instead we use nmigen-soc and CSR auto-allocation, all dynamically allocated.  integration with fuse-soc and nmigen is apparently underway: this *may* prove useful but we have to see how it goes.

Varun: i am waiting to hear back from you if you would like to do your MSc on the peripherals.  as i wrote to you i have been able to track down the RGMII code by Harry Ho: he completed Eth Rx but had not completed Eth Tx in nmigen. it is only around 80 lines of nmigen HDL to add that functionality and we have what we need.  also there is a UTMI/ULPI interface already written which i tracked down.  testing these with cocotb and other unit tests and so on will be important.

Tobias: you and i need to do the MMU and memory infrastructure.  this is really important to have, because we need to be running OpenWRT with an MMU.

Cesar: if you and i can work on the In-Order core, in particular if you can take care of the Issuer, i will pull together the main pieces you will need, make small adjustments to Core, create ReservationStation FUs.

Kyle: if you can add options to be able to run either TestIssuer or the new InOrderIssuer (when it is ready), so that comparisons can be made to make sure the new HDL gets the same answers as the original, that would be great.

a more "sophisticated" Test API would be to have "breakpoints".  one of the issues we will run into is that the in-order core will have overlapping instructions, because of pipelining.  but, of course, if single-stepping there will be *no* overlap, so bugs that only occur when instructions overlap would never be caught.

therefore, the option is needed to allow a *group* of instructions to run, and to take a snapshot State after a *group* has run.

basically, breakpoints, just like in gdb.
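a toy sketch of why group breakpoints matter, in plain python. none of this is the Libre-SOC Test API: RefCore, BuggyPipelinedCore, run_to and the "add_imm" tuple format are all invented for the example, and the "pipelined" core has a deliberate bug (a back-to-back dependent instruction reads the stale register value). the assumption is that stopping at a breakpoint drains the pipeline, so single-stepping never exercises the overlap.

```python
class RefCore:
    """reference model: each instruction completes before the next starts."""

    def __init__(self):
        self.pc, self.regs = 0, {}

    def step(self, insn):
        _, rd, ra, imm = insn          # ("add_imm", rd, ra, imm)
        self.regs[rd] = self.regs.get(ra, 0) + imm
        self.pc += 1

    def drain(self):
        pass                           # nothing is ever in flight

    def snapshot(self):
        return (self.pc, tuple(sorted(self.regs.items())))


class BuggyPipelinedCore(RefCore):
    """toy overlapping core with a deliberate missing-forwarding bug."""

    def __init__(self):
        super().__init__()
        self.in_flight = None          # (rd, pre-write value) of previous insn

    def step(self, insn):
        _, rd, ra, imm = insn
        src = self.regs.get(ra, 0)
        if self.in_flight is not None and self.in_flight[0] == ra:
            src = self.in_flight[1]    # the bug: stale value is used
        old = self.regs.get(rd, 0)
        self.regs[rd] = src + imm
        self.in_flight = (rd, old)
        self.pc += 1

    def drain(self):
        self.in_flight = None          # pipeline empties at a breakpoint


def run_to(core, program, breakpoint_pc):
    """run until pc reaches breakpoint_pc, drain, return a State snapshot."""
    while core.pc < len(program) and core.pc < breakpoint_pc:
        core.step(program[core.pc])
    core.drain()
    return core.snapshot()


def mismatching_breakpoints(program, breakpoints):
    """breakpoints at which the two cores' snapshots differ."""
    ref, dut = RefCore(), BuggyPipelinedCore()
    return [bp for bp in breakpoints
            if run_to(ref, program, bp) != run_to(dut, program, bp)]


PROGRAM = [("add_imm", 1, 0, 5), ("add_imm", 2, 1, 1), ("add_imm", 3, 0, 7)]
single_step = mismatching_breakpoints(PROGRAM, [1, 2, 3])
grouped = mismatching_breakpoints(PROGRAM, [3])
```

with a breakpoint after every instruction the two models agree everywhere (single_step is empty), whereas running all three instructions as one group exposes the bug (grouped contains breakpoint 3).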

that's mostly it.  any questions?

