[libre-riscv-dev] [OpenPOWER-HDL-Cores] system call (sc) LEV "reserved field"
hugh at blemings.org
Mon Jul 27 10:07:57 BST 2020
Apologies for the top post but I can offer a historical perspective on
at least part of this I think.
Before I do though a disclaimer - these are my opinions and
recollections only, certainly not policy! :)
While it may change over time, the focus for OpenPOWER in the embedded
space with the Open ISA has been firmly in the mid to high end and
therefore mostly 64 bit space.
So to borrow some of the terminology/examples below, a SoC for a
Raspberry Pi style device based on OpenPOWER? Sure, definitely an area
of interest.
Getting down into smaller devices - really resource constrained and/or
native 32 bit only - "truly embedded" - that's an area well served by
RISC-V and not, I think, somewhere we're going to see focus for OpenPOWER.
Not to say it can't in the future, but I think there is a bias towards
working in areas the ecosystem as it currently stands knows well -
playing to current strengths. :)
That said I may still do an OpenPOWER based NTP enabled 7-segment
display bedside clock one day, but that will be born of stubbornness,
not because I actually need a 64 bit CPU to do it ;)
On 27/7/20 6:00 pm, Luke Kenneth Casson Leighton wrote:
> On Mon, Jul 27, 2020 at 5:30 AM Benjamin Herrenschmidt
> <benh at kernel.crashing.org> wrote:
>>> although i have to ask why, for Embedded, they did not just recompile
>>> the source code, customised for that enduser application.
>> Because the distinction between "embedded" and "unix" is very blurry
> benjamin: this is not good! specifications should *never* be blurry!
> i believe you may mean something different, here, which warrants
> investigation. i think i know why, given that the typical markets for
> embedded PowerISA cores are specialist areas (aerospace and so on).
> QorIQ for example is marketed at routers and definitely crosses over,
> yet is still termed "embedded".
>> and in many areas obsolete.
> ok let's look at the v3.1B document, page viii "Compliancy Subsets".
> nowhere in there - even in the scalar fixed-point section - am i
> seeing mention of what would be expected for "really resource
> constrained i.e. *truly* embedded" markets: being able to drop the
> (very expensive, gate-wise) logic decoding for catching illegal
> instructions, illegal SPR access, and so on.
> in SFS the drop in gate count from not needing an FPU is massive: an
> FPU typically dwarfs the size of the main core. once the FPU is
> dropped then, relatively speaking, further savings such as cutting the
> gates needed for illegal instructions become significant (a 5% saving,
> a 10% saving), and in mass-volume markets that's absolutely massive.
> let us therefore define "resource-constrained embedded markets" as
> "truly" embedded, rather than the blurry definition inspired by the
> meme that goes by the moniker "IoT", which now includes 8+ watt
> raspberry pi 4 devices that run so hot they need fan cooling.
> whilst it comes with the "burden" of needing to snapshot and maintain
> the full toolchain (which for customisation and custom extensions is a
> hard requirement anyway), one of the key areas in which RISC-V has
> been successful is the mass-volume *truly* embedded market.
> the canonical example is Western Digital, who up until RISC-V had
> their own "truly" embedded custom ISA which has been hyper-efficient
> for them, for use in SSDs, HDDs and USB Flash devices. the reason
> they wanted to go with RISC-V is to save on maintenance of their own
> custom toolchain (bear in mind that with a custom ISA you need not
> only a custom toolchain, you need a custom OS and custom applications
> as well! the cost to them of maintaining this would be enormous!)
> here, if the binary size is large, this cuts into the usage allocation
> of the on-board NAND and on-board RAM (because they don't have
> separate NAND from the customer-allocated area, or separate RAM for
> the caching of customers' data).
> if the NAND and RAM allocation are reduced, that has a very real
> detrimental impact on their sales and profitability, especially in
> such a highly competitive market!
> remember that for WD we are talking sales of billions of units, here.
> "a few gates" multiplied by a billion sales can make the difference
> between profit and loss.
> so they found when converting to RISC-V from their own internal ISA
> that binary size increased by (iirc) 20%. they absolutely had to do
> something about this and so set about analysing static instruction
> allocation and (i am projecting here as this was 2016) they would by
> now likely have used that research to create custom instruction
> modifications similar to the VLE Book, except targeted specifically
> at their use-case.
> none of these custom ISA modifications - which deliver the code-size
> reduction that matters to their profitability in a very real way -
> are, afaik, made public (i have not checked or heard any news so this
> could be wrong).
> this all built on the initial premise of *not* having to have the
> gates for illegal instruction catching etc.: they had to go "custom
> firmware, custom toolchain" anyway, and the RISC-V initiative provided
> them with a huge cost saving on the toolchain and much more.
> so they got to have their cake and eat it:
> * cost savings on toolchain, kernel, os and bootloader maintenance (by
> being able to ride off the back of RISC-V "official")
> * cost savings from the "Embedded RISC-V Platform" already having made
> decisions that significantly reduced gate count
> * cost savings from further reductions by adding custom instructions
> and customising the toolchain to match them.
> *this* is *truly* embedded.
>> One can compile a single image that is meant to run on a wide variety
>> of systems and even the "embedded" world wants that capability.
> if we are talking about the existing (core) OpenPOWER Foundation
> members subset embedded markets, served by their current product
> offerings: yes.
> if we are talking about the *full* (world-wide) definition of
> "embedded" which includes "truly" embedded: absolutely not.
> and i have to point out that if OpenPOWER's direction only takes into
> consideration the former, it is *guaranteed* that Power will never
> extend or see wide adoption into the latter. the cost-benefit
> analysis comes up so short that no mass-volume product manufacturer
> will consider it, and quite rightly so, as things stand.
> this then is the challenge for OpenPOWER if it wishes to have a wider
> reach: to adapt to beyond the needs of the current members, whilst
> also - very importantly - respecting the long-standing relationship,
> contribution and needs *of* those current members at the same time.
> it's a delicate balance to achieve.
>> because they end up running some kind of "upstream" OS image, or
>> because they don't want to maintain completely different SW images for
>> all their products, etc...
> indeed. and this is perfectly reasonable, for what we've colloquially
> termed the "blurry" embedded markets. the cost savings of not having
> to recompile or maintain custom packages etc. - these are enormous
> savings, absolutely worthwhile pursuing.
> unfortunately, if that expectation then propagates throughout the
> entire OpenPOWER community as a "hard expectation" (even to the SFS
> Compliancy subset), it *automatically and inherently* excludes any
> possibility for "truly" embedded vendors to consider using POWER,
> because they are *prohibited* by the Compliancy Requirements from
> dropping the gates that would make the product profitable!
>>> however this also actually illustrates precisely why i mentioned that
>>> for best results, a spec has to have different platform behaviour for
>>> Embedded as completely separate and distinct from UNIX.
>> This is not really true anymore.
> thanks to the "IoT" meme and so on, the word "embedded" in general
> circulation has become meaningless, yes. i am not using the term
> "Embedded" in that meaningless sense: i am using it in the original
> sense used to describe the 8 and 16 bit processor markets (when
> uprated to 32 and sometimes 64 bit).
> Embedded in the sense of Arduino, PICs, ATMEL ATSAM3 series, ST Micro
> STM32F series, and so on. ARM Cortex M0, M3 style and such.
> not the "take the latest 64 bit high performance 2.5 ghz quad-core 8+
> watt processor, slap it into an SBC form-factor and call it
> 'embedded'" definition of Embedded.
>> For example, do you consider your cell
>> phone or your TV "embedded" or "unix" ?
> UNIX, without a shadow of doubt. Android is a UNIX Platform. the
> Android kernel *is* the linux kernel, and, if end-users are prepared
> to put in some effort, all devices running Android - if there isn't
> Treacherous DRM built-in to the boot sequence - can have their OS
> entirely replaced by any GNU/Linux distro that's compatible with the
> hardware.
> so no - those are *definitely* UNIX platform devices.
>>> this on the basis that Embedded Markets are typically one-off
>>> deployment, where the toolchain and all build source is "snapshotted"
>>> at product release time, used internally and the source code and
>>> toolchain almost never see the light of day. mainline upgrades are
>>> exceptionally rare.
>> There are quite a few counter examples even in the "embedded" market.
>> Especially when it comes to storage appliances. Some of these things
>> can even run CentOS.
> however, again: are we referring to "general morphed (blurry)"
> definition of Embedded, or "truly" embedded?
>>> so in that context, i am slightly confused to hear that for Freescale
>>> *Embedded* processors that there is even a binary incompatibility
>>> problem with lwsync *at all*.
>>> if you have time i'd love to hear more, what's the story there?
>> I forgot the details, but a specific core variant from FSL screwed up
>> the decode table and would trap on lwsync instead of ignoring the bit.
> and, given the expectation to have binary interoperability even across
> "general" (aka colloquially blurry) Embedded, that would matter (a
> lot).
>>> indeed... in this very special case that we have established, by it
>>> effectively being a "hint", falls completely outside of what i am
>>> concerned about
>> In *most* cases trap + emulation leads to unusable performance though.
> this is another sub-topic entirely. the question one needs to ask is:
> *why* is performance unusable? *why* is the cost so high?
> RISC-V (again: there are many good bits to RISC-V) provides
> hardware-level partial-decode of instruction sub-fields. RA, RB, RS,
> RT (or their RISC-V equivalents) and other sub-fields are all
> available via individual SPRs (CSRs in RISC-V terminology) in the
> "illegal instruction" trap, saving huge numbers of mask/shift
> operations; that makes trap-and-emulate much faster and requires far
> less code in the handler.
> without such hardware-level assistance, yes, i can see that trap and
> emulate would be considered completely unacceptable.
> we've diverged quite a lot from the original topic :) but also i
> believe provided some insights into the very different needs of
> different markets, which pull in completely polar opposite directions
> on the same ISA.
> the summary is: the Power Spec is a valued and valuable reflection of
> the needs of its community. the question becomes: does that community
> want to see OpenPOWER extend further? if so, it needs to learn from
> the analysis done by the RISC-V founders, who analysed *30 years* of
> RISC processor architectures and amalgamated the best bits that they
> thought would make a modern ISA *and eco-system* supporting a hugely
> diverse range of markets, from tiny ("truly") embedded all the way up
> to supercomputers.