[Libre-soc-dev] [OpenPOWER-HDL-Cores] microwatt grows up LCA2021
Luke Kenneth Casson Leighton
lkcl at lkcl.net
Tue Feb 9 10:48:38 GMT 2021
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68
On Tue, Feb 9, 2021 at 3:28 AM Paul Mackerras <paulus at ozlabs.org> wrote:
>
> On Mon, Feb 08, 2021 at 10:42:16AM +0000, Luke Kenneth Casson Leighton wrote:
>
> > CSRC NIST.gov's STS. except do not bother with the Lempel-Ziv test
> > because it is flawed. i explain why in the betrusted-soc link below.
>
> I found sts-2_1_2.zip on the NIST web site, along with the paper that
> describes it. There were some gcc warnings when compiling it, and the
> "Universal Statistical" test always produces p=0 no matter what set of
> numbers I give it, so I suspect gcc has exercised its prerogative to
> break your program if it has any non-compliance with the C standard.
yeah, as i mentioned in the previous message, i was running this on
32-bit systems, probably using gcc 3.
> When you were using STS, did you have any framework to run it
> automatically and generate summaries, or did you just run it manually
> and look at the finalAnalysisReport.txt file?
i just looked at the finalAnalysisReport.txt
what i did do however was "beowulf cluster" the thing. i think i
split it into two programs:
1) one that ran the individual tests (producing the output which you
see in some subdirectories)
2) one that then analysed that output
this way, if i happened to run 2 batches of 1,000 100k-bit runs, i
could merge them together (by hand) and re-run stage (2) over the
combined 2,000 runs to get a statistically stronger analysis. a rough
sketch of that merge step is below.
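something along these lines (python sketch, untested; the
results.txt-per-test layout is an assumption based on how sts-2.1.2
writes its experiments/AlgorithmTesting/<TestName>/ directories, so
check yours):

import os

def load_pvalues(batch_dir, test_name):
    # one p-value per line in each test's results.txt (assumption)
    path = os.path.join(batch_dir, test_name, "results.txt")
    with open(path) as f:
        return [float(line) for line in f if line.strip()]

def merge_batches(batch_dirs, test_name):
    # combine the stage (1) output of several independent batches
    merged = []
    for d in batch_dirs:
        merged.extend(load_pvalues(d, test_name))
    return merged

# stage (2) then runs over the combined 2,000 p-values
pvalues = merge_batches(["batch1", "batch2"], "Frequency")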
> > btw one very important thing, it may be worthwhile to coordinate with
> > Bunnie Huang regarding the FAILURE runs here:
> >
> > https://github.com/betrusted-io/betrusted-wiki/wiki/TRNG-characterization
> >
> > the marsaglia tests in particular have me concerned, they fail twice.
>
> I'm running dieharder again now, and I got:
>
> marsaglia_tsang_gcd| 0| 10000000| 100|0.09468860| PASSED
> marsaglia_tsang_gcd| 0| 10000000| 100|0.72404443| PASSED
>
> That's just one data point of course, but there doesn't seem to be an
> immediate indication of a problem here.
exactly, and that's the problem: dieharder doesn't do the same type of
test-of-test-of-tests that STS does.
remember: statistically speaking, failures (outliers) *are* going to
occur even with a perfect random source - at a significance level of
0.01, roughly 10 out of 1,000 runs are *expected* to fail. it's the
*number and distribution* of those failures/outliers that diehard and
dieharder *are not telling you about*, but STS does.
basically that marsaglia_tsang_gcd test needs to be:
* ported into STS
* run 1,000 times, independently
* have the same test-of-test-of-tests histogram analysis run on it
[that diehard/er is NOT doing] - see the sketch just below
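that histogram analysis, for reference, is the one in NIST SP 800-22
section 4.2: bin the 1,000 p-values into ten equal bins, chi-squared
against uniform, take the uniformity P-value as igamc(9/2, chi2/2),
and separately check the proportion of passes. a python sketch
(untested, but the maths is straight from the spec):

from scipy.special import gammaincc  # regularised igamc

def uniformity_pvalue(pvalues, bins=10):
    # chi-squared test that the p-values are uniform on [0,1)
    counts = [0] * bins
    for p in pvalues:
        counts[min(int(p * bins), bins - 1)] += 1
    expected = len(pvalues) / bins
    chi2 = sum((c - expected) ** 2 / expected for c in counts)
    # NIST declares non-uniformity if this comes out below 0.0001
    return gammaincc((bins - 1) / 2.0, chi2 / 2.0)

def proportion_ok(pvalues, alpha=0.01):
    # the pass-rate must sit inside NIST's three-sigma band
    m = len(pvalues)
    passed = sum(1 for p in pvalues if p >= alpha)
    phat = 1.0 - alpha
    delta = 3.0 * (phat * (1.0 - phat) / m) ** 0.5
    return (phat - delta) <= passed / m <= (phat + delta)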
*only then* - after running it not once, not twice, not three times,
but a THOUSAND times, and performing a RIGOROUS statistical analysis
of the results - can you be confident that it's okay.
right now, saying "doesn't seem to be an immediate indication of a
problem" fundamentally misses this abbbbsolutely critical point: it is
the *test of the tests* that you need to pay attention to.
to get the same type of analysis that is missing from diehard/er, what
you will have to do is:
* run diehard/er MANUALLY 1,000 times
* note the p-values
* go look up some mathematical papers on statistical analysis
* MANUALLY write your own histogram testing program analysing the
1,000 p-values (a sketch of such a harness is below)
*only* then will you have achieved confidence testing on par with what
STS provides.
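here is roughly what that loop looks like (untested python; the
dieharder flags and the marsaglia_tsang_gcd test number are from
memory, so double-check them against "dieharder -l" and
"dieharder -g -1" before trusting any of it):

import subprocess

def dieharder_pvalues(datafile, testnum=17):
    # run dieharder once on a file of raw random bytes and pull the
    # p-value field out of its '|'-separated report lines.
    # testnum=17 and -g 201 (file_input_raw) are assumptions.
    out = subprocess.run(
        ["dieharder", "-d", str(testnum), "-g", "201", "-f", datafile],
        capture_output=True, text=True, check=True).stdout
    pvals = []
    for line in out.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) == 6 and fields[0] == "marsaglia_tsang_gcd":
            pvals.append(float(fields[4]))
    return pvals

# 1,000 independent runs, each on FRESH trng output, then feed the
# collected p-values into uniformity_pvalue() / proportion_ok() above
all_p = []
for i in range(1000):
    all_p.extend(dieharder_pvalues("trng_run_%04d.bin" % i))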
l.