Study on upper limit of sample sizes for a two-level test in NIST SP800-22

2019 
NIST SP800-22 is one of the most widely used statistical testing tools for pseudorandom number generators (PRNGs). The tool consists of 15 tests (one-level tests) and two additional tests (two-level tests). Each one-level test provides one or more $p$-values. The two-level tests measure the uniformity of the $p$-values obtained from a fixed one-level test. One of the two-level tests categorizes the $p$-values into ten intervals of equal length and applies a chi-squared goodness-of-fit test. This two-level test is often more powerful than the one-level tests, but it sometimes rejects even good PRNGs when the sample size at the second level is too large, because it then detects the approximation errors in the computation of the $p$-values. In this paper, we propose a practical upper limit on the sample size in this two-level test for each of six tests that appear in SP800-22. These upper limits are derived from the chi-squared discrepancy between the distribution of the approximated $p$-values and the uniform distribution $U(0, 1)$. We also compute a "risky" sample size at the second level for each one-level test. Experiments show that the two-level test with the proposed upper limit gives appropriate results, while the risky size often leads to rejection of even good PRNGs. We also propose another improvement: using the exact probabilities of the ten categories when computing the goodness-of-fit statistic in the two-level test. This allows the sample size at the second level to be increased, and would make the test more sensitive than NIST's recommended usage.
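The two-level test described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `chi2_sf` and `two_level_test` and the optional `probs` argument are hypothetical, introduced only for illustration. The sketch bins the $p$-values into ten equal intervals, computes the chi-squared statistic against the expected counts, and converts it to a second-level $p$-value with nine degrees of freedom. Passing `probs` stands in for the proposed use of exact category probabilities; the values themselves would have to come from the paper and are not reproduced here.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable (odd df only).

    Uses the recurrence Q(x; k+2) = Q(x; k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1),
    starting from Q(x; 1) = erfc(sqrt(x/2)).
    """
    if df < 1 or df % 2 != 1:
        raise ValueError("this sketch only handles odd degrees of freedom")
    if x <= 0:
        return 1.0
    half = x / 2.0
    q = math.erfc(math.sqrt(half))
    for k in range(1, df, 2):
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
    return q


def two_level_test(p_values, probs=None):
    """Chi-squared uniformity test on first-level p-values using ten bins.

    probs: assumed per-bin probabilities (hypothetical parameter).
    Defaults to 0.1 for every bin, i.e. the ideal U(0, 1) case.
    Returns (chi-squared statistic, second-level p-value).
    """
    bins = 10
    if probs is None:
        probs = [1.0 / bins] * bins
    if len(probs) != bins:
        raise ValueError("probs must contain ten probabilities")
    counts = [0] * bins
    for p in p_values:
        # A p-value of exactly 1.0 is placed in the last interval.
        counts[min(int(p * bins), bins - 1)] += 1
    n = len(p_values)
    stat = sum((c - n * q) ** 2 / (n * q) for c, q in zip(counts, probs))
    return stat, chi2_sf(stat, bins - 1)
```

For example, `two_level_test([(i + 0.5) / 100 for i in range(100)])` places ten values in each interval and so yields a statistic of zero, whereas a sample concentrated in one interval yields a large statistic and a second-level $p$-value close to zero.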