Unbiased estimation of standard deviation

In statistics, and in particular statistical theory, unbiased estimation of a standard deviation is the calculation from a statistical sample of an estimated value of the standard deviation (a measure of statistical dispersion) of a population of values, in such a way that the expected value of the calculation equals the true value. Except in some important situations, outlined later, the task has little relevance to applications of statistics since its need is avoided by standard procedures, such as the use of significance tests and confidence intervals, or by using Bayesian analysis.

However, for statistical theory, it provides an exemplar problem in the context of estimation theory which is both simple to state and for which results cannot be obtained in closed form. It also provides an example where imposing the requirement for unbiased estimation might be seen as just adding inconvenience, with no real benefit.

In statistics, the standard deviation of a population of numbers is often estimated from a random sample drawn from the population. This is the sample standard deviation, which is defined by

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \overline{x}\right)^2},$$

where $\{x_1, x_2, \ldots, x_n\}$ is the sample (formally, realizations from a random variable $X$) and $\overline{x}$ is the sample mean.

One way of seeing that this is a biased estimator of the standard deviation of the population is to start from the result that $s^2$ is an unbiased estimator for the variance $\sigma^2$ of the underlying population, if that variance exists and the sample values are drawn independently with replacement. The square root is a nonlinear function, and only linear functions commute with taking the expectation. Since the square root is a strictly concave function, it follows from Jensen's inequality that the square root of the sample variance is an underestimate.

The use of $n-1$ instead of $n$ in the formula for the sample variance is known as Bessel's correction, which corrects the bias in the estimation of the population variance, and some, but not all, of the bias in the estimation of the sample standard deviation.

It is not possible to find an estimate of the standard deviation which is unbiased for all population distributions, as the bias depends on the particular distribution. Much of the following relates to estimation assuming a normal distribution.
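The downward bias of $s$ is easy to see numerically. The sketch below is a minimal illustration, assuming NumPy is available: it draws many normal samples of a small fixed size, averages the sample standard deviations, and compares the result with the true $\sigma$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma, trials = 5, 1.0, 200_000

# Draw `trials` independent samples of size n from N(0, sigma^2) and
# compute each sample standard deviation; ddof=1 applies Bessel's
# correction (division by n - 1).
samples = rng.normal(0.0, sigma, size=(trials, n))
s = samples.std(axis=1, ddof=1)

print(f"true sigma:   {sigma:.4f}")
print(f"average of s: {s.mean():.4f}")  # noticeably below sigma for small n
```

For samples of size 5 the average of $s$ comes out near $0.94\,\sigma$, matching the correction factor derived next.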
When the random variable is normally distributed, a minor correction exists to eliminate the bias. To derive the correction, note that for normally distributed $X$, Cochran's theorem implies that $(n-1)s^2/\sigma^2$ has a chi square distribution with $n-1$ degrees of freedom, and thus its square root, $\sqrt{n-1}\,s/\sigma$, has a chi distribution with $n-1$ degrees of freedom. Consequently, calculating the expectation of this last expression and rearranging constants gives

$$\operatorname{E}[s] = c_4(n)\,\sigma,$$

where the correction factor $c_4(n)$ is the mean $\mu_1(n-1)$ of the chi distribution with $n-1$ degrees of freedom divided by the scale factor $\sqrt{n-1}$, that is, $\mu_1(n-1)/\sqrt{n-1}$. This depends on the sample size $n$, and is given as follows:

$$c_4(n) = \sqrt{\frac{2}{n-1}}\;\frac{\Gamma\!\left(\tfrac{n}{2}\right)}{\Gamma\!\left(\tfrac{n-1}{2}\right)},$$

where $\Gamma(\cdot)$ is the gamma function. An unbiased estimator of $\sigma$ can therefore be obtained by dividing $s$ by $c_4(n)$.
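As a numerical check on this formula, the following sketch (again assuming NumPy; the helper name `c4` is ours, chosen for illustration) computes $c_4(n)$ from the gamma function and verifies on simulated normal data that $s/c_4(n)$ is unbiased for $\sigma$.

```python
import math

import numpy as np

def c4(n: int) -> float:
    """Correction factor c4(n) = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2)."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

rng = np.random.default_rng(1)
n, sigma, trials = 5, 1.0, 200_000

samples = rng.normal(0.0, sigma, size=(trials, n))
s = samples.std(axis=1, ddof=1)  # biased: E[s] = c4(n) * sigma

print(f"c4({n})             = {c4(n):.4f}")              # ~0.9400 for n = 5
print(f"average of s       = {s.mean():.4f}")            # close to c4(n) * sigma
print(f"average of s/c4(n) = {(s / c4(n)).mean():.4f}")  # close to sigma
```

Note that $c_4(n) \to 1$ as $n$ grows, so the correction matters mainly for small samples.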

[ "Efficient estimator", "Bias of an estimator", "U-statistic" ]
Parent Topic
Child Topic
    No Parent Topic