Central limit theorem

In probability theory, the central limit theorem (CLT) establishes that, in some situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution (informally a 'bell curve') even if the original variables themselves are not normally distributed. The theorem is a key concept in probability theory because it implies that probabilistic and statistical methods that work for normal distributions can be applied to many problems involving other types of distributions.

The principal statements of the theorem are the following.

Lindeberg–Lévy CLT. Suppose {X1, X2, …} is a sequence of i.i.d. random variables with E[Xi] = μ and Var[Xi] = σ² < ∞, and let Sn = (X1 + ⋯ + Xn)/n denote the sample average. Then as n approaches infinity, the random variables √n(Sn − μ) converge in distribution to a normal N(0, σ²):

$$\sqrt{n}\,(S_n - \mu) \ \xrightarrow{d}\ N(0, \sigma^2).$$

(A numerical sketch of this statement follows the theorems below.)

Lyapunov CLT. Suppose {X1, X2, …} is a sequence of independent random variables, each with finite expected value μi and variance σi². Define

$$s_n^2 = \sum_{i=1}^{n} \sigma_i^2.$$

If for some δ > 0 Lyapunov's condition

$$\lim_{n\to\infty} \frac{1}{s_n^{2+\delta}} \sum_{i=1}^{n} \mathbb{E}\bigl[\,|X_i - \mu_i|^{2+\delta}\,\bigr] = 0$$

is satisfied, then

$$\frac{1}{s_n}\sum_{i=1}^{n} (X_i - \mu_i) \ \xrightarrow{d}\ N(0, 1) \qquad \text{as } n \to \infty.$$

Theorem (multidimensional CLT). Let X1, …, Xn be independent ℝ^d-valued random vectors, each having mean zero. Write S = X1 + ⋯ + Xn and assume Σ = Cov[S] is invertible. Let Z ∼ N(0, Σ) be a d-dimensional Gaussian with the same mean and covariance matrix as S. Then for all convex sets U ⊆ ℝ^d,

$$\bigl|\,\mathbb{P}[S \in U] - \mathbb{P}[Z \in U]\,\bigr| \le C\, d^{1/4}\, \gamma,$$

where C is a universal constant and

$$\gamma = \sum_{i=1}^{n} \mathbb{E}\Bigl[\,\bigl\|\Sigma^{-1/2} X_i\bigr\|_2^{3}\,\Bigr].$$

Theorem (CLT under weak dependence). Suppose that X1, X2, … is stationary and α-mixing with αn = O(n^{−5}), and that E[Xn] = 0 and $\mathbb{E}[X_n^{12}] < \infty$. Denote Sn = X1 + ⋯ + Xn. Then the limit

$$\sigma^2 = \lim_{n\to\infty} \frac{\mathbb{E}[S_n^2]}{n}$$

exists, and if σ ≠ 0 then Sn/(σ√n) converges in distribution to N(0, 1).

Theorem (martingale CLT). Let a martingale Mn satisfy the two conditions

$$\frac{1}{n}\sum_{k=1}^{n} \mathbb{E}\bigl[(M_k - M_{k-1})^2 \mid M_1, \dots, M_{k-1}\bigr] \ \to\ 1 \quad \text{in probability as } n \to \infty$$

and, for every ε > 0,

$$\frac{1}{n}\sum_{k=1}^{n} \mathbb{E}\bigl[(M_k - M_{k-1})^2\, \mathbf{1}\{|M_k - M_{k-1}| > \varepsilon\sqrt{n}\}\bigr] \ \to\ 0 \quad \text{as } n \to \infty.$$

Then Mn/√n converges in distribution to N(0, 1).

Lemma (weighted stationary sums). Suppose X1, X2, … is a sequence of real-valued and strictly stationary random variables with E[Xi] = 0 for all i, let g : [0, 1] → ℝ, and set

$$S_n = \sum_{i=1}^{n} g\!\left(\frac{i}{n}\right) X_i.$$

Under suitable additional assumptions on g and on the dependence structure of the sequence, the properly normalized sums Sn again converge in distribution to a normal law.

Theorem (log-concave densities). There exists a sequence εn ↓ 0 for which the following holds. Let n ≥ 1, and let random variables X1, …, Xn have a log-concave joint density f such that f(x1, …, xn) = f(|x1|, …, |xn|) for all x1, …, xn, and E[Xk²] = 1 for all k = 1, …, n. Then the distribution of

$$\frac{X_1 + \cdots + X_n}{\sqrt{n}}$$

is εn-close to N(0, 1) in the total variation distance.

Theorem. Let X1, …, Xn satisfy the assumptions of the previous theorem. Then the same εn-closeness to N(0, 1) in the total variation distance holds for typical weighted sums c1X1 + ⋯ + cnXn with c1² + ⋯ + cn² = 1 (for most coefficient vectors with respect to the uniform measure on the sphere).

Theorem (Salem–Zygmund). Let U be a random variable distributed uniformly on (0, 2π), and Xk = rk cos(nkU + ak), where the frequencies nk satisfy the lacunarity condition nk+1 ≥ q·nk for some q > 1 and all k, the amplitudes rk satisfy

$$r_1^2 + r_2^2 + \cdots = \infty \qquad \text{and} \qquad \frac{r_k^2}{r_1^2 + \cdots + r_k^2} \to 0,$$

and 0 ≤ ak < 2π. Then

$$\frac{X_1 + \cdots + X_k}{\sqrt{\tfrac{1}{2}\bigl(r_1^2 + \cdots + r_k^2\bigr)}}$$

converges in distribution to N(0, 1) as k → ∞.

Theorem (Gaussian polytopes). Let A1, …, An be independent random points on the plane ℝ², each having the two-dimensional standard normal distribution. Let Kn be the convex hull of these points, and Xn the area of Kn. Then

$$\frac{X_n - \mathbb{E}[X_n]}{\sqrt{\operatorname{Var}(X_n)}}$$

converges in distribution to N(0, 1) as n → ∞.

Theorem (random orthogonal matrices). Let M be a random orthogonal n × n matrix distributed uniformly, and A a fixed n × n matrix such that tr(AA*) = n, and let X = tr(AM). Then the distribution of X is close to N(0, 1) in the total variation metric up to 2√3/(n − 1).

Theorem (subsequences). Let random variables X1, X2, … ∈ L²(Ω) be such that Xn → 0 weakly in L²(Ω) and Xn² → 1 weakly in L¹(Ω). Then there exist integers n1 < n2 < ⋯ such that

$$\frac{X_{n_1} + \cdots + X_{n_k}}{\sqrt{k}}$$

converges in distribution to N(0, 1) as k → ∞.
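The Lindeberg–Lévy statement is easy to check by simulation. The following is a minimal illustrative sketch in Python (the choice of distribution, the sample sizes, and the variable names are ours, not from any reference implementation): it draws many samples from an exponential distribution, which is far from normal, standardizes each sample mean, and compares the result against N(0, 1).

```python
import numpy as np

rng = np.random.default_rng(0)

n = 1_000         # size of each sample
trials = 10_000   # number of independent sample means
mu = sigma = 1.0  # mean and standard deviation of Exponential(1)

# Each row is one sample of size n from a decidedly non-normal distribution.
samples = rng.exponential(scale=1.0, size=(trials, n))

# Standardize each sample mean as in the Lindeberg-Levy statement;
# since sigma = 1 here, N(0, sigma^2) is just N(0, 1).
z = np.sqrt(n) * (samples.mean(axis=1) - mu) / sigma

print(f"mean of z      {z.mean():+.3f}   (N(0,1) value: 0)")
print(f"std of z        {z.std():.3f}   (N(0,1) value: 1)")
print(f"P(z <= 1.96)    {np.mean(z <= 1.96):.3f}   (N(0,1) value: 0.975)")
```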
The central limit theorem has an interesting history. The first version of the theorem was postulated by the French-born mathematician Abraham de Moivre who, in a remarkable article published in 1733, used the normal distribution to approximate the distribution of the number of heads resulting from many tosses of a fair coin. This finding was far ahead of its time, and was nearly forgotten until the famous French mathematician Pierre-Simon Laplace rescued it from obscurity in his monumental work Théorie analytique des probabilités, published in 1812. Laplace expanded de Moivre's finding by approximating the binomial distribution with the normal distribution. But as with de Moivre, Laplace's finding received little attention in his own time. It was not until the nineteenth century was at an end that the importance of the central limit theorem was discerned, when, in 1901, the Russian mathematician Aleksandr Lyapunov defined it in general terms and proved precisely how it worked mathematically. Nowadays, the central limit theorem is considered to be the unofficial sovereign of probability theory.

Sir Francis Galton memorably described the theorem's ubiquity: "I know of scarcely anything so apt to impress the imagination as the wonderful form of cosmic order expressed by the 'Law of Frequency of Error'. The law would have been personified by the Greeks and deified, if they had known of it. It reigns with serenity and in complete self-effacement, amidst the wildest confusion. The huger the mob, and the greater the apparent anarchy, the more perfect is its sway. It is the supreme law of Unreason. Whenever a large sample of chaotic elements are taken in hand and marshalled in the order of their magnitude, an unsuspected and most beautiful form of regularity proves to have been latent all along."

George Pólya, in the 1920 paper in which he coined the term "central limit theorem", wrote: "The occurrence of the Gaussian probability density $e^{-x^2}$ in repeated experiments, in errors of measurements, which result in the combination of very many and very small elementary errors, in diffusion processes etc., can be explained, as is well-known, by the very same limit theorem, which plays a central role in the calculus of probability. The actual discoverer of this limit theorem is to be named Laplace; it is likely that its rigorous proof was first given by Tschebyscheff and its sharpest formulation can be found, as far as I am aware of, in an article by Liapounoff. …"

To see what the theorem asserts in practice, suppose that a sample is obtained containing a large number of observations, each observation being randomly generated in a way that does not depend on the values of the other observations, and that the arithmetic mean of the observed values is computed. If this procedure is performed many times, the central limit theorem says that the distribution of the average will be closely approximated by a normal distribution. A simple example is coin flipping: if one flips a fair coin many times, the distribution of the number of heads in a series of flips approaches a normal curve, with mean equal to half the total number of flips in each series. (In the limit of an infinite number of flips, it will equal a normal curve.)
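The coin-flipping example is exactly the setting of the de Moivre–Laplace theorem discussed below, and it can be verified directly. The following is a small illustrative sketch (pure standard-library Python; the script and its names are ours): it compares the exact Binomial(n, 1/2) probabilities for n = 100 flips against the approximating normal density with mean n/2 and standard deviation √(n/4).

```python
import math

n, p = 100, 0.5                  # number of flips, probability of heads
mean = n * p                     # n/2
sd = math.sqrt(n * p * (1 - p))  # sqrt(n/4) for a fair coin

def binom_pmf(k: int) -> float:
    """Exact probability of k heads in n flips."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def normal_pdf(x: float) -> float:
    """Density of N(mean, sd^2)."""
    return math.exp(-((x - mean) ** 2) / (2 * sd * sd)) / (sd * math.sqrt(2 * math.pi))

# Near the mean, the normal density tracks the binomial pmf closely.
for k in (40, 45, 50, 55, 60):
    print(f"k={k:3d}   binomial={binom_pmf(k):.6f}   normal={normal_pdf(k):.6f}")
```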
The central limit theorem has a number of variants. In its common form, the random variables must be identically distributed. In variants, convergence of the mean to the normal distribution also occurs for non-identical distributions or for non-independent observations, provided that they comply with certain conditions. The earliest version of the theorem, that the normal distribution may be used as an approximation to the binomial distribution, is now known as the de Moivre–Laplace theorem.

In more general usage, a central limit theorem is any of a set of weak-convergence theorems in probability theory. They all express the fact that a sum of many independent and identically distributed (i.i.d.) random variables, or alternatively, random variables with specific types of dependence, will tend to be distributed according to one of a small set of attractor distributions. When the variance of the i.i.d. variables is finite, the attractor distribution is the normal distribution. In contrast, the sum of a number of i.i.d. random variables with power-law tail distributions decreasing as |x|^(−α−1), where 0 < α < 2 (and therefore having infinite variance), will tend to an alpha-stable distribution with stability parameter (or index of stability) α as the number of variables grows; a numerical sketch of this heavy-tailed regime closes the article.

Let {X1, …, Xn} be a random sample of size n, that is, a sequence of independent and identically distributed (i.i.d.) random variables drawn from a distribution with expected value μ and finite variance σ², and let Sn = (X1 + ⋯ + Xn)/n be the sample average. By the law of large numbers, the sample averages converge in probability and almost surely to the expected value μ as n → ∞. The classical central limit theorem describes the size and the distributional form of the stochastic fluctuations around the deterministic number μ during this convergence. More precisely, it states that as n gets larger, the distribution of the difference between the sample average Sn and its limit μ, when multiplied by the factor √n (that is, √n(Sn − μ)), approximates the normal distribution with mean 0 and variance σ². For large enough n, the distribution of Sn itself is close to the normal distribution with mean μ and variance σ²/n. The usefulness of the theorem is that the distribution of √n(Sn − μ) approaches normality regardless of the shape of the distribution of the individual Xi. Formally, the theorem states that

$$\sqrt{n}\,(S_n - \mu) \ \xrightarrow{d}\ N(0, \sigma^2) \qquad \text{as } n \to \infty.$$

In the case σ > 0, convergence in distribution means that the cumulative distribution functions of √n(Sn − μ) converge pointwise to the cdf of the N(0, σ²) distribution: for every real number z,

$$\lim_{n\to\infty} \mathbb{P}\bigl[\sqrt{n}\,(S_n - \mu) \le z\bigr] = \Phi\!\left(\frac{z}{\sigma}\right),$$

where Φ(z) is the standard normal cdf evaluated at z. Note that the convergence is uniform in z, in the sense that

$$\lim_{n\to\infty}\, \sup_{z\in\mathbb{R}}\, \Bigl|\,\mathbb{P}\bigl[\sqrt{n}\,(S_n - \mu) \le z\bigr] - \Phi\!\left(\frac{z}{\sigma}\right)\Bigr| = 0.$$
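The uniform convergence of the cdfs can be watched numerically: the one-sample Kolmogorov–Smirnov statistic is exactly the empirical version of the sup-distance above. The following sketch (illustrative; the underlying distribution and the sample sizes are our choices) estimates that sup-distance for standardized means of uniform random variables and shows it shrinking as n grows.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Uniform(0, 1): mu = 1/2, sigma^2 = 1/12.
mu, sigma = 0.5, np.sqrt(1 / 12)
trials = 20_000

for n in (2, 5, 20, 100):
    means = rng.uniform(size=(trials, n)).mean(axis=1)
    z = np.sqrt(n) * (means - mu) / sigma
    # One-sample KS statistic: sup_z |empirical cdf of z - Phi(z)|.
    d = stats.kstest(z, "norm").statistic
    print(f"n={n:4d}   sup-distance ~ {d:.4f}")
# Note: with 20,000 trials the estimate bottoms out around 1/sqrt(trials),
# so for large n the printed value reflects Monte Carlo noise, not the cdfs.
```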

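Finally, as noted in the paragraph on variants, sums of i.i.d. variables with power-law tails of index 0 < α < 2 escape the CLT: no √n-normalization makes them normal, and the correct n^(1/α)-normalization leads to an α-stable limit instead. The following sketch (illustrative; the Pareto model and the tail index α = 1.5 are our choices) contrasts the two normalizations.

```python
import numpy as np

rng = np.random.default_rng(2)

alpha = 1.5              # tail index: P(X > x) ~ x^(-alpha), so variance is infinite
m = alpha / (alpha - 1)  # mean of a Pareto(alpha) variable supported on [1, inf)
trials = 20_000

for n in (10, 100, 1000):
    x = rng.pareto(alpha, size=(trials, n)) + 1.0  # classical Pareto on [1, inf)
    s = x.sum(axis=1)
    clt = (s - n * m) / np.sqrt(n)           # CLT scaling: spread keeps growing
    stable = (s - n * m) / n ** (1 / alpha)  # stable scaling: distribution settles
    print(f"n={n:5d}   std under sqrt(n) scaling {clt.std():9.2f}   "
          f"99% quantile under n^(1/alpha) scaling {np.quantile(stable, 0.99):7.2f}")
# The sqrt(n)-scaled sums do not converge (heavy tails keep inflating the spread,
# and the sample std is itself noisy), while the n^(1/alpha)-scaled sums approach
# an alpha-stable law with stability index alpha.
```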