
Fisher information

In mathematical statistics, the Fisher information (sometimes simply called information) is a way of measuring the amount of information that an observable random variable X carries about an unknown parameter θ of a distribution that models X. Formally, it is the variance of the score, or the expected value of the observed information. In Bayesian statistics, the asymptotic distribution of the posterior mode depends on the Fisher information and not on the prior (according to the Bernstein–von Mises theorem, which was anticipated by Laplace for exponential families). The role of the Fisher information in the asymptotic theory of maximum-likelihood estimation was emphasized by the statistician Ronald Fisher (following some initial results by Francis Ysidro Edgeworth). The Fisher information is also used in the calculation of the Jeffreys prior, which is used in Bayesian statistics.

One advantage Kullback–Leibler information has over Fisher information is that it is not affected by changes in parameterization. Another advantage is that Kullback–Leibler information can be used even if the distributions under consideration are not all members of a parametric family.

The Fisher information matrix is used to calculate the covariance matrices associated with maximum-likelihood estimates. It can also be used in the formulation of test statistics, such as the Wald test. Statistical systems of a scientific nature (physical, biological, etc.) whose likelihood functions obey shift invariance have been shown to obey maximum Fisher information; the level of the maximum depends upon the nature of the system constraints.

The Fisher information is a way of measuring the amount of information that an observable random variable X carries about an unknown parameter θ upon which the probability of X depends. Let f(X; θ) be the probability density function (or probability mass function) for X conditional on the value of θ. It describes the probability of observing a given outcome of X, given a known value of θ. If f is sharply peaked with respect to changes in θ, it is easy to identify the "correct" value of θ from the data; equivalently, the data X provide a lot of information about the parameter θ. If the likelihood f is flat and spread out, then it would take many samples of X to estimate the actual "true" value of θ that would be obtained using the entire population being sampled. This suggests studying some kind of variance with respect to θ. Formally, the partial derivative with respect to θ of the natural logarithm of the likelihood function is called the "score".
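Written out explicitly (standard notation, added here for reference rather than quoted from the source), the score of a single observation X is

s(\theta; X) = \frac{\partial}{\partial\theta}\,\log f(X;\theta),

and the Fisher information defined below is the variance of this quantity when X is drawn from f(·; θ).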
Under certain regularity conditions, if θ is the true parameter (i.e. X is actually distributed as f(X; θ)), it can be shown that the expected value (the first moment) of the score is 0:

\operatorname{E}\!\left[\left.\frac{\partial}{\partial\theta}\log f(X;\theta)\,\right|\,\theta\right] = 0.

The variance of the score is defined to be the Fisher information:

\mathcal{I}(\theta) = \operatorname{E}\!\left[\left.\left(\frac{\partial}{\partial\theta}\log f(X;\theta)\right)^{2}\,\right|\,\theta\right].

Note that 0 \leq \mathcal{I}(\theta). A random variable carrying high Fisher information implies that the absolute value of the score is often high. The Fisher information is not a function of a particular observation, as the random variable X has been averaged out. If log f(x; θ) is twice differentiable with respect to θ, and under certain regularity conditions, then the Fisher information may also be written as

\mathcal{I}(\theta) = -\operatorname{E}\!\left[\left.\frac{\partial^{2}}{\partial\theta^{2}}\log f(X;\theta)\,\right|\,\theta\right].
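As a minimal illustrative sketch (not part of the original article): the following Python snippet uses Monte Carlo simulation on an assumed Bernoulli(θ) model, for which the Fisher information is known in closed form to be I(θ) = 1/(θ(1 − θ)), to check numerically that the mean of the score is near 0 and that its variance matches that closed form. The model choice and the helper name bernoulli_score are assumptions made for this example.

import numpy as np

# Score = d/dtheta log f(x; theta) for the Bernoulli density
# f(x; theta) = theta^x * (1 - theta)^(1 - x), x in {0, 1}.
def bernoulli_score(x, theta):
    return x / theta - (1 - x) / (1 - theta)

rng = np.random.default_rng(0)
theta = 0.3
x = rng.binomial(n=1, p=theta, size=1_000_000)  # samples drawn at the true theta

scores = bernoulli_score(x, theta)

print("mean of score (should be near 0):   ", scores.mean())
print("variance of score (empirical I):    ", scores.var())
print("closed-form Fisher information:     ", 1.0 / (theta * (1 - theta)))

With a large sample the empirical variance of the score should agree with 1/(θ(1 − θ)) to a few decimal places, illustrating the defining relation between the score and the Fisher information stated above.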

[ "Applied mathematics", "Statistics", "Mathematical optimization", "Econometrics", "Machine learning", "Fisher consistency", "Extreme physical information", "Fisher's method", "Observed information", "Information matrix test" ]