
Marginal likelihood

In statistics, a marginal likelihood function, or integrated likelihood, is a likelihood function in which some parameter variables have been marginalized. In the context of Bayesian statistics, it may also be referred to as the evidence or model evidence.

Given a set of independent identically distributed data points $\mathbf{X} = (x_1, \ldots, x_n)$, where $x_i \sim p(x_i \mid \theta)$ according to some probability distribution parameterized by $\theta$, and where $\theta$ itself is a random variable described by a distribution, i.e. $\theta \sim p(\theta \mid \alpha)$, the marginal likelihood in general asks what the probability $p(\mathbf{X} \mid \alpha)$ is, where $\theta$ has been marginalized out (integrated out):

$$p(\mathbf{X} \mid \alpha) = \int_\theta p(\mathbf{X} \mid \theta)\, p(\theta \mid \alpha)\, \mathrm{d}\theta .$$

The above definition is phrased in the context of Bayesian statistics. In classical (frequentist) statistics, the concept of marginal likelihood occurs instead in the context of a joint parameter $\theta = (\psi, \lambda)$, where $\psi$ is the actual parameter of interest and $\lambda$ is a non-interesting nuisance parameter. If there exists a probability distribution for $\lambda$, it is often desirable to consider the likelihood function only in terms of $\psi$, by marginalizing out $\lambda$:

$$\mathcal{L}(\psi; \mathbf{X}) = p(\mathbf{X} \mid \psi) = \int_\lambda p(\mathbf{X} \mid \psi, \lambda)\, p(\lambda \mid \psi)\, \mathrm{d}\lambda .$$

Unfortunately, marginal likelihoods are generally difficult to compute. Exact solutions are known for a small class of distributions, particularly when the marginalized-out parameter is the conjugate prior of the distribution of the data. In other cases, some kind of numerical integration method is needed: either a general method such as Gaussian integration or a Monte Carlo method, or a method specialized to statistical problems such as the Laplace approximation, Gibbs/Metropolis sampling, or the EM algorithm.

It is also possible to apply the above considerations to a single random variable (data point) $x$, rather than a set of observations. In a Bayesian context, this is equivalent to the prior predictive distribution of a data point.

In Bayesian model comparison, the marginalized variables are parameters for a particular type of model, and the remaining variable is the identity of the model itself. In this case, the marginalized likelihood is the probability of the data given the model type, not assuming any particular model parameters. Writing $\theta$ for the model parameters, the marginal likelihood for the model $M$ is

$$p(\mathbf{X} \mid M) = \int p(\mathbf{X} \mid \theta, M)\, p(\theta \mid M)\, \mathrm{d}\theta .$$

It is in this context that the term model evidence is normally used. This quantity is important because the posterior odds ratio for a model $M_1$ against another model $M_2$ involves a ratio of marginal likelihoods, the so-called Bayes factor:

$$\frac{p(M_1 \mid \mathbf{X})}{p(M_2 \mid \mathbf{X})} = \frac{p(M_1)}{p(M_2)} \cdot \frac{p(\mathbf{X} \mid M_1)}{p(\mathbf{X} \mid M_2)},$$

that is, the posterior odds equal the prior odds multiplied by the Bayes factor.
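To make the conjugate case concrete, the following is a minimal Python sketch of an exact marginal likelihood, assuming Bernoulli data with a Beta(a, b) prior on the success probability theta; the model choice, data, and function names are illustrative and not taken from the text above.

from math import lgamma, exp

def log_beta(a, b):
    # log of the Beta function B(a, b), via log-gamma for numerical stability
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_marginal_likelihood(data, a=1.0, b=1.0):
    # log p(X | a, b) = log of the integral of p(X | theta) * Beta(theta | a, b) d theta.
    # For Bernoulli data this integral has the closed form
    # B(a + k, b + n - k) / B(a, b), with k successes out of n trials.
    n = len(data)
    k = sum(data)
    return log_beta(a + k, b + n - k) - log_beta(a, b)

data = [1, 0, 1, 1, 0, 1, 1, 1]            # toy data: 6 successes in 8 trials
print(exp(log_marginal_likelihood(data)))  # exact evidence under a Beta(1, 1) prior

Because the Beta prior is conjugate to the Bernoulli likelihood, the integral over theta collapses to a ratio of Beta functions, which is why no numerical integration is needed in this case.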
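When no closed form is available, a Monte Carlo method can be used. The sketch below, under the same hypothetical Beta-Bernoulli setup, estimates the evidence by averaging the likelihood over draws from the prior; this naive prior-sampling estimator is only one of many possible schemes and can be inefficient when the prior and the likelihood disagree.

import random

def likelihood(data, theta):
    # p(X | theta) for independent Bernoulli observations
    p = 1.0
    for x in data:
        p *= theta if x == 1 else (1.0 - theta)
    return p

def mc_evidence(data, a=1.0, b=1.0, num_samples=100_000, seed=0):
    # p(X | a, b) is approximated by (1/S) * sum over s of p(X | theta_s),
    # where each theta_s is drawn from the Beta(a, b) prior.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        theta = rng.betavariate(a, b)
        total += likelihood(data, theta)
    return total / num_samples

print(mc_evidence([1, 0, 1, 1, 0, 1, 1, 1]))  # should be close to the exact value above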
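Finally, a short sketch of how two model evidences combine into posterior odds via the Bayes factor; the evidence values plugged in here are made-up numbers, purely for illustration.

def posterior_odds(evidence_m1, evidence_m2, prior_m1=0.5, prior_m2=0.5):
    # posterior odds = prior odds * Bayes factor
    bayes_factor = evidence_m1 / evidence_m2
    return (prior_m1 / prior_m2) * bayes_factor

# Hypothetical evidences p(X | M1) and p(X | M2), e.g. computed as above
# for two competing models.
print(posterior_odds(evidence_m1=3.2e-3, evidence_m2=1.1e-3))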

[ "Likelihood function", "Bayesian probability", "Maximum likelihood" ]