
Energy distance

Energy distance is a statistical distance between probability distributions. If X and Y are independent random vectors in R^d with cumulative distribution functions (cdf) F and G respectively, then the energy distance between the distributions F and G is defined to be the square root of

D²(F, G) = 2 E‖X − Y‖ − E‖X − X′‖ − E‖Y − Y′‖,

where (X, X′, Y, Y′) are independent, the cdf of X and X′ is F, the cdf of Y and Y′ is G, E is the expected value, and ‖·‖ denotes the Euclidean norm. Energy distance satisfies all axioms of a metric, and thus energy distance characterizes the equality of distributions: D(F, G) = 0 if and only if F = G.

Energy distance for statistical applications was introduced in 1985 by Gábor J. Székely, who proved that for real-valued random variables D²(F, G) is exactly twice Harald Cramér's distance:

D²(F, G) = 2 ∫ (F(x) − G(x))² dx,

the integral taken over the real line. For a simple proof of this equivalence, see Székely (2002). In higher dimensions, however, the two distances are different, because the energy distance is rotation invariant while Cramér's distance is not. (Notice that Cramér's distance is not the same as the distribution-free Cramér–von Mises criterion.)

One can generalize the notion of energy distance to probability distributions on metric spaces. Let (M, d) be a metric space with its Borel sigma algebra B(M), and let P(M) denote the collection of all probability measures on the measurable space (M, B(M)).
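The definition above suggests a natural plug-in estimator: replace each expectation by the mean of pairwise distances between, or within, the two samples. A minimal sketch in Python with NumPy and SciPy (the function name here is illustrative; SciPy also ships a built-in `scipy.stats.energy_distance` for the one-dimensional case):

```python
import numpy as np
from scipy.spatial.distance import cdist

def energy_distance_sq(x, y):
    """Plug-in estimate of the squared energy distance D^2(F, G).

    x : (n, d) array of samples from F
    y : (m, d) array of samples from G
    """
    exy = cdist(x, y).mean()  # estimates E||X - Y||
    exx = cdist(x, x).mean()  # estimates E||X - X'||
    eyy = cdist(y, y).mean()  # estimates E||Y - Y'||
    return 2.0 * exy - exx - eyy

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=(500, 2))
b = rng.normal(0.0, 1.0, size=(500, 2))  # same distribution as a: estimate near 0
c = rng.normal(3.0, 1.0, size=(500, 2))  # shifted distribution: clearly positive
print(energy_distance_sq(a, b))
print(energy_distance_sq(a, c))
```

Because D(F, G) = 0 exactly when F = G, the estimate for two samples from the same distribution is close to zero (up to sampling noise), while the shifted sample gives a value bounded away from zero.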
If μ and ν are probability measures in P(M), with X, X′ independent with law μ and Y, Y′ independent with law ν, then the energy distance D(μ, ν) can be defined as the square root of

D²(μ, ν) = 2 E d(X, Y) − E d(X, X′) − E d(Y, Y′).

This is not necessarily non-negative, however; it is non-negative for all μ and ν precisely when (M, d) has negative type. Negative type is not sufficient for D to be a metric: D is a metric if and only if d is a strongly negative definite kernel, a condition expressed by saying that (M, d) has strong negative type. In that situation, the energy distance is zero if and only if X and Y are identically distributed. An example of a metric space of negative type but not of strong negative type is the plane with the taxicab metric. All Euclidean spaces, and even separable Hilbert spaces, have strong negative type.

In the literature on kernel methods for machine learning, these generalized notions of energy distance are studied under the name maximum mean discrepancy. The equivalence of distance-based and kernel methods for hypothesis testing has been covered by several authors.

A related statistical concept, the notion of E-statistic or energy statistic, was introduced by Gábor J. Székely in the 1980s when he was giving colloquium lectures in Budapest, Hungary, and at MIT, Yale, and Columbia. The concept is based on the notion of Newton's potential energy: the idea is to consider statistical observations as heavenly bodies governed by a statistical potential energy, which is zero only when an underlying statistical null hypothesis is true. Energy statistics are functions of distances between statistical observations.

Energy distance and the E-statistic were considered as N-distances and N-statistics in Zinger A. A., Kakosyan A. V., Klebanov L. B., "Characterization of distributions by means of mean values of some statistics in connection with some probability metrics", Stability Problems for Stochastic Models, Moscow, VNIISI, 1989, 47–55 (in Russian); English translation: A. A. Zinger, A. V. Kakosyan, L. B. Klebanov, "A characterization of distributions by mean values of statistics and certain probabilistic metrics", Journal of Soviet Mathematics (1992). The same paper gave a definition of the strongly negative definite kernel and the generalization to metric spaces discussed above. The book by these authors presents these results and their applications to statistical testing, as well as some applications to recovering a measure from its potential.
