
Information distance

Information distance is the distance between two finite objects (represented as computer files) expressed as the number of bits in the shortest program which transforms one object into the other, or vice versa, on a universal computer. This is an extension of Kolmogorov complexity: the Kolmogorov complexity of a single finite object is the information in that object, whereas the information distance between a pair of finite objects is the minimum information required to go from one object to the other, or vice versa. Information distance was first defined and investigated on the basis of thermodynamic principles, and subsequently achieved its final form. It is applied in the normalized compression distance and the normalized Google distance.

Formally, the information distance $ID(x,y)$ between $x$ and $y$ is defined by

$$ID(x,y) = \min \{ |p| : p(x) = y \text{ and } p(y) = x \},$$

with $p$ a finite binary program for the fixed universal computer with finite binary strings $x, y$ as inputs. It is proven that

$$ID(x,y) = E(x,y) + O(\log \max\{K(x \mid y), K(y \mid x)\}),$$

with

$$E(x,y) = \max\{K(x \mid y), K(y \mid x)\},$$

where $K(\cdot \mid \cdot)$ is the Kolmogorov complexity of the prefix type. This $E(x,y)$ is the important quantity.

Let $\Delta$ be the class of upper semicomputable distances $D(x,y)$ that satisfy the density condition

$$\sum_{x : x \neq y} 2^{-D(x,y)} \leq 1, \qquad \sum_{y : y \neq x} 2^{-D(x,y)} \leq 1.$$

This excludes irrelevant distances such as $D(x,y) = \tfrac{1}{2}$ for $x \neq y$; it ensures that if the distance grows, then the number of objects within that distance of a given object grows. If $D \in \Delta$, then $E(x,y) \leq D(x,y)$ up to a constant additive term.

The distance $E(x,y)$ is a metric up to an additive $O(\log \max\{K(x \mid y), K(y \mid x)\})$ term in the metric (in)equalities.
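Because Kolmogorov complexity is uncomputable, applications such as the normalized compression distance mentioned above replace $K$ by the length of the output of a real compressor. The following is a minimal sketch of that approximation, assuming Python's standard zlib module as the compressor; the helper names are illustrative, not part of the formal definition:

import zlib

def approx_k(data: bytes) -> int:
    # Compressed length in bytes: a crude, computable stand-in for Kolmogorov complexity.
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    # Normalized compression distance:
    #   NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    # a practical approximation of the normalized information distance.
    cx, cy, cxy = approx_k(x), approx_k(y), approx_k(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

if __name__ == "__main__":
    x = b"the quick brown fox jumps over the lazy dog " * 20
    y = b"the quick brown fox jumps over the lazy cat " * 20
    z = b"colorless green ideas sleep furiously tonight " * 20
    print(ncd(x, y))  # relatively small: x and y share almost all their information
    print(ncd(x, z))  # larger: x and z share little information

The better the compressor captures the regularities shared by the two inputs, the closer this quantity comes to the ideal, uncomputable distance.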
If $E(x,y) = K(y \mid x)$, that is, $K(x \mid y) \leq K(y \mid x)$, then there is a program $p$ of length $K(x \mid y)$ that converts $y$ to $x$, and a program $q$ of length $K(y \mid x) - K(x \mid y)$ such that the program $qp$ converts $x$ to $y$. (The programs are of the self-delimiting format, which means that one can decide where one program ends and the other begins in the concatenation of the programs.) That is, the shortest programs to convert between two objects can be made maximally overlapping: for $K(x \mid y) \leq K(y \mid x)$, a shortest program that converts $y$ to $x$ can be extended, by concatenating another program in front of it, to a shortest program that converts $x$ to $y$; the concatenation of these two programs is a shortest program to convert between the objects. The programs to convert between objects $x$ and $y$ can also be made minimally overlapping: there exists a program $p$, of length $K(x \mid y)$ up to an additive term of $O(\log \max\{K(x \mid y), K(y \mid x)\})$, that maps $y$ to $x$ and has small complexity when $x$ is known ($K(p \mid x) \approx 0$); interchanging the two objects gives the other program. Having in mind the parallelism between Shannon information theory and Kolmogorov complexity theory, one can say that this result is parallel to the Slepian–Wolf and Körner–Csiszár–Marton theorems.
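As a toy illustration of the length bookkeeping in the maximal-overlap statement, the following sketch uses made-up values for the (uncomputable) conditional complexities; the numbers and variable names are purely hypothetical:

# Hypothetical conditional complexities, in bits (illustrative values only).
K_x_given_y = 300   # length of a shortest program computing x from y
K_y_given_x = 500   # length of a shortest program computing y from x

# Maximal overlap: with K(x|y) <= K(y|x), a program p of length K(x|y)
# converts y to x, and a program q of length K(y|x) - K(x|y) is prefixed
# to it so that the concatenation qp converts x to y.
len_p = K_x_given_y
len_q = K_y_given_x - K_x_given_y
len_qp = len_q + len_p

assert len_qp == K_y_given_x                    # qp is itself a shortest x-to-y program
assert len_qp == max(K_x_given_y, K_y_given_x)  # equals E(x, y), i.e. the information distance up to log terms
print(len_p, len_q, len_qp)                     # 300 200 500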

[ "Algorithm", "Machine learning", "Data mining", "Artificial intelligence", "Pattern recognition" ]
Parent Topic
Child Topic
    No Parent Topic