In probability theory, particularly information theory, the conditional mutual information is, in its most basic form, the expected value of the mutual information of two random variables given the value of a third.

For random variables $X$, $Y$, and $Z$ with support sets $\mathcal{X}$, $\mathcal{Y}$, and $\mathcal{Z}$, we define the conditional mutual information as

$$I(X;Y|Z) = \int_{\mathcal{Z}} D_{\mathrm{KL}}\!\left(P_{(X,Y)|Z} \,\middle\|\, P_{X|Z} \otimes P_{Y|Z}\right) dP_{Z}.$$

This may be written in terms of the expectation operator:

$$I(X;Y|Z) = \mathbb{E}_{Z}\!\left[D_{\mathrm{KL}}\!\left(P_{(X,Y)|Z} \,\middle\|\, P_{X|Z} \otimes P_{Y|Z}\right)\right].$$

Thus $I(X;Y|Z)$ is the expected (with respect to $Z$) Kullback–Leibler divergence from the conditional joint distribution $P_{(X,Y)|Z}$ to the product of the conditional marginals $P_{X|Z}$ and $P_{Y|Z}$. Compare with the definition of mutual information.

For discrete random variables $X$, $Y$, and $Z$ with support sets $\mathcal{X}$, $\mathcal{Y}$, and $\mathcal{Z}$, the conditional mutual information $I(X;Y|Z)$ is

$$I(X;Y|Z) = \sum_{z \in \mathcal{Z}} \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} p_{X,Y,Z}(x,y,z) \log \frac{p_{Z}(z)\, p_{X,Y,Z}(x,y,z)}{p_{X,Z}(x,z)\, p_{Y,Z}(y,z)},$$

where the marginal, joint, and/or conditional probability mass functions are denoted by $p$ with the appropriate subscript. Since $\frac{p_{Z}(z)\, p_{X,Y,Z}(x,y,z)}{p_{X,Z}(x,z)\, p_{Y,Z}(y,z)} = \frac{p_{X,Y|Z}(x,y|z)}{p_{X|Z}(x|z)\, p_{Y|Z}(y|z)}$, this can be simplified as

$$I(X;Y|Z) = \sum_{z \in \mathcal{Z}} \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} p_{X,Y,Z}(x,y,z) \log \frac{p_{X,Y|Z}(x,y|z)}{p_{X|Z}(x|z)\, p_{Y|Z}(y|z)}.$$

For continuous random variables with joint density $p_{X,Y,Z}$, the sums are replaced by integrals:

$$I(X;Y|Z) = \int_{\mathcal{Z}} \int_{\mathcal{Y}} \int_{\mathcal{X}} \log\!\left(\frac{p_{Z}(z)\, p_{X,Y,Z}(x,y,z)}{p_{X,Z}(x,z)\, p_{Y,Z}(y,z)}\right) p_{X,Y,Z}(x,y,z)\, dx\, dy\, dz.$$
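The discrete triple-sum formula translates directly into code. The following is a minimal Python sketch rather than a definitive implementation: the function name `conditional_mutual_information` and the array convention `p_xyz[x, y, z]` are illustrative assumptions, and terms with $p_{X,Y,Z}(x,y,z) = 0$ are skipped under the usual convention $0 \log 0 = 0$.

```python
import numpy as np

def conditional_mutual_information(p_xyz):
    """I(X;Y|Z) in nats from a joint pmf given as a 3-D array p_xyz[x, y, z].

    Entries must be non-negative and sum to 1. (Illustrative sketch.)
    """
    p_xyz = np.asarray(p_xyz, dtype=float)
    p_xz = p_xyz.sum(axis=1)       # marginal p_{X,Z}(x, z), summing out y
    p_yz = p_xyz.sum(axis=0)       # marginal p_{Y,Z}(y, z), summing out x
    p_z = p_xyz.sum(axis=(0, 1))   # marginal p_Z(z)

    cmi = 0.0
    for x, y, z in np.ndindex(p_xyz.shape):
        p = p_xyz[x, y, z]
        if p > 0:  # skip zero-probability terms: 0 log 0 := 0
            cmi += p * np.log(p_z[z] * p / (p_xz[x, z] * p_yz[y, z]))
    return cmi

# Sanity check: X and Y fair independent bits, Z = X XOR Y.
# Given Z, X determines Y, so I(X;Y|Z) = log 2 nats (one bit).
p = np.zeros((2, 2, 2))
for x in range(2):
    for y in range(2):
        p[x, y, x ^ y] = 0.25
print(conditional_mutual_information(p))  # ~0.6931
```

The XOR check also shows that conditioning can increase mutual information: $X$ and $Y$ are marginally independent, yet $I(X;Y|Z) = \log 2$.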