On the predictive potential of kernel principal components
2020
We give a probabilistic analysis of a phenomenon in statistics
which, until recently, had not received a convincing explanation.
This phenomenon is that the leading principal
components tend to possess more predictive power for a response variable than
lower-ranking ones despite the procedure being unsupervised.
Our result, in its most general form, shows that the phenomenon goes far beyond the
context of linear regression and classical principal components ---
if an arbitrary distribution for the predictor $X$ and
an arbitrary conditional distribution for $Y \vert X$ are chosen,
then any measurable function $g(Y)$, subject to a mild condition,
tends to be more correlated with the higher-ranking kernel principal
components than with the lower-ranking ones.
The ``arbitrariness'' is formulated in terms of unitary invariance; the tendency
is then explicitly quantified by exploring how unitary invariance
relates to the Cauchy distribution.
The most general results, for technical reasons, are shown for the case where
the kernel space is finite-dimensional. The occurrence of this tendency in
real-world databases is also investigated to show that our results are
consistent with observation.
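
A minimal numerical sketch of how a Cauchy distribution can quantify such a tendency is given below. It covers only the classical linear special case and is an illustrative simplification, not the general kernel result described above. The assumptions are: a linear model with a spherically symmetric (here Gaussian) random coefficient vector, and two fixed eigenvalues `lam_i > lam_j` of the predictor covariance. Under these assumptions the squared population correlation between the response and the $k$-th principal component is proportional to $\lambda_k b_k^2$, where $b_k$ is the coordinate of the coefficient vector along the $k$-th eigenvector. The ratio $b_j / b_i$ is then standard Cauchy, so the probability that the leading component is the more correlated one is $(2/\pi)\arctan\sqrt{\lambda_i/\lambda_j}$. The function names and parameter values are hypothetical.

```python
import math
import numpy as np


def prob_leading_pc_wins(lam_i, lam_j, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(PC i is more correlated with Y than PC j).

    Assumption (illustrative): the coordinates of the regression coefficient
    along the two eigenvectors are independent standard normals, which is one
    instance of a rotation-invariant coefficient distribution.
    """
    rng = np.random.default_rng(seed)
    b = rng.standard_normal((n_draws, 2))
    # Squared correlation with PC k is proportional to lam_k * b_k**2,
    # with the same normalising constant for both components.
    wins = lam_i * b[:, 0] ** 2 > lam_j * b[:, 1] ** 2
    return float(wins.mean())


def cauchy_closed_form(lam_i, lam_j):
    """P(|C| < sqrt(lam_i / lam_j)) for a standard Cauchy variable C."""
    return 2.0 / math.pi * math.atan(math.sqrt(lam_i / lam_j))


estimate = prob_leading_pc_wins(4.0, 1.0)
theory = cauchy_closed_form(4.0, 1.0)
print(f"simulated: {estimate:.3f}, closed form: {theory:.3f}")
```

With equal eigenvalues the closed form gives exactly 1/2, and the probability increases toward 1 as the ratio `lam_i / lam_j` grows, which matches the direction of the tendency stated in the abstract.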