Generalized T-Statistic
2019
We discuss a statistical method for the two-group classification problem with labels \(y=0\) and \(y=1\). We consider a situation in which the conditional distribution given \(y=0\) is well specified by a normal distribution, but the conditional distribution given \(y=1\) (rare observations in imbalanced data sets) is not well modeled by any specific distribution. Typically, in a case-control study, the distribution in the control group can be assumed to be normal after an appropriate data transformation, whereas the distribution in the case group may depart from normality. In this situation, the maximum t-statistic for linear discrimination, or equivalently Fisher’s linear discriminant function, may not be optimal. We propose a class of generalized t-statistics and study their asymptotic consistency and normality. The optimal generalized t-statistic, in the sense of asymptotic variance, is derived in a semi-parametric manner, and its statistical performance is confirmed in several numerical experiments.
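The equivalence between the maximum t-statistic and Fisher’s linear discriminant function mentioned above can be illustrated numerically. The following is a minimal sketch, not the generalized t-statistic itself (its definition is not given in this abstract): it only checks that the pooled-variance two-sample t-statistic of a linear score \(w^\top x\) is maximized by the Fisher direction, the pooled covariance inverse applied to the difference in group means. The function names, the simulated data, and the choice of an exponential distribution for the \(y=1\) group are assumptions made for illustration.

```python
import numpy as np


def pooled_covariance(X0, X1):
    """Pooled sample covariance of the two groups (rows are observations)."""
    n0, n1 = len(X0), len(X1)
    S0 = np.cov(X0, rowvar=False)
    S1 = np.cov(X1, rowvar=False)
    return ((n0 - 1) * S0 + (n1 - 1) * S1) / (n0 + n1 - 2)


def t_statistic(w, X0, X1):
    """Pooled-variance two-sample t-statistic of the linear scores X @ w."""
    n0, n1 = len(X0), len(X1)
    s0, s1 = X0 @ w, X1 @ w
    pooled_var = ((n0 - 1) * s0.var(ddof=1) + (n1 - 1) * s1.var(ddof=1)) / (n0 + n1 - 2)
    return (s1.mean() - s0.mean()) / np.sqrt(pooled_var * (1 / n0 + 1 / n1))


def fisher_direction(X0, X1):
    """Fisher's linear discriminant direction: S_pooled^{-1} (mean1 - mean0)."""
    S = pooled_covariance(X0, X1)
    return np.linalg.solve(S, X1.mean(axis=0) - X0.mean(axis=0))


# Simulated data (an assumption for illustration): a large normal control
# group (y=0) and a small, non-normal case group (y=1).
rng = np.random.default_rng(0)
X0 = rng.normal(size=(200, 3))
X1 = rng.exponential(size=(20, 3))

w_star = fisher_direction(X0, X1)
t_star = t_statistic(w_star, X0, X1)

# No randomly drawn direction should achieve a larger t-statistic.
random_ts = [t_statistic(rng.normal(size=3), X0, X1) for _ in range(1000)]
print(t_star >= max(random_ts))
```

The t-statistic is invariant to positive rescaling of \(w\), so only the direction matters. The check says nothing about optimality when the \(y=1\) group is non-normal, which is the question the generalized t-statistics address.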