Prediction Accuracy in Logistic Biplots for Categorical Data

2015 
Classical biplot methods allow the simultaneous representation of the individuals (rows) and variables (columns) of a numerical data matrix. When the data are binary, nominal or ordinal, classical linear biplots are not adequate, and other techniques such as multiple correspondence analysis (MCA), latent trait analysis (LTA) or item response theory (IRT) for categorical items should be used instead. We have recently extended the biplot to categorical data. The resulting method is termed the "logistic biplot" (LB) because it is related to logistic responses in the same way that classical biplots are related to linear responses. For the nominal case, variables are represented as convex prediction regions rather than vectors; using methods from computational geometry, the set of prediction regions is converted to a set of points in such a way that the prediction for each individual is given by its closest "category point". Interpretation is then based on distances rather than on projections. For the binary and ordinal cases, the final representation is closer to a traditional biplot, with straight lines for predicting the probabilities of each variable: the prediction regions are delimited by parallel straight lines, so a single line with suitable marks is enough to visualize the model. We evaluate the prediction accuracy of logistic biplots compared with MCA and IRT. The main differences between the LB and MCA are illustrated with data on demographic and labor market variables of doctorate (PhD) holders in the region of Castilla y León in Spain, taken from the "Survey on the careers of doctorate holders (CDH)" carried out by the Spanish Statistical Institute jointly with Eurostat, the Organization for Economic Co-operation and Development (OECD) and UNESCO's Institute for Statistics (UIS).
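The two prediction rules described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' estimation procedure: the row coordinates, category points, intercept and slopes are made-up numbers that are assumed to have been fitted already, and the function names (`predict_nominal`, `predict_binary`, `accuracy`) are hypothetical. For a nominal variable, each individual is assigned the category whose point is closest; for a binary variable, the fitted logistic probability is thresholded, which corresponds to a straight boundary line in the plane.

```python
import numpy as np

def predict_nominal(scores, category_points):
    """Assign each individual the category whose point is nearest (Euclidean)."""
    # scores: (n, 2) row coordinates; category_points: (k, 2), one per category
    diffs = scores[:, None, :] - category_points[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return dists.argmin(axis=1)

def predict_binary(scores, intercept, slopes, threshold=0.5):
    """Predict 1 when the logistic probability reaches the threshold."""
    logits = intercept + scores @ slopes
    probs = 1.0 / (1.0 + np.exp(-logits))
    return (probs >= threshold).astype(int)

def accuracy(predicted, observed):
    """Proportion of individuals whose prediction matches the observed value."""
    return float(np.mean(np.asarray(predicted) == np.asarray(observed)))

# Toy inputs (assumed, not taken from the CDH survey).
scores = np.array([[0.1, 0.2], [1.8, 1.9], [0.2, 2.9], [1.9, 0.3]])

# Nominal variable with three categories, each represented by one point.
category_points = np.array([[0.0, 0.0], [2.0, 2.0], [0.0, 3.0]])
nominal_pred = predict_nominal(scores, category_points)
nominal_acc = accuracy(nominal_pred, [0, 1, 2, 0])

# Binary variable with assumed logistic parameters.
binary_pred = predict_binary(scores, intercept=-2.0, slopes=np.array([1.0, 0.5]))
binary_acc = accuracy(binary_pred, [0, 1, 1, 1])

print(nominal_pred, nominal_acc)
print(binary_pred, binary_acc)
```

Comparing such per-variable proportions of correct predictions across methods is one simple way to read "prediction accuracy"; the measure actually used in the paper may differ.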