A Comparison of Developmental Scales Based on Thurstone Methods and Item Response Theory.

1998 
A developmental scale for the North Carolina End-of-Grade Mathematics Tests was created using a subset of identical test forms administered to adjacent grade levels. Thurstone scaling and item response theory (IRT) techniques were employed to analyze the changes in grade distributions across these linked forms. Three variations of Thurstone scaling were examined, one based on Thurstone's 1925 procedure and two based on Thurstone's 1938 procedure. The IRT scaling was implemented using both BIMAIN and MULTILOG. All methods indicated that average mathematics performance improved from Grade 3 to Grade 8, with similar results for the two IRT analyses and one version of Thurstone's 1938 method. The standard deviations of the IRT scales did not show a consistent pattern across grades, whereas those produced by Thurstone's 1925 procedure generally decreased; one version of the 1938 method exhibited slightly increasing variation with increasing grade level, while the other version displayed inconsistent trends. In educational testing, multilevel tests enable the evaluation of student perfor