A Multivariate Analysis Approach to Diamonds’ Pricing Using Dummy Variables in SPSS

2021 
This article proposes a methodology for predicting the price of diamonds. It uses a large dataset of empirical measurements for the variables commonly used to assess a diamond's commercial value. The dataset was retrieved from the Kaggle website [1] and includes the diamonds' physical properties along with their respective prices. An analysis based on Analysis of Variance (ANOVA) was conducted to identify the characteristics that determine price. The weight of the diamond (carat) was found to affect price; for this reason, price per carat was adopted as the new dependent variable. The diamond's clarity and the width of the top of the diamond (table) also affect this dependent variable. Stepwise regression showed that the width of the largest section of the diamond (Y) and table were the least significant variables, and backward and forward selection led to the same predictive model. All assumptions on the residuals were validated. The adjusted coefficient of determination was \(88.3\%\). Since multicollinearity may exist among the independent variables, Principal Components Analysis (PCA) could be used in future work to eliminate its effects.
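The three building blocks named in the abstract (price per carat as the dependent variable, dummy coding of a categorical predictor such as clarity, and the adjusted coefficient of determination) can be illustrated with a minimal Python sketch. This is not the authors' SPSS procedure. The clarity level names, the choice of reference level, and all function names are assumptions made for illustration only.

```python
# Minimal sketch, not the authors' SPSS workflow.
# ASSUMPTION: clarity takes these ordered levels; the abstract does not list them.
CLARITY_LEVELS = ["I1", "SI2", "SI1", "VS2", "VS1", "VVS2", "VVS1", "IF"]


def price_per_carat(price, carat):
    """Dependent variable described in the abstract: price divided by weight."""
    if carat <= 0:
        raise ValueError("carat must be positive")
    return price / carat


def dummy_code(value, levels, reference):
    """Encode a categorical value as k-1 indicator variables.

    The reference level is omitted, so it is represented by all zeros.
    """
    if value not in levels or reference not in levels:
        raise ValueError("unknown level")
    return [1.0 if value == level else 0.0
            for level in levels if level != reference]


def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p predictors (intercept excluded)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


if __name__ == "__main__":
    # Hypothetical diamond, not a row from the Kaggle dataset.
    y = price_per_carat(price=1500.0, carat=0.5)
    x = dummy_code("VS1", CLARITY_LEVELS, reference="I1")
    print(y, x)
```

Omitting one reference level keeps the dummy columns from being perfectly collinear with the regression intercept, which is why a variable with k levels contributes k-1 predictors to the model.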