Data Mining for Aviation Safety: Using Data Mining Recipe “Automatized Data Mining” from STATISTICA

This chapter uses the Federal Aviation Administration's (FAA's) Service Difficulty Report (SDR) database to explore the factors that lead to undesirable events called unscheduled landings and to determine which data mining (DM) model or models appear most predictive of that event. The database is very complex, and much of it is in text form. The chapter briefly discusses airline safety and the importance of data mining in improving it, introduces and describes the SDR database, and prepares the data for study. It then describes a DM approach using STATISTICA by StatSoft and determines which DM algorithm appears to predict unscheduled landings most accurately, as judged by error rate. Although STATISTICA can determine on its own what type of data a spreadsheet contains, it is often helpful for the user to specify the data type. The SDR database contains a wealth of information that can be explored with data mining techniques to search for patterns and trends. Of the four variables selected, the aircraft manufacturer's name emerges as the most important and the stage of operation as the least important. Aviation safety has much to gain from automated data mining tools applied to existing databases. The ultimate goal is to use the information that DM yields to drive the airline accident rate even lower than it is today.