Data Clustering and Self-Organizing Maps in Biology

2019 
Abstract In his 1936 article, “The use of multiple measurements in taxonomic problems,” statistician and biologist Ronald Fisher published a data set of 50 samples from each of three species of Iris flower: Iris setosa, Iris virginica, and Iris versicolor. Each sample consists of four measurements, all in centimeters: the length and width of the sepal and the length and width of the petal. Fig. 11.1 is a collection of scatter plots that compare the three species according to the different measurement components. The plots show the points separating into three fairly distinct groups, despite some overlap between I. versicolor and I. virginica. The existence of features that discriminate between the three species allowed Fisher to develop a model capable of classifying a set of measurements into the correct species of Iris flower.