Background
The explosive radiation of cichlid fishes in Lake Malawi has yielded an extraordinary number of haplochromine species, estimated at 500 to 800, with a surprising degree of diversity not only in color and stripe pattern but also in jaw and body shape. Because this morphological diversity has been central to studies of adaptive speciation and taxonomic classification, it could serve as a foundation for automating the species identification of cichlids.

Methodology/Principal Findings
Here we demonstrate a method for automatic classification of the Lake Malawi cichlids based on computer vision and geometric morphometrics. To this end, we developed a pipeline that integrates multiple image processing tools to automatically extract informative features of color and stripe patterns from a large set of photographic images of wild cichlids. The extracted information was evaluated with two statistical classifiers, Support Vector Machine and Random Forests. Both classifiers performed better when body shape information was added to the color and stripe features: body shape variables boosted classification accuracy by about 10%. The programs classified 594 live cichlid individuals belonging to 12 different classes (species and sexes) with an average accuracy of 78%, compared with a mere 42% success rate by human eye. The variables that contributed most to the accuracy were body height and the hue of the most frequent color.

Conclusions
Computer vision performed notably well in extracting information from the color and stripe patterns of Lake Malawi cichlids, although the information was not sufficient for error-free species identification. Our results indicate that there appears to be an unavoidable difficulty in the automatic species identification of cichlid fishes, which may arise from short divergence times and gene flow between closely related species.
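As a rough illustration of one feature named above, the hue of the most frequent color, the following Python sketch computes such a value from a list of RGB pixels. It is not the authors' pipeline: the function name `most_frequent_color_hue`, the per-channel quantization into `levels` bins, and the input format (8-bit RGB tuples) are all assumptions made for this example.

```python
import colorsys
from collections import Counter


def most_frequent_color_hue(pixels, levels=8):
    """Return the hue (degrees, 0-360) of the most frequent quantized color.

    pixels: iterable of (r, g, b) tuples with 8-bit channels (0-255).
    levels: bins per channel; a hypothetical choice that should divide 256.
    """
    step = 256 // levels
    # Quantize so that near-identical shades are counted as one color.
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    if not counts:
        raise ValueError("no pixels given")
    (rq, gq, bq), _ = counts.most_common(1)[0]
    # Take the bin center as the representative color, scaled to [0, 1].
    r, g, b = ((q + 0.5) * step / 256 for q in (rq, gq, bq))
    hue, _sat, _val = colorsys.rgb_to_hsv(r, g, b)
    return hue * 360.0
```

In a real pipeline the pixel list would presumably come from a segmented fish body rather than a whole photograph, and the resulting hue would be one column in the feature table passed to a classifier.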
The genus Iksookimia contains six species of primary freshwater fishes that are endemic to South Korea. Previous phylogenetic studies, based on DNA sequence data from three or fewer loci, have suggested non-monophyly of the genus and provided inconsistent resolutions of the relationships of Iksookimia. Our coalescent and concatenation-based phylogenetic analyses, utilizing seven unlinked nuclear-encoded genes, strongly supported Iksookimia as a monophyletic group, emphasizing the importance of multi-locus data in investigating complicated phylogenetic relationships. A relaxed molecular clock analysis using fossil calibrations indicated that the major lineages of Iksookimia originated between ∼12 and 5 Ma, which is consistent with the Miocene uplift of the Taebaek and Sobaek Mountains and the Miocene activation of the major south-eastern faults. These palaeogeographic events may have served as vicariant events in the diversification of Iksookimia.