Music classification using high-level models

2010 
We report here on our submissions to different music classification tasks for the MIREX 2010 evaluations. In terms of classifiers and main audio features, these submissions are similar to the ones sent to MIREX 2009 (see [1]). However, we added high-level (or semantic) features, based on Support Vector Machine models of curated databases of different kinds. We submitted two different algorithms, evaluated on Mood, Genre and Artist classification. One of them is a classification algorithm using a weighted sum of Support Vector Machines. The other is based on distances (Euclidean in a reduced space using RCA, and Kullback-Leibler on Mel Frequency Cepstrum Coefficients), together with K-NN.

1. FEATURE EXTRACTION

This submission is coded in C++ and Python. For the feature extraction part, we use an internal library of the Music Technology Group called Essentia [2]. This library contains all the features mentioned below. All frame-based features are aggregated using the following statistics: mean and derivatives up to second order, variance and derivatives up to second order, minimum and maximum. We divide our features into two main categories: the "base" features, which are state-of-the-art MIR features, and the "high-level" features.

1.1 Base features

Table 2 lists the set of base features that performed best in our preliminary experiments on our genre, artist and mood databases.

1.2 High-level features

One original aspect of our approach is the integration of high-level (or semantic) descriptors. Low-level features are convenient and easy to extract, and they provide satisfactory classification results in many tasks. However, high-level concepts encapsulate different patterns of low-level descriptors into a single representation that can add useful information.
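Returning to the frame-based aggregation described in Section 1, the following is a minimal Python sketch of one possible reading of it: the mean and variance are computed on the frame values and on their first and second differences, alongside the minimum and maximum. The function names and dictionary keys are illustrative assumptions, not Essentia's API, and the use of simple finite differences and population variance is also an assumption.

```python
from statistics import fmean, pvariance


def derivative(values):
    """First-order difference between consecutive frames (assumed definition)."""
    return [b - a for a, b in zip(values, values[1:])]


def aggregate(frames):
    """Summarize one frame-based feature given as a list of floats.

    Needs at least three frames so that the second difference is non-empty.
    Key names are hypothetical, chosen for this sketch only.
    """
    d1 = derivative(frames)
    d2 = derivative(d1)
    return {
        "mean": fmean(frames),
        "mean_d1": fmean(d1),
        "mean_d2": fmean(d2),
        "var": pvariance(frames),
        "var_d1": pvariance(d1),
        "var_d2": pvariance(d2),
        "min": min(frames),
        "max": max(frames),
    }


# Toy input: four frames of a single scalar descriptor.
stats = aggregate([0.0, 1.0, 4.0, 9.0])
```

With this reading, each scalar frame-based descriptor contributes eight values to the final feature vector.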
Type       Features
Low level  barkbands spread, skewness, kurtosis, dissonance, hfc, pitch and confidence, pitch salience, spectral complexity, spectral crest, spectral decrease, energy, spectral flux, spec spread/skewness/kurtosis, spec rolloff, strong peak, ZCR, barkbands, mfcc, spectral contrast
Rhythm     bpm, beats loudness, onset rate
Sound FX   inharmonicity, odd2even, pitch centroid, tristimulus
Tonal      chords strength (frame), key strength (global), tuning freq

Table 1. Feature set for all our classifiers.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. © 2009 International Society for Music Information Retrieval.

To exploit such high-level concepts, we added high-level features of different categories. These features come from pre-trained Support Vector Machine models, whose outputs are added to our bag of features. We treat them like any other feature, with a value between 0 and 1 corresponding to the SVM model prediction probability. Here we list the different models used:
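As a rough illustration of how model outputs of this kind can be appended to a feature vector, here is a minimal Python sketch. Everything in it is hypothetical: `append_high_level` and `make_stub_model` are names invented for this example, and the stub uses a simple logistic function as a stand-in for a pre-trained SVM with probability output, which the submission itself does not describe at this level of detail.

```python
import math


def make_stub_model(weights, bias):
    """Hypothetical stand-in for a pre-trained SVM with probability output.

    A logistic function of a linear score is used only so that the
    example runs without any trained model; it is not an SVM.
    """
    def predict_probability(features):
        score = sum(w * x for w, x in zip(weights, features)) + bias
        return 1.0 / (1.0 + math.exp(-score))
    return predict_probability


def append_high_level(base_features, models):
    """Return the base features followed by one probability per model."""
    probabilities = []
    for model in models:
        p = model(base_features)
        if not 0.0 <= p <= 1.0:
            raise ValueError("model output must be a probability in [0, 1]")
        probabilities.append(p)
    return list(base_features) + probabilities


# Toy base vector and two stub "high-level" models.
base = [0.2, -1.0, 0.5]
models = [
    make_stub_model([1.0, 0.0, 0.0], 0.0),
    make_stub_model([0.0, 1.0, 1.0], 0.5),
]
augmented = append_high_level(base, models)
```

The downstream classifier then sees the probabilities as ordinary features, so no change to the classification stage is needed in this sketch.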