Compound identification using random projection for gas chromatography–mass spectrometry data

2016 
Abstract In general, compound identification through library searching is performed in the original mass spectral space using an established similarity measure. More powerful similarity measures need more computational time. To reduce computational time, the original mass spectral space is transformed into a binary space by random projection, and the Hamming distance between the query vector and each reference vector in the binary space is calculated. In this study, the main library of the NIST Mass Spectral Library 2005 (NIST05), containing 163,195 mass spectra, is used as the reference database. The replicate library, containing 23,290 mass spectra, is used as query data. The proposed method is compared with five existing similarity measures. Under the two performance criteria considered, the composite semi-partial similarity measure achieves the best identification accuracy, followed by random projection. When the random projection is set to 2176 bits, the rank-1 identification accuracy reaches 83.1%. Considering computational time alone, random projection is the fastest approach among the five algorithms. Random projection can therefore also be used as a filter that creates a search sub-library for other, more complicated similarity measures. A series of experiments conducted with the random projection filter shows that the computational time of the other similarity measures is reduced without sacrificing identification performance.
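The abstract does not spell out how the projection is constructed, so the following is only a minimal sketch of one common variant: sign random projection with a Gaussian matrix, followed by Hamming-distance ranking. The number of m/z bins (`N_MZ`), the synthetic library, and all function names are assumptions made for illustration; only the 2176-bit code length comes from the text.

```python
import numpy as np

N_MZ = 500      # assumed number of m/z bins per spectrum (hypothetical)
N_BITS = 2176   # binary code length quoted in the abstract


def make_projection(n_bits, n_mz, seed=0):
    """Draw a Gaussian random matrix; each row defines one output bit."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_bits, n_mz))


def to_binary(spectra, projection):
    """Project spectra (n_spectra, n_mz) and keep only the sign of each
    projected value, packed eight bits per byte."""
    bits = (spectra @ projection.T) > 0
    return np.packbits(bits, axis=1)


def hamming_distances(query_code, library_codes):
    """Count differing bits between one packed query code and every
    packed library code."""
    diff = np.bitwise_xor(library_codes, query_code)
    return np.unpackbits(diff, axis=1).sum(axis=1)


def rank_library(query, library_codes, projection, top_k=5):
    """Return indices and distances of the top_k closest library entries."""
    query_code = to_binary(query[np.newaxis, :], projection)
    dist = hamming_distances(query_code, library_codes)
    order = np.argsort(dist, kind="stable")[:top_k]
    return order, dist[order]


# Synthetic stand-in for a reference library (not NIST05 data).
rng = np.random.default_rng(1)
library = rng.random((200, N_MZ))
projection = make_projection(N_BITS, N_MZ)
library_codes = to_binary(library, projection)

# A "replicate" query: library entry 3 with a small perturbation.
query = library[3] + 0.01 * rng.random(N_MZ)
hits, dists = rank_library(query, library_codes, projection)
```

Used as a filter, the indices in `hits` (with a larger `top_k`) would define the sub-library passed on to a slower similarity measure, so that the expensive comparison runs on a short candidate list rather than on every reference spectrum.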