A Speaker Identity Recognition System based on Deep Learning

2019 
Abstract: This invention belongs to pattern recognition within the field of digital signal processing. It is a speaker recognition system, based on deep learning, that identifies which of several people is speaking. The invention consists of several steps. First, a sufficient number of audio recordings is collected from six people, and the data set is split into a training set and a test set. The training set is preprocessed by framing and by detecting the effective segments. Second, features are extracted from the recordings as Mel-frequency cepstral coefficients (MFCCs). The features are then passed to the designed network, which consists of convolutional layers, pooling layers, and fully connected layers. Finally, the test set is fed into the network, which recognizes the different speakers with an accuracy of XXX. In brief, this invention can be used for intelligent voice control, as in Siri.

[Figure 3: Flowchart of effective-segment detection. Starting from the first frame, the procedure uses two frame-energy thresholds, E_low and E_high: a start point is set where several consecutive frames have energy above the threshold, an end point is set where several consecutive frames have energy below the threshold (stepping back three frames), and the span between start point and end point is checked against the smallest allowed length.]
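The framing and effective-segment detection step can be illustrated with a minimal sketch of a generic two-threshold, energy-based endpoint detector. It is not the invention's exact procedure: the function names `frame_signal` and `detect_segment`, the frame length, hop size, threshold ratios, run length, and minimum segment length are all assumptions chosen for illustration.

```python
import numpy as np


def frame_signal(signal, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]


def detect_segment(signal, frame_len=400, hop=160, e_low=None, e_high=None,
                   n_consecutive=3, min_frames=10):
    """Return (start_frame, end_frame) of the first effective segment, or None.

    Thresholds default to fixed fractions of the peak frame energy; these
    fractions are illustrative assumptions, not values from the source.
    """
    frames = frame_signal(signal, frame_len, hop)
    if len(frames) == 0:
        return None
    energy = np.sum(frames ** 2, axis=1)  # short-time energy per frame
    if e_high is None:
        e_high = 0.1 * energy.max()
    if e_low is None:
        e_low = 0.02 * energy.max()

    # Start point: first run of n_consecutive frames above e_high.
    start = None
    for i in range(len(energy) - n_consecutive + 1):
        if np.all(energy[i:i + n_consecutive] > e_high):
            start = i
            break
    if start is None:
        return None
    # Extend the start backwards while the energy stays above e_low.
    while start > 0 and energy[start - 1] > e_low:
        start -= 1

    # End point: frame before the first run of n_consecutive frames below e_low.
    end = len(energy) - 1
    for j in range(start + n_consecutive, len(energy) - n_consecutive + 1):
        if np.all(energy[j:j + n_consecutive] < e_low):
            end = j - 1
            break

    # Reject spans shorter than the smallest allowed length.
    if end - start + 1 < min_frames:
        return None
    return start, end
```

Calling `detect_segment(audio)` on a mono recording returns frame indices; multiplying them by `hop` gives approximate sample positions. In a pipeline like the one described, only the frames inside the returned span would be passed on to MFCC extraction.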