Common CNN-based Face Embedding Spaces are (Almost) Equivalent

2020 
CNNs are the dominant method for creating face embeddings for recognition. It might be assumed that, since these networks are distinct, complex, nonlinear functions, their embeddings are network-specific and thus offer some degree of anonymity. However, recent research has shown that distinct networks' features can be directly mapped with little performance penalty (median 1.9% reduction across 90 distinct mappings) in the context of the 1,000-object ImageNet recognition task. This finding reveals that embeddings from different systems can be meaningfully compared once the mapping is known. However, prior work only considered networks trained and tested on a closed-set classification task. Here, we present evidence that a linear mapping between feature spaces can be easily discovered in the context of open-set face recognition. Specifically, we demonstrate that the feature spaces of four face recognition models, of varying architecture and training datasets, can be mapped to one another with no more than a 1.0% penalty in recognition accuracy on LFW. This finding, which we also replicate on YouTube Faces, demonstrates that embeddings from different systems can be readily compared once the linear mapping is determined. In further analysis, we find that fewer than 500 pairs of corresponding embeddings from two systems are required to calculate the full mapping between embedding spaces, and that reducing the dimensionality of the mapping from 512 to 64 produces a negligible performance penalty.
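The abstract does not include code, but the core idea can be illustrated with a minimal NumPy sketch: given corresponding embeddings of the same face images from two models, a linear map between the spaces can be fit by least squares and then applied to compare embeddings across systems. The data, dimensions, and function names below are hypothetical placeholders, not the authors' implementation; the low-rank step only illustrates the kind of 512-to-64 dimensionality reduction mentioned above.

```python
import numpy as np

def fit_linear_map(src, dst):
    """Least-squares linear map M such that src @ M ~= dst.

    src, dst: (n_pairs, d) arrays of corresponding embeddings of the
    same faces from two different recognition models (placeholder data).
    """
    M, *_ = np.linalg.lstsq(src, dst, rcond=None)
    return M

# Placeholder corresponding embeddings for 500 shared face images,
# produced by two hypothetical 512-D models A and B.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(500, 512))   # stand-in for model A embeddings
emb_b = rng.normal(size=(500, 512))   # stand-in for model B embeddings

M = fit_linear_map(emb_a, emb_b)

# Map a new model-A embedding into model B's space and compare with
# cosine similarity, as in standard face verification.
def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

query_a = rng.normal(size=512)
mapped_to_b = query_a @ M
gallery_b = rng.normal(size=512)
print("similarity in model B's space:", cosine(mapped_to_b, gallery_b))

# A rank-64 approximation of the map, illustrating how a lower-dimensional
# mapping might be obtained: keep only the top singular directions of M.
U, S, Vt = np.linalg.svd(M, full_matrices=False)
M_rank64 = (U[:, :64] * S[:64]) @ Vt[:64, :]
```

In practice the corresponding pairs would come from running both networks on the same face images; the sketch uses random arrays only so it runs standalone.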