Eye movements have been shown to be the most frequent of all human activities; as a result, research on the relationships between different eye movement patterns has become a hotspot in the human-computer interface field. The motivation of this paper is to develop an auxiliary reading device by measuring and analyzing electrooculography (EOG) signals. We first describe a saccade detection algorithm based on wavelet packet decomposition and a derivative-based blink detection algorithm. Consecutive blinks are then used to control the system's working state and a magnifier, whose position is adjusted according to the saccade detection results. Experimental results on six participants show a recognition accuracy (F1 score) of 90.096%, indicating that the proposed system performs well in reading activity detection and analysis.
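
As a rough illustration of how a blink can be told apart from a saccade in an EOG trace, the sketch below assumes the blink detector works on the first difference (discrete derivative) of the vertical EOG channel: a blink shows up as a sharp rise followed shortly by a sharp fall, whereas a saccade produces a single step. The function name `detect_blinks`, the threshold `edge_thresh`, and the 0.4 s width limit are hypothetical choices for this example, not values taken from the paper.

```python
def detect_blinks(eog, fs, edge_thresh, max_width_s=0.4):
    """Return (start, end) sample-index pairs of candidate blinks.

    eog         : list of vertical EOG samples (assumed baseline-corrected)
    fs          : sampling rate in Hz
    edge_thresh : hypothetical threshold on the sample-to-sample difference
    max_width_s : hypothetical upper bound on blink duration, in seconds
    """
    # First difference approximates the derivative of the signal.
    diff = [b - a for a, b in zip(eog, eog[1:])]
    max_width = int(max_width_s * fs)
    blinks = []
    i = 0
    while i < len(diff):
        if diff[i] > edge_thresh:
            # Rising edge found: look for a falling edge close behind it.
            stop = min(len(diff), i + max_width)
            for j in range(i + 1, stop):
                if diff[j] < -edge_thresh:
                    blinks.append((i, j + 1))
                    i = j  # resume the scan after this blink
                    break
        i += 1
    return blinks
```

A rising edge with no matching falling edge inside the window is ignored, which is what keeps step-like saccades out of the blink list in this simplified model. Counting how many detected blinks fall within a short interval would then give the "consecutive blinks" control signal.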
Anomaly detection is a critical task in industrial manufacturing that aims to identify defective parts of products. Most industrial anomaly detection methods assume that sufficient normal data are available for training. This assumption may not hold because of labeling costs or data privacy policies. Additionally, mainstream methods require training a bespoke model for each object, which incurs heavy costs and lacks flexibility in practice. To address these issues, we turn to the Stable Diffusion (SD) model, whose zero/few-shot inpainting capability can be leveraged to inpaint anomalous regions as normal. In this paper, we propose AnomalySD, a few-shot multi-class anomaly detection framework built on the Stable Diffusion model. To adapt SD to the anomaly detection task, we design hierarchical text descriptions and a foreground mask mechanism for fine-tuning SD. In the inference stage, to accurately mask anomalous regions for inpainting, we propose a multi-scale mask strategy and a prototype-guided mask strategy to handle diverse anomalous regions. Hierarchical text prompts are also used to guide the inpainting process at inference. The anomaly score is estimated from the inpainting results of all masks. Extensive experiments on the MVTec-AD and VisA datasets demonstrate the superiority of our approach: we achieve anomaly classification and segmentation results of 93.6%/94.8% AUROC on MVTec-AD and 86.1%/96.5% AUROC on VisA under the multi-class, one-shot setting.
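
To make the idea of scoring anomalies from the inpainting results of several masks concrete, here is a minimal sketch. It assumes a plain per-pixel absolute difference between the input image and each inpainted image, averaged over the masks that cover a pixel; the actual framework may well compare images in a learned feature space instead. The names `anomaly_map` and `image_score` are hypothetical, and images are small grayscale grids stored as nested lists.

```python
def anomaly_map(original, inpaint_results):
    """Average |original - inpainted| over all masks covering each pixel.

    original        : 2D list of pixel intensities
    inpaint_results : list of (mask, inpainted) pairs, where mask is a 2D
                      list of 0/1 flags marking the region that was inpainted
    """
    h, w = len(original), len(original[0])
    total = [[0.0] * w for _ in range(h)]
    count = [[0] * w for _ in range(h)]
    for mask, inpainted in inpaint_results:
        for y in range(h):
            for x in range(w):
                if mask[y][x]:
                    # A region inpainted "as normal" differs strongly from
                    # the input only where the input was anomalous.
                    total[y][x] += abs(original[y][x] - inpainted[y][x])
                    count[y][x] += 1
    return [
        [total[y][x] / count[y][x] if count[y][x] else 0.0 for x in range(w)]
        for y in range(h)
    ]


def image_score(amap):
    """Image-level score: the largest pixel-level score (an assumed choice)."""
    return max(max(row) for row in amap)
```

The pixel-level map corresponds to the segmentation output and the image-level score to the classification output; taking the maximum is only one of several reasonable aggregation rules.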
Recently, pioneering research has proposed a large number of acoustic features (log power spectrogram, linear frequency cepstral coefficients, constant Q cepstral coefficients, etc.) for audio deepfake detection, obtaining good performance and showing that different subbands contribute differently to audio deepfake detection. However, this work does not explain what specific information each subband carries, and these features also discard information such as phase. In speech synthesis, fundamental frequency (F0) information is used to improve the quality of synthetic speech, yet the F0 of synthetic speech is still too averaged and differs significantly from that of real speech. F0 is therefore expected to be an important cue for discriminating between bona fide and fake speech, but it cannot be used directly because of its irregular distribution. Instead, the frequency band containing most of the F0 is selected as the input feature. Meanwhile, to make full use of the phase and full-band information, we also propose to use real and imaginary spectrogram features as complementary input features and to model the disjoint subbands separately. Finally, the results from the F0 subband, real spectrogram, and imaginary spectrogram features are fused. Experimental results on the ASVspoof 2019 LA dataset show that the proposed system is very effective for audio deepfake detection, achieving an equal error rate (EER) of 0.43%, which surpasses almost all other systems.
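
For readers unfamiliar with the reported metric, the EER is the operating point at which the false acceptance rate (spoofed speech accepted as genuine) equals the false rejection rate (genuine speech rejected). The sketch below estimates it with a simple threshold sweep; it assumes higher scores mean "more likely bona fide", and the function name `equal_error_rate` is chosen for this example. Official ASVspoof scoring tools may interpolate differently.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    Both lists must be non-empty. A trial is accepted when score >= threshold.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: bona fide trials scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed trials scored at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

With finitely many trials the two rates rarely coincide exactly, so the sketch reports their midpoint at the closest crossing.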
Heart rate is closely related to physiological and psychological states, and video-based techniques such as imaging photoplethysmography (IPPG) have been developed for heart rate detection. Although some IPPG-based methods address the impact of illumination changes on heart rate detection, they perform poorly in environments with intense or complex illumination. This study proposes a framework that uses normalized least mean square (NLMS) adaptive filtering and singular spectrum analysis to counter the effects of illumination changes on heart rate detection. Experimental results on a dataset comprising 13 men and women aged 20 to 28 demonstrate the feasibility of the method under illumination changes.
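
The normalized least mean square filter named above can be sketched in a few lines. The example assumes a standard noise-cancellation arrangement: `desired` is the raw IPPG trace (pulse plus illumination interference) and `reference` is a signal correlated with the illumination only, for instance the brightness of a background region. That arrangement, the function name `nlms_filter`, and the default parameter values are assumptions made for illustration; the paper's exact configuration is not given in the abstract.

```python
def nlms_filter(reference, desired, order=4, mu=0.5, eps=1e-8):
    """Run a normalized LMS adaptive filter and return (errors, weights).

    The error signal e[n] = desired[n] - w . x[n] is what remains after the
    part of `desired` that is predictable from `reference` has been removed.
    """
    w = [0.0] * order
    errors = []
    for n in range(len(desired)):
        # Most recent `order` reference samples, newest first, zero-padded.
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(order)]
        y = sum(wi * xi for wi, xi in zip(w, x))  # filter output
        e = desired[n] - y                        # cleaned sample
        norm = eps + sum(xi * xi for xi in x)     # input power (normalizer)
        # NLMS update: step size scaled by the instantaneous input power.
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        errors.append(e)
    return errors, w
```

Dividing the step by the input power is what distinguishes NLMS from plain LMS: it keeps the adaptation stable when the illumination level, and hence the reference amplitude, changes abruptly. The update is stable for step sizes between 0 and 2.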
Independent Component Analysis (ICA) is often used to separate movement-related independent components (MRICs) from electroencephalogram (EEG) data. However, obtaining robust spatial filters has commonly relied on complex characteristic features, which are manually selected in most cases. This study proposes a new, simple algorithm that extracts MRICs automatically using only the spatial distribution pattern of the ICs. The main goal of this study is to show the relationship between spatial filter performance and the samples used to design the filters. EEG data containing mixed brain states (preparation, motor imagery, and rest) were used to design spatial filters. Single-class data were also used to compute spatial filters, to assess whether the MRICs extracted with spatial filters designed on different motor imagery classes are similar. Furthermore, spatial filters constructed from one subject's EEG data were applied to extract other subjects' MRICs. Finally, the different spatial filters were applied to single-trial EEG to extract MRICs, and Support Vector Machine (SVM) classifiers were used to discriminate left-hand, right-hand, and foot imagery movements in BCI Competition IV Dataset 2a, which contains four-class motor imagery data from nine subjects. The results suggest that any segment of finite motor imagery EEG samples can be used to design ICA spatial filters, and that the extracted MRICs are consistent if the electrode positions are the same, which confirms the robustness and practicality of ICA in motor imagery Brain Computer Interface (MI-BCI) systems.
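
One way to picture selecting MRICs from spatial distribution alone is sketched below. It assumes a hypothetical criterion: a component counts as movement-related when the electrode carrying its largest absolute weight in the ICA mixing matrix lies over the sensorimotor cortex (C3, Cz, or C4). This rule, the function name `select_mrics`, and the channel list are illustrative assumptions and not necessarily the criterion used in the study.

```python
def select_mrics(mixing, channels, motor_channels=("C3", "Cz", "C4")):
    """Pick components whose spatial pattern peaks at a motor electrode.

    mixing   : 2D list, mixing[ch][ic] is the weight of component `ic` at
               electrode `ch` (each column is one component's scalp pattern)
    channels : electrode names, in the same order as the rows of `mixing`
    Returns a dict mapping component index to the electrode where it peaks.
    """
    selected = {}
    n_ics = len(mixing[0])
    for ic in range(n_ics):
        # Absolute weights: the sign of an ICA component is arbitrary.
        pattern = [abs(row[ic]) for row in mixing]
        peak = channels[pattern.index(max(pattern))]
        if peak in motor_channels:
            selected[ic] = peak
    return selected
```

Because a rule of this kind depends only on electrode positions, it is consistent with the observation that the extracted MRICs agree across subjects and data segments when the electrode layout is the same.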