Dyslexia is a learning disability that makes reading and language-related tasks more challenging. In recent years, the use of eye tracking to analyze eye movements during reading in children with dyslexia has received increasing attention. However, most previous studies have focused on English-speaking children, leaving limited research applicable to Vietnamese-speaking children, whose language differs from English in important respects. This study aims to develop a system that uses eye-tracking technology to provide a visual representation of eye movements in Vietnamese children with dyslexia, helping us understand their visual strategies during reading. We designed a system that captures children's eye movements through several tests suitable for Vietnamese children, allowing us to analyze specific eye-movement characteristics during reading, such as fixations and saccades, and to visualize them as heatmaps or scan paths. We conducted tests on both children with dyslexia and typically developing children to make an initial comparison of their eye-movement patterns; the findings suggest the system's potential for subsequent work, such as detection, intervention, and the development of applications that assist Vietnamese children with dyslexia based on their visual reading strategies.
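To make the fixation and saccade analysis concrete, below is a minimal sketch of dispersion-based fixation detection (I-DT), one common technique for deriving fixations from raw gaze samples. The sample format, thresholds, and function name are illustrative assumptions, not details taken from the study.

```python
# Minimal I-DT (dispersion-threshold) fixation detection sketch.
# Assumes gaze samples as (timestamp_ms, x_px, y_px) tuples; the
# thresholds are illustrative, not the study's settings.

def detect_fixations(samples, max_dispersion=35.0, min_duration=100.0):
    """Return fixations as (start_ms, end_ms, cx, cy); the gaze
    movements between fixations can be treated as saccades."""
    fixations, i, n = [], 0, len(samples)
    while i < n:
        j = i
        while j < n:
            xs = [s[1] for s in samples[i:j + 1]]
            ys = [s[2] for s in samples[i:j + 1]]
            if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion:
                break  # adding sample j would exceed the dispersion limit
            j += 1
        window = samples[i:j]
        if window and window[-1][0] - window[0][0] >= min_duration:
            cx = sum(s[1] for s in window) / len(window)
            cy = sum(s[2] for s in window) / len(window)
            fixations.append((window[0][0], window[-1][0], cx, cy))
            i = j  # continue after the accepted fixation window
        else:
            i += 1  # window too short; slide forward by one sample
    return fixations
```

The resulting fixation centroids and durations are exactly the quantities a heatmap or scan-path visualization is built from.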
Many documents written in Sino-Nom capture a wide range of historical, political, and literary facets of Vietnamese culture. To contribute to preserving these documents, we focus on building Optical Character Recognition (OCR) models for Sino-Nom characters using a semi-supervised approach. We also propose a pipeline for collecting the Sino-Nom document images used to train and evaluate our models. Evaluated on our collected dataset, our OCR baseline achieves F1 scores above 0.97 in detection and top-1 and top-5 accuracies of 80.1% and 90.1% in recognition, respectively.
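As a concrete reference for the reported recognition metrics, here is a minimal sketch of how top-1 and top-5 accuracies can be computed from classifier scores. The array names and shapes are assumptions for illustration, not the authors' code.

```python
import numpy as np

# `scores` is an (N, C) array of per-class scores for N character
# crops over C character classes; `labels` holds the true class
# indices -- both names are hypothetical.

def top_k_accuracy(scores, labels, k=5):
    topk = np.argsort(scores, axis=1)[:, -k:]  # k highest-scoring classes per sample
    hits = (topk == np.asarray(labels)[:, None]).any(axis=1)
    return hits.mean()

# Usage: top_k_accuracy(scores, labels, k=1) gives top-1 accuracy,
# top_k_accuracy(scores, labels, k=5) gives top-5 accuracy.
```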
Background: Eye-tracking-based communication systems open up opportunities for interaction for many patients with severe motor impairments. In these systems, users must hold their gaze on a key until it is fully entered. However, these systems are not yet refined in terms of design and interaction efficiency, and they can still cause discomfort for users, such as eye strain or typing errors. Some solutions have been proposed to address this issue, but so far no comprehensive solution has been found. Objective: In this research, we propose a novel method to adjust data-input speed in human-computer interfaces controlled by eye gaze and electroencephalography (EEG) data. Methods: The method uses EEG data to extract the user's attention level and then flexibly adjusts the time users must keep their gaze on a key, personalizing the experience with their own data. This approach aims to enhance interaction efficiency by increasing interaction speed while maintaining accuracy. We evaluated the method on an eye-tracking (ET)-based spelling communication system for Vietnamese people, proposed by our team, with 20 healthy individuals and 4 people with motor-function impairments. Results: Communication speed through the system increased by 20–80% across participants, and not only did typing time improve, but communication effectiveness also increased linearly. This outcome held for both healthy individuals and patients. Conclusions: Through experimentation, we demonstrated the feasibility and effectiveness of our approach, showing improvements in typing speed over successive trials. This result is meaningful because it optimizes dwell time and interaction efficiency in the ET-based system without the higher error rate that comes from directly reducing dwell time.
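To illustrate the core idea, here is a minimal sketch of attention-adaptive dwell time, assuming a normalized EEG attention score in [0, 1] and a simple linear mapping. The bounds and the mapping itself are hypothetical placeholders, not the parameters used in the study.

```python
# Sketch: map a normalized EEG attention score to the gaze dwell
# time required to select a key. All constants are assumed values.

BASE_DWELL_MS = 1000.0  # dwell time at the lowest attention level
MIN_DWELL_MS = 300.0    # floor, so selections never become accidental

def adaptive_dwell_ms(attention):
    """Higher attention -> shorter dwell, clamped to [MIN, BASE]."""
    attention = min(max(attention, 0.0), 1.0)
    return BASE_DWELL_MS - attention * (BASE_DWELL_MS - MIN_DWELL_MS)

# e.g. a focused user (attention 0.9) types with a 370 ms dwell,
# while a distracted user (attention 0.2) keeps a safer 860 ms.
```

The design intent matches the abstract: speed is gained by shortening dwell only when the EEG signal indicates the user is attentive, rather than by lowering the dwell threshold for everyone.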
Background: Cell-free circulating DNA (cfDNA) fragments exhibit non-random patterns in their length (FLEN), end motif (EM), and distance to nucleosome position (ND). While these cfDNA features have shown promise as inputs for machine learning and deep learning models in early cancer detection, most studies use them as raw inputs, overlooking the potential benefits of pre-processing to extract cancer-specific features. This study aims to enhance cancer detection accuracy by developing a novel approach to feature extraction from cfDNA fragmentomics. Methods: We implemented a supervised non-negative matrix factorization (SNMF) algorithm to generate embedding vectors capturing cancer-specific signals within cfDNA fragmentomic features. These embeddings served as input to a machine learning model that classifies cancer patients versus healthy individuals. Results: We validated our framework on two datasets: an in-house cohort of 431 cancer patients and 442 healthy individuals (dataset 1), and a published cohort comprising 90 hepatocellular carcinoma (HCC) patients and 103 individuals with cirrhosis or hepatitis B (dataset 2). In dataset 1, we achieved an AUC of 94% in pan-cancer detection. In dataset 2, our framework achieved an AUC of 100% for classifying HCC vs healthy individuals, 99% for HCC vs non-HCC patients, and 96% for identifying HCC patients within a mixed group of non-HCC patients and healthy donors. Conclusion: This study demonstrates the effectiveness of SNMF-transformed features in improving both pan-cancer detection and specific HCC detection. Our approach offers a significant advance in leveraging cfDNA fragmentomics for early cancer detection, potentially enhancing diagnostic accuracy in clinical settings.
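To illustrate the embed-then-classify pipeline, here is a minimal sketch that uses scikit-learn's standard (unsupervised) NMF as a stand-in for the study's supervised NMF, followed by a logistic-regression classifier. The data shapes, feature layout, and hyperparameters are placeholder assumptions, not the study's configuration.

```python
# Sketch: factorize non-negative cfDNA fragmentomic features into a
# low-dimensional embedding, then classify on the embedding. Plain
# NMF stands in for the paper's supervised variant (SNMF).
import numpy as np
from sklearn.decomposition import NMF
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

X = np.random.rand(100, 350)       # placeholder (samples x FLEN/EM/ND bins)
y = np.random.randint(0, 2, 100)   # placeholder labels: 1 = cancer, 0 = healthy

clf = make_pipeline(
    NMF(n_components=10, init="nndsvda", max_iter=500, random_state=0),
    LogisticRegression(max_iter=1000),
)
clf.fit(X, y)   # NMF produces the embedding; the classifier separates classes
```

A supervised factorization differs in that the label information shapes the learned components, which is what lets the embedding concentrate cancer-specific signal rather than generic variance.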
Reconstructing 3D human pose from a single-view camera is a difficult and challenging problem. Many approaches have been proposed, but most process frames independently, even though consecutive frames in a pose sequence are highly correlated. In contrast, we introduce a novel spatial-temporal 3D reconstruction framework that leverages both intra-frame and inter-frame relationships in consecutive 2D pose sequences, implemented with the Orthogonal Matching Pursuit (OMP) algorithm, pre-trained pose-angle limits, and temporal models. We quantitatively compare our framework with recent works on the CMU motion capture dataset and on Vietnamese traditional dance sequences. Our method outperforms the others, with 10% lower Euclidean reconstruction error and greater robustness against Gaussian noise. Our reconstructed 3D pose sequences are also smoother and more natural than those of other methods.
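To illustrate the sparse-coding step, here is a minimal sketch of OMP recovering a pose as a sparse combination of dictionary atoms, using scikit-learn's OrthogonalMatchingPursuit. The random dictionary, the pose dimensionality, and the sparsity level are illustrative assumptions, not the paper's setup, where the dictionary would be learned from training poses.

```python
# Sketch: represent a flattened pose as a sparse combination of
# basis poses (dictionary atoms) via Orthogonal Matching Pursuit.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
D = rng.standard_normal((51, 200))  # assumed: 17 joints x 3 coords, 200 atoms
x = D[:, [3, 42, 117]] @ np.array([0.5, 1.2, -0.7])  # pose built from 3 atoms

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=3, fit_intercept=False)
omp.fit(D, x)                 # recover the sparse coefficient vector
x_hat = D @ omp.coef_         # reconstruct the pose from the dictionary
print(np.linalg.norm(x - x_hat))  # near zero on this noiseless example
```

In the full framework, per-frame sparse reconstructions of this kind would additionally be constrained by the pose-angle limits and the temporal models so that the recovered sequence stays anatomically plausible and smooth across frames.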