Scene text recognition is a rapidly developing field that faces numerous challenges due to the complexity and diversity of scene text, including complex backgrounds, diverse fonts, flexible arrangements, and accidental occlusions. In this paper, we propose a novel approach called Class-Aware Mask-guided feature refinement (CAM) to address these challenges. Our approach introduces canonical class-aware glyph masks generated from a standard font to effectively suppress background and text style noise, thereby enhancing feature discrimination. Additionally, we design a feature alignment and fusion module to incorporate the canonical mask guidance for further feature refinement for text recognition. By enhancing the alignment between the canonical mask feature and the text feature, the module ensures more effective fusion, ultimately leading to improved recognition performance. We first evaluate CAM on six standard text recognition benchmarks to demonstrate its effectiveness. Furthermore, CAM exhibits superiority over the state-of-the-art method by an average performance gain of 4.1% across six more challenging datasets, despite utilizing a smaller model size. Our study highlights the importance of incorporating canonical mask guidance and aligned feature refinement techniques for robust scene text recognition. The code is available at https://github.com/MelosY/CAM.
Parkinson’s disease (PD) is a serious neurodegenerative disorder marked by significant clinical and progression heterogeneity. This study aimed to address the heterogeneity of PD through integrative analysis of various data modalities. We analyzed clinical progression data (≥5 years) of individuals with de novo PD using machine learning and deep learning, to characterize individuals’ phenotypic progression trajectories for PD subtyping. We discovered three pace subtypes of PD exhibiting distinct progression patterns: the Inching Pace subtype (PD-I) with mild baseline severity and mild progression speed; the Moderate Pace subtype (PD-M) with mild baseline severity but advancing at a moderate progression rate; and the Rapid Pace subtype (PD-R) with the most rapid symptom progression rate. We found cerebrospinal fluid P-tau/α-synuclein ratio and atrophy in certain brain regions as potential markers of these subtypes. Analyses of genetic and transcriptomic profiles with network-based approaches identified molecular modules associated with each subtype. For instance, the PD-R-specific module suggested STAT3, FYN, BECN1, APOA1, NEDD4, and GATA2 as potential driver genes of PD-R. It also suggested neuroinflammation, oxidative stress, metabolism, PI3K/AKT, and angiogenesis pathways as potential drivers for rapid PD progression (i.e., PD-R). Moreover, we identified repurposable drug candidates by targeting these subtype-specific molecular modules using a network-based approach and cell line drug-gene signature data. We further estimated their treatment effects using two large-scale real-world patient databases; the real-world evidence we gained highlighted the potential of metformin in ameliorating PD progression. In conclusion, this work helps better understand the clinical and pathophysiological complexity of PD progression and accelerate precision medicine.
In recent years, with the development of mobile ad hoc network (MANET) technology, various overlay multicast routing protocols have been proposed for MANETs. These protocols have distinguishing features and different applications. To provide a comprehensive understanding of these protocols, and to better organize existing ideas and work so as to facilitate overlay multicast routing design for MANETs, we briefly survey research on the main kinds of overlay multicast routing protocols. This paper aims to aid MANET researchers and application developers in selecting appropriate overlay multicast routing protocols for their work.
We propose a novel spectrum decision scheme (i.e., channel selection and handoff) for wireless mesh networks (WMNs) that use multiple channels and nodes equipped with multi-beam directional antennas. Our scheme has the following features: (i) It performs spectrum decision by considering various WMN parameters, including the channel quality, beam orientation, antenna-caused deafness and capture effects, and application priority level. (ii) It uses a reinforcement learning (RL)-based spectrum decision process to achieve the optimal quality of multimedia transmission in the long term. However, a newly joined WMN node could take a long time to make a correct spectrum decision because the initial RL parameters are difficult to choose. Therefore, our scheme uses apprenticeship learning in conjunction with the RL model to speed up the spectrum decision process by choosing a suitable neighboring node (called the "expert") to teach a newly joined node (called the "apprentice"). Our experiments demonstrate that the proposed spectrum decision scheme improves the network performance and multimedia transmission quality.
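As a rough illustration of the RL-based channel selection with an expert warm start, the following is a minimal sketch only. It assumes a tabular Q-learning agent whose states and rewards are left abstract, and it stands in for apprenticeship learning with a plain copy of the expert's Q-table, which is a simplification and not the scheme's actual teaching procedure. The names `select_channel`, `q_update`, and `init_from_expert` are hypothetical.

```python
import random


def select_channel(q_row, epsilon, rng):
    """Epsilon-greedy choice among channels for one state."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))  # explore a random channel
    return max(range(len(q_row)), key=lambda c: q_row[c])  # exploit best estimate


def q_update(q, state, channel, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place."""
    best_next = max(q[next_state])
    q[state][channel] += alpha * (reward + gamma * best_next - q[state][channel])


def init_from_expert(expert_q):
    """Assumed warm start: the apprentice begins from a copy of the expert's table."""
    return [row[:] for row in expert_q]


if __name__ == "__main__":
    rng = random.Random(0)
    expert_q = [[0.2, 0.8], [0.5, 0.1]]      # hypothetical expert knowledge
    apprentice_q = init_from_expert(expert_q)
    channel = select_channel(apprentice_q[0], epsilon=0.1, rng=rng)
    q_update(apprentice_q, 0, channel, reward=1.0, next_state=1)
    print(channel, apprentice_q)
```

Starting from the expert's table rather than from zeros means the apprentice's first greedy choices already reflect a neighbor's experience, which is the intuition behind shortening the initial learning phase.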
Highly distributed data management platforms (e.g., PNUTS, Dynamo, Cassandra, and BigTable) are rapidly becoming the favorite choice for hosting modern web applications in the cloud. Among other features, these platforms rely on data replication and relaxed consistency to achieve high levels of performance and scalability. However, these design choices often exhibit a trade-off between performance SLA and data currency. In this paper, in addition to performance SLAs, we also consider an application's tolerance to data staleness as another requirement determining end-user satisfaction, and our goal is to strike a fine balance between the quality of service and the quality of data perceived by the end-user. To that end, we propose scheduling policies and mechanisms for efficiently allocating the resources at each replica node so as to meet the conflicting requirements of user queries and replica updates. Our experimental results show that employing our scheduling schemes for resource allocation can provide significant improvements in the overall system utility when compared to existing policies.
This paper describes a computer vision based system for real-time robust traffic sign detection, tracking, and recognition. Such a framework is of major interest for driver assistance in an intelligent automotive cockpit environment. The proposed approach consists of two components. First, signs are detected using a set of Haar wavelet features obtained from AdaBoost training. Compared to previously published approaches, our solution offers a generic, joint modeling of color and shape information without the need of tuning free parameters. Once detected, objects are efficiently tracked within a temporal information propagation framework. Second, classification is performed using Bayesian generative modeling. Making use of the tracking information, hypotheses are fused over multiple frames. Experiments show high detection and recognition accuracy and a frame rate of approximately 10 frames per second on a standard PC.
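The multi-frame hypothesis fusion can be pictured with a small sketch. It assumes each tracked frame yields a vector of per-class likelihoods and that frames are conditionally independent given the class, so the fused belief is a normalized running product. This is a generic recursive Bayes update and not necessarily the fusion rule used by the system; `fuse_frames` is a hypothetical name.

```python
def fuse_frames(frame_likelihoods, prior=None):
    """Fuse per-frame class likelihoods into one posterior over classes.

    Assumes conditional independence between frames (an assumption of this
    sketch), so the posterior is proportional to prior * product of likelihoods.
    """
    num_classes = len(frame_likelihoods[0])
    belief = list(prior) if prior is not None else [1.0 / num_classes] * num_classes
    for likelihood in frame_likelihoods:
        belief = [b * l for b, l in zip(belief, likelihood)]
        total = sum(belief)
        if total == 0.0:
            raise ValueError("all classes have zero probability")
        belief = [b / total for b in belief]  # renormalize after each frame
    return belief


if __name__ == "__main__":
    # Two frames that each weakly favor class 0 give a stronger fused belief.
    print(fuse_frames([[0.6, 0.4], [0.6, 0.4]]))
```

Accumulating evidence this way lets several individually uncertain frames produce a confident classification for a tracked sign.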
Highly distributed data management platforms (e.g., PNUTS, Dynamo, Cassandra, and BigTable) are rapidly becoming the favorite choice for hosting modern web applications in the cloud. Among other features, these platforms rely on data partitioning, replication and relaxed consistency to achieve high levels of performance and scalability. However, these design choices often exhibit a trade-off between performance and data freshness. In this paper, in addition to performance SLAs, we also consider an application's tolerance to data staleness as another requirement determining end-user satisfaction, and our goal is to strike a fine balance between the quality of service (QoS) and the quality of data (QoD) perceived by the end-user. To that end, we propose scheduling policies and mechanisms for efficiently allocating the resources at each replica node so as to meet the conflicting requirements of user queries and replica updates. Our experimental results show that employing our scheduling strategies for resource allocation can provide significant improvements in the overall system utility when compared to the existing ones.
Network function virtualization (NFV) is critical to the scalability and flexibility of various network services in the form of service function chains (SFCs), which refer to a set of Virtual Network Functions (VNFs) chained in a specific order. However, NFV performance struggles to meet the ever-increasing requirements of network services, mainly due to the static orchestration of SFCs. To tackle this issue, a novel Scalable SFC Orchestration (SSCO) scheme is proposed in this paper for NFV-enabled networks via federated reinforcement learning. SSCO has three notable characteristics that distinguish it from previous work: (1) A federated-learning-based framework is designed to train a global learning model, with time-variant local model explorations, for scalable SFC orchestration, while avoiding data sharing among stakeholders; (2) SSCO allows parameter updates between local clients and the cloud server only at the first and last epochs of each episode, so that distributed clients can optimize their models at a low communication cost; (3) SSCO introduces an efficient deep reinforcement learning (DRL) approach, with local learning knowledge of available resources and instantiation cost, to map VNFs into networks flexibly. Furthermore, a loss-weight-based mechanism is proposed to generate and exploit reference samples in replay buffers for future training, avoiding strong correlation among samples. Simulation results obtained from different working scenarios demonstrate that SSCO can significantly reduce placement errors and improve the resource utilization ratio when placing time-variant VNFs compared with state-of-the-art mechanisms. Furthermore, the results show that the proposed approach can achieve desirable scalability.
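The communication pattern in characteristic (2) can be sketched as follows. This is a minimal illustration under stated assumptions: model parameters are flat lists of floats, the server combines them with a weighted average in the style of federated averaging (the abstract does not name the aggregation rule), and epochs are indexed from zero. The names `federated_average` and `should_sync` are hypothetical.

```python
def federated_average(local_params, weights=None):
    """Weighted element-wise average of client parameter vectors (assumed rule)."""
    if weights is None:
        weights = [1.0] * len(local_params)
    total = sum(weights)
    dim = len(local_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, local_params)) / total
        for i in range(dim)
    ]


def should_sync(epoch, epochs_per_episode):
    """Exchange parameters only at the first and last epoch of an episode."""
    return epoch == 0 or epoch == epochs_per_episode - 1


if __name__ == "__main__":
    clients = [[1.0, 2.0], [3.0, 4.0]]  # hypothetical local model parameters
    for epoch in range(5):
        if should_sync(epoch, 5):
            print(epoch, federated_average(clients))
```

Restricting exchanges to two epochs per episode keeps the number of client-server messages independent of episode length, which is where the communication saving comes from.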