This paper reviews the second NTIRE challenge on image dehazing (restoration of rich details in hazy images), with a focus on the proposed solutions and results. The training data consists of 55 hazy images (with dense haze generated in an indoor or outdoor environment) and their corresponding ground-truth (haze-free) images of the same scenes. The dense haze was produced using a professional haze/fog generator that imitates the real conditions of hazy scenes. The evaluation consists of comparing the dehazed images with the ground-truth images. The dehazing process was learnable through the provided pairs of haze-free and hazy training images. There were about 270 registered participants, and 23 teams competed in the final testing phase. Their solutions gauge the state of the art in image dehazing.
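The comparison of a dehazed image with its ground truth can be illustrated with a fidelity score. The abstract does not name the metric used, so the sketch below assumes peak signal-to-noise ratio (PSNR), one common choice for restoration tasks; the function name `psnr` and the 8-bit default `max_value=255.0` are illustrative assumptions.

```python
import numpy as np

def psnr(dehazed, ground_truth, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two images of the same shape.

    Assumption: pixel values lie in [0, max_value]; 255.0 suits 8-bit images.
    """
    dehazed = np.asarray(dehazed, dtype=np.float64)
    ground_truth = np.asarray(ground_truth, dtype=np.float64)
    if dehazed.shape != ground_truth.shape:
        raise ValueError("images must have the same shape")
    # Mean squared error over all pixels and channels.
    mse = np.mean((dehazed - ground_truth) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```

A higher PSNR means the dehazed output is closer to the haze-free reference; identical images give an infinite score.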
Informative sensing improves the effectiveness and efficiency of environmental monitoring. This paper investigates a deep learning approach that can estimate the optimal results of informative sensing in a random field. First, a Gaussian process (GP) is used to characterize an environmental field. Then, mutual information (MI) over the covariance function of the GP is used to define the near-optimal locations as the most informative sampling places in the field. Finally, a deep neural network is developed to learn the correlation between the covariance function and the most informative sampling places, which can provide near-online planning for the MI optimization process. Experimental results on a real-world dataset validate the proposed learning framework.
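The MI-based selection step that the network learns to approximate can be sketched with the standard greedy rule for GP sensor placement: at each step, pick the candidate whose variance given the chosen set is largest relative to its variance given all remaining candidates. This is a minimal sketch, not the paper's implementation; the RBF kernel, the noise level, and all function names are assumptions.

```python
import numpy as np

def rbf_covariance(points, length_scale=1.0, noise=1e-3):
    """Covariance matrix of an RBF kernel plus a small noise term (assumed kernel)."""
    pts = np.asarray(points, dtype=np.float64)
    if pts.ndim == 1:
        pts = pts[:, None]
    sq_dist = np.sum((pts[:, None, :] - pts[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale ** 2) + noise * np.eye(len(pts))

def conditional_variance(K, y, given):
    """Variance of location y after conditioning on the locations in `given`."""
    if len(given) == 0:
        return K[y, y]
    K_gg = K[np.ix_(given, given)]
    k_gy = K[given, y]
    return K[y, y] - k_gy @ np.linalg.solve(K_gg, k_gy)

def greedy_mi_placement(K, k):
    """Greedily pick k locations that approximately maximize mutual information."""
    n = K.shape[0]
    selected = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for y in range(n):
            if y in selected:
                continue
            # Unselected candidates other than y.
            rest = [i for i in range(n) if i != y and i not in selected]
            score = conditional_variance(K, y, selected) / conditional_variance(K, y, rest)
            if score > best_score:
                best, best_score = y, score
        selected.append(best)
    return selected
```

Each greedy step costs one linear solve per candidate, which is the kind of expense that motivates replacing the optimization with a learned near-online predictor.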
An unreasonable energy structure and socioeconomic activities are coupled with adverse effects on air quality in China. In this paper, input-output analysis (IOA) and ecological network analysis (ENA) are combined to investigate the embodied PM2.5 emissions that result from the current monetary flows and energy structure. The results show that, in 2010, 34 percent of the total PM2.5 emissions, or 59.4 kt of PM2.5, were indirect emissions traded through economic sectors within Beijing. Based on the results of ENA, we found that "Smelting & Pressing of Metals", "Metal Products" and "Nonmetal Mineral Products" are the top three sectors with the highest control levels, while "Agriculture", "Catering Services" and "Residential Services" are the lowest-ranking sectors in the system. Distributing the indirect emissions to their original sectors and identifying the control relationship of each sector provide a new perspective for formulating impartial and effective policies to alleviate air pollution.
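The IOA part of the analysis can be illustrated with the Leontief inverse, which traces emissions from the emitting sector to the sector whose final demand ultimately causes them. The sketch below covers only this step, not ENA. Its function names, and the convention of treating off-diagonal flows as "indirect", are assumptions rather than the paper's definitions.

```python
import numpy as np

def embodied_emission_flows(A, direct_intensity, final_demand):
    """Emission flows between sectors under a single-region input-output model.

    A                : (n, n) technical coefficient matrix (inputs per unit output)
    direct_intensity : (n,) emissions per unit of output of each sector
    final_demand     : (n,) final demand for each sector's product
    Returns flows, where flows[i, j] is the emission released by sector i
    that is embodied in the final demand for sector j.
    """
    A = np.asarray(A, dtype=np.float64)
    f = np.asarray(direct_intensity, dtype=np.float64)
    y = np.asarray(final_demand, dtype=np.float64)
    leontief_inverse = np.linalg.inv(np.eye(A.shape[0]) - A)
    return f[:, None] * leontief_inverse * y[None, :]

def indirect_share(flows):
    """Share of emissions passed between different sectors (assumed convention)."""
    total = flows.sum()
    return (total - np.trace(flows)) / total
```

Row sums of `flows` give each sector's production-based emissions, and column sums give the emissions embodied in each sector's final demand, so the two accounting views always add up to the same total.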
Deepfake techniques can forge the visual or audio signals in a video, which leads to inconsistencies between the visual and audio (VA) signals. Multimodal detection methods therefore expose deepfake videos by extracting VA inconsistencies. Recently, deepfake technology has begun to forge visual and audio signals jointly to obtain more realistic deepfake videos, which poses new challenges for extracting VA inconsistencies. Recent multimodal detection methods first extract natural VA correspondences from real videos in a self-supervised manner, and then use the learned real correspondences as targets to guide the extraction of VA inconsistencies in the subsequent deepfake detection stage. However, the inherent VA relations are difficult to extract because of the modality gap, which limits the auxiliary benefit of these self-supervised methods. In this paper, we propose Predictive Visual-audio Alignment Self-supervision for Multimodal Deepfake Detection (PVASS-MDD), which consists of a PVASS auxiliary stage and an MDD stage. In the PVASS auxiliary stage, which uses real videos, we first devise a three-stream network that associates two augmented visual views with the corresponding audio clues, so as to explore common VA correspondences through cross-view learning. Second, we introduce a novel cross-modal predictive alignment module that eliminates VA gaps to provide inherent VA correspondences. In the MDD stage, we propose an auxiliary loss that uses the frozen PVASS network to align the VA features of real videos, which helps the multimodal deepfake detector capture subtle VA inconsistencies. We conduct extensive experiments on widely used and recent multimodal deepfake datasets. Our method obtains a significant performance improvement over state-of-the-art methods.
Fine particulate matter (PM2.5) can be a major hazardous constituent of air pollution in the ambient environment. Previous studies have shown that fossil fuel combustion is the main source of PM2.5 emissions. However, little is known about the socioeconomic factors driving the growth of PM2.5 emissions, which have been hidden from the public by the geographical separation of production and consumption activities. In this paper, we investigate the characteristics and drivers of PM2.5 emissions within China based on multi-regional input-output (MRIO) analysis. MRIO analysis is applied to allocate the physical PM2.5 emissions along the supply chains to the final consumers. A series of socioeconomic factors is selected to diagnose the driving forces of PM2.5 emissions. This examination of the drivers of China's PM2.5 emissions is essential for pollution mitigation, as it can guide policy-making and technology development targets.
Skeleton-based human action recognition has attracted increasing attention in recent years. However, most existing works focus on supervised learning, which requires a large number of annotated action sequences that are often expensive to collect. We investigate unsupervised representation learning for skeleton action recognition, and design a novel skeleton cloud colorization technique that is capable of learning skeleton representations from unlabeled skeleton sequence data. Specifically, we represent a skeleton action sequence as a 3D skeleton cloud and colorize each point in the cloud according to its temporal and spatial orders in the original (unannotated) skeleton sequence. Leveraging the colorized skeleton point cloud, we design an auto-encoder framework that can learn spatial-temporal features from the artificial color labels of skeleton joints effectively. We evaluate our skeleton cloud colorization approach with action classifiers trained under different configurations, including unsupervised, semi-supervised and fully-supervised settings. Extensive experiments on the NTU RGB+D and NW-UCLA datasets show that the proposed method outperforms existing unsupervised and semi-supervised 3D action recognition methods by large margins, and that it achieves competitive performance in supervised 3D action recognition as well.
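The colorization idea can be sketched by stacking all frames of a sequence into one point cloud and attaching to each point a label derived from its frame index (temporal order) and joint index (spatial order). The abstract does not specify the color mapping, so the linear ramps below are placeholder assumptions, as is the function name.

```python
import numpy as np

def colorize_skeleton_cloud(sequence):
    """Turn a skeleton sequence into a point cloud with order-based color labels.

    sequence : (T, J, 3) array of 3D joint coordinates for T frames and J joints
    Returns (points, colors):
      points : (T * J, 3) stacked joint coordinates
      colors : (T * J, 2) labels in [0, 1]; column 0 encodes temporal order,
               column 1 encodes spatial (joint) order. Linear ramps are an
               assumption, not the paper's color scheme.
    """
    seq = np.asarray(sequence, dtype=np.float64)
    if seq.ndim != 3 or seq.shape[2] != 3:
        raise ValueError("sequence must have shape (T, J, 3)")
    n_frames, n_joints, _ = seq.shape
    t_idx, j_idx = np.meshgrid(np.arange(n_frames), np.arange(n_joints), indexing="ij")
    temporal = t_idx / max(n_frames - 1, 1)
    spatial = j_idx / max(n_joints - 1, 1)
    points = seq.reshape(n_frames * n_joints, 3)
    colors = np.stack([temporal.ravel(), spatial.ravel()], axis=1)
    return points, colors
```

Because the labels come from the ordering alone, they require no human annotation; an auto-encoder can then be trained to predict `colors` from `points`.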
This paper addresses the problem of 3D hand pose estimation from a monocular RGB image. While previous methods have shown their success, the structure of hands has not been exploited explicitly, which is critical in pose estimation. To this end, we propose a hand-model regularized graph refinement paradigm under an adversarial learning framework, aiming to explicitly capture structural inter-dependencies of hand joints for the learning of intrinsic patterns. We estimate an initial hand pose from a parametric hand model as a prior of hand structure, and refine the structure by learning the deformation of the prior pose via residual graph convolution. To optimize the hand structure further, we propose two bone-constrained loss functions, which characterize the morphable structure of hand poses explicitly. Also, we introduce an adversarial learning framework with a multi-source discriminator to capture structural features, which imposes the constraints onto the distribution of generated 3D hand poses for anthropomorphically valid hand poses. Extensive experiments demonstrate that our model sets the new state-of-the-art in 3D hand pose estimation from a monocular image on standard benchmarks.
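Bone-level constraints of the kind mentioned above can be illustrated with two simple terms. The abstract does not define its two losses, so the bone-length and bone-direction terms below are one plausible reading, written in NumPy for clarity. The `parents` array describing the kinematic tree and all function names are hypothetical.

```python
import numpy as np

def bone_vectors(joints, parents):
    """Vectors from each joint's parent to the joint; the root has parent -1."""
    joints = np.asarray(joints, dtype=np.float64)
    child = [j for j, p in enumerate(parents) if p >= 0]
    parent = [parents[j] for j in child]
    return joints[child] - joints[parent]

def bone_length_loss(pred, target, parents):
    """Mean absolute difference between predicted and target bone lengths."""
    len_pred = np.linalg.norm(bone_vectors(pred, parents), axis=1)
    len_target = np.linalg.norm(bone_vectors(target, parents), axis=1)
    return float(np.mean(np.abs(len_pred - len_target)))

def bone_direction_loss(pred, target, parents, eps=1e-8):
    """Mean distance between predicted and target unit bone directions."""
    b_pred = bone_vectors(pred, parents)
    b_target = bone_vectors(target, parents)
    u_pred = b_pred / (np.linalg.norm(b_pred, axis=1, keepdims=True) + eps)
    u_target = b_target / (np.linalg.norm(b_target, axis=1, keepdims=True) + eps)
    return float(np.mean(np.linalg.norm(u_pred - u_target, axis=1)))
```

The two terms are complementary: uniformly scaling a pose changes only the length term, while rotating a finger changes only the direction term.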
Purpose This study delves into the epidemiology of high-risk human papillomavirus (HR-HPV) infection and its link to precancerous lesions among perimenopausal (40-59 years) and elderly (60-65 years) women in a Chinese county with a notably high incidence of cervical cancer. By uniquely focusing on these age groups in underdeveloped regions, the research aims to offer novel strategies for the management and prevention of cervical cancer. It seeks to inform targeted interventions and public health policies that could significantly benefit women at heightened risk for HPV, addressing a critical gap in current prevention efforts in economically disadvantaged communities. Methods This observational study was conducted at the Maternal and Child Health and Family Planning Service Centre in Lueyang County, from September 2021 to January 2022. It assessed 2008 women aged 40-65 for HPV screening, with 342 undergoing further cytological examination. The study evaluated the prevalence of HPV infection across different age groups and risk categories. It utilized a questionnaire to collect participants' basic information, health behaviors, and other relevant data to analyze factors influencing HR-HPV infection. Statistical analyses comprised chi-square tests, trend analysis, logistic regression, and multiple imputation techniques to address missing data. Results The prevalence of HR-HPV infection among women aged 40-65 years in Lueyang County was 18.43%. Older women exhibited a higher incidence of HPV infection, abnormal ThinPrep Cytology Test (TCT) results (Shaanxi Fu'an Biotechnology Co. Ltd., Baoji City, China), and low/high-grade squamous intraepithelial lesions (LSIL/HSIL) (P<0.05). The most prevalent HR-HPV genotypes in the overall, perimenopausal, and elderly groups were HPV-52, -53, and -58; HPV-52, -53, and -16; and HPV-58, -52, and -53, respectively. 
The prevalent HR-HPV genotypes among women with abnormal results under The Bethesda System (TBS) were HPV-16, -52, -33, and -58; HPV-16, -52, and -58; and HPV-16, -33, and -52. The prevalence of HPV-16, -18, and -33 increased with increasing lesion severity (P<0.05). In this study, factors affecting HR-HPV infection in the three age groups were found to be mainly related to sexual behavior and education level, including history of lower genital tract diseases, multiple pregnancies, contraceptive methods without tubal ligation, age at first marriage greater than 18 years, never washing the vulva after sex, abstinence from sex, education level of junior high school or above, and spouse's education level of high school or above. Conclusions These findings suggest that the elevated rate of abnormal TBS results in the older age group may be attributed to the higher prevalence of persistent infection-prone HR-HPV genotypes (HPV-58, -52, and -53), multiple infections, and potent oncogenic HR-HPV genotypes (HPV-16 and -33). Additionally, the higher HR-HPV prevalence in older patients may be related to lower educational attainment, a reduced screening rate, and limited condom usage. Therefore, strategies targeting perimenopausal and older women should prioritize enhancing health awareness, increasing screening rates, and encouraging condom use.
This study explores the dynamic deflection and vibrational analysis of a spherical shell, modeled as a football, reinforced with graphene platelet nanocomposites (GPLs). The analysis leverages the Carrera Unified Formulation (CUF) for accurate and efficient modeling of the mechanical behavior of the shell under dynamic loads. CUF's flexibility in adapting to complex geometries and material properties is used to represent the heterogeneous reinforcement of GPLs within the spherical shell structure. To enhance the reliability of the computational results, a hybrid artificial intelligence (AI) framework is implemented for result verification. This framework integrates Convolutional Neural Networks (CNNs) for spatial data representation with ReliefF feature selection to identify and prioritize influential variables. The hybrid AI system ensures robust predictive modeling, addressing the high-dimensional nature of the problem domain. The study also examines the implications of graphene reinforcement for the ball's performance, focusing on factors such as deformation under load, vibrational response, and stability thresholds. The results indicate that GPL reinforcement significantly improves the dynamic stability and stiffness of the spherical shell. Comparative analyses validate the efficiency of the CUF-based computational approach through mathematical benchmarks and AI-verified predictions. This interdisciplinary work highlights the potential of combining advanced computational mechanics, nanomaterials, and AI-driven verification in optimizing dynamic stability for applications in sports engineering and beyond.