To precisely obtain global contextual information from videos with heavy camera motion and scene changes, this study proposes an improved spatiotemporal two-stream neural network architecture with a novel convolutional fusion layer. The three main improvements of this study are: 1) the ResNet-101 network is integrated independently into each of the two streams of the target network; 2) the two kinds of feature maps (i.e., optical-flow motion and RGB-channel information) obtained by the corresponding convolution layers of the two streams are superimposed on each other; 3) temporal information is combined with spatial information by an integrated three-dimensional (3D) convolutional neural network (CNN) to extract more latent information from the videos. The proposed approach was tested on the UCF-101 and HMDB51 benchmark datasets, and the experimental results show that the proposed two-stream 3D CNN model achieves a substantial improvement in recognition rate for video-based analysis.
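The superposition in improvement 2) and the temporal volume consumed in improvement 3) can be sketched as follows; the shapes and the element-wise-sum fusion rule are illustrative assumptions, not the paper's exact layer configuration.

```python
import numpy as np

# Hypothetical shapes: C channels, H x W spatial, T frames.
C, H, W, T = 8, 14, 14, 5
rng = np.random.default_rng(0)

# Per-frame feature maps from the two streams (RGB / optical flow),
# as produced by the corresponding convolution layer of each stream.
rgb_maps = [rng.standard_normal((C, H, W)) for _ in range(T)]
flow_maps = [rng.standard_normal((C, H, W)) for _ in range(T)]

def fuse_streams(rgb, flow):
    """Superimpose (element-wise sum) the two streams' feature maps."""
    return rgb + flow

# Fuse each frame, then stack along a new temporal axis to form the
# (C, T, H, W) volume that a 3D convolution would consume.
fused = np.stack([fuse_streams(r, f) for r, f in zip(rgb_maps, flow_maps)],
                 axis=1)
print(fused.shape)  # (8, 5, 14, 14)
```

Summation is only one plausible fusion rule; channel concatenation or a learned 1×1 convolution over the stacked maps would occupy the same place in the pipeline.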
Most existing image fusion methods prefer to use an adversarial learning game to fuse infrared and visible images. However, such a single adversarial mechanism makes the image fusion task prone to ignoring global contextual information. To this end, this paper proposes a CNN-Transformer dual-process-based generative adversarial network (CTDPGAN) to fuse infrared and visible images. In the generator, a dual-process-based module composed of a CNN block and a Swin-Transformer block is proposed. The channel filter and spatial filter in the CNN block adaptively extract additional complementary information from images of various modalities while preserving the shallow features of the source images. The Swin-Transformer block (STRB) is designed to establish local attention by dividing the feature map into non-overlapping windows and then to bridge global attention by letting the windows interact. In addition, we introduce generative adversarial learning into the training process: dual-channel Transformer discriminators are designed to improve the discriminative quality of the fused image. Thus, the fused image learns the distribution of global contextual information from the source images and retains the competing visible-light and infrared domains in a more balanced manner. Moreover, we introduce primary and auxiliary feature concepts into the structural similarity loss function and the spatial frequency loss function, which enables the generator to produce a fused image that retains thermal radiation information and rich detail. Finally, the experimental findings demonstrate that, in both subjective and objective assessments, our model produces results that are comparable or superior to state-of-the-art image fusion methods.
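The window mechanism of the Swin-Transformer block can be illustrated with a minimal NumPy sketch: non-overlapping window partitioning for local attention, plus a shifted partition that lets windows interact. The sizes are assumed for illustration, and the real block additionally computes self-attention inside each window, which is omitted here.

```python
import numpy as np

# Hypothetical: partition an H x W feature map into non-overlapping
# ws x ws windows, as in the Swin-Transformer block's local attention.
H, W, ws = 8, 8, 4
x = np.arange(H * W).reshape(H, W)

def window_partition(x, ws):
    H, W = x.shape
    # (H//ws, ws, W//ws, ws) -> (num_windows, ws, ws)
    return x.reshape(H // ws, ws, W // ws, ws).swapaxes(1, 2).reshape(-1, ws, ws)

windows = window_partition(x, ws)
print(windows.shape)  # (4, 4, 4): four local windows

# Cyclically shifting the map by ws//2 before partitioning makes each new
# window span pieces of several original windows, which is how information
# is bridged across window boundaries toward global attention.
shifted = np.roll(x, shift=(-ws // 2, -ws // 2), axis=(0, 1))
shifted_windows = window_partition(shifted, ws)
```

After attention is computed in the shifted windows, the inverse roll restores the original layout; alternating the two partitions over successive layers is the Swin-style design.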
Abstract Surface water is the most dynamic land‐cover type. Transitions between water and nonwater types (such as vegetation and ice) can happen momentarily. More frequent mapping is necessary to study the changing patterns of water. However, monitoring long‐term global water changes at high spatial resolution and high temporal frequency is challenging. Here we report the generation of a daily global water map data set at 500‐m resolution from 2001 to 2016 based on the daily reflectance time series from the Moderate Resolution Imaging Spectroradiometer. Each single‐date image is classified into three types: water, ice/snow, and land. Following a temporal consistency check and spatial‐temporal interpolation for missing data, we conducted a series of validations of the water data set. The producer's accuracy and user's accuracy are 94.61% and 93.57%, respectively, when checked against classification results derived from 30‐m resolution Landsat images. Both the producer's accuracy and user's accuracy exceeded 90% when compared with manually interpreted large‐sized sample units (≥1,000 m × 1,000 m) collected in a previous global land cover mapping project. Generally, the global inland water area reaches its maximum (~3.80 × 10⁶ km²) in September and its minimum (~1.50 × 10⁶ km²) in February during an annual cycle. Short‐duration water bodies, sea‐level‐rise effects, and different types of rice field use can be detected from the daily water maps. The size distribution of global water bodies is also discussed from the perspective of the number of water bodies and the corresponding water area. In addition, the daily water maps can precisely reflect water freezing and help correct water areas with inconsistent cloud flags in the MOD09GA quality assessment layer.
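The temporal gap filling for missing (cloudy) observations can be sketched, for a single pixel's daily label series, as a nearest-valid-neighbor fill in time. This is a simplified stand-in for the data set's actual temporal consistency check and spatial-temporal interpolation, and the label coding below is an assumption.

```python
import numpy as np

# Hypothetical daily label series for one pixel: 0 = land, 1 = water,
# -1 = missing (e.g., cloud-contaminated observation).
labels = np.array([1, 1, -1, -1, 1, 0, -1, 0])

def fill_gaps(labels):
    """Assign each missing day the label of the nearest valid day in time."""
    filled = labels.copy()
    valid = np.flatnonzero(labels >= 0)
    for i in np.flatnonzero(labels < 0):
        nearest = valid[np.argmin(np.abs(valid - i))]
        filled[i] = labels[nearest]
    return filled

print(fill_gaps(labels))  # [1 1 1 1 1 0 0 0]
```

A spatial-temporal scheme would additionally consult neighboring pixels before falling back on a purely temporal fill.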
On the basis of the exponential model and the Logistic model presented previously, an improved entropy method is used to calculate the weight coefficient of each model and to establish a new combination model for predicting foundation settlement, with the goal of combining the advantages of both models and reducing prediction error. The exponential model, the Logistic model, and the combination of the two were each applied to power plant foundation settlement data, and the sum of squared errors, mean absolute error, mean absolute percentage error, and mean square error were used to evaluate the prediction accuracy of each model. The results show that the prediction error of the combination model established by the improved entropy method is smaller than that of either single forecasting model. The combination model has higher prediction accuracy and can effectively predict power plant foundation settlement.
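The entropy-weighted combination described above can be sketched as follows. The settlement numbers are invented for illustration, and the weighting scheme shown is one common entropy-method variant for combination forecasting, not necessarily the paper's exact "improved" formulation.

```python
import numpy as np

# Hypothetical settlement observations and fitted values of the two
# single models (exponential and Logistic); numbers are illustrative.
observed = np.array([12.1, 15.3, 17.8, 19.6, 20.9, 21.8])
exp_pred = np.array([11.5, 15.9, 18.4, 19.2, 21.5, 22.1])
log_pred = np.array([12.6, 14.8, 17.2, 20.1, 20.6, 22.0])

def entropy_weight(errors_by_model):
    """Common entropy-weighting scheme: a model whose relative errors
    have lower normalized entropy receives a larger weight."""
    n = errors_by_model.shape[1]
    weights = []
    for err in errors_by_model:
        p = np.abs(err) / np.abs(err).sum()        # error proportions
        e = -(p * np.log(p)).sum() / np.log(n)     # normalized entropy, in [0, 1]
        weights.append(1.0 - e)                    # degree of divergence
    w = np.array(weights)
    return w / w.sum()

errors = np.vstack([(observed - exp_pred) / observed,
                    (observed - log_pred) / observed])
w = entropy_weight(errors)
combined = w[0] * exp_pred + w[1] * log_pred

# Evaluate with the same criteria listed in the abstract.
mae = np.mean(np.abs(observed - combined))
mse = np.mean((observed - combined) ** 2)
mape = np.mean(np.abs((observed - combined) / observed))
```

The same error criteria applied to `exp_pred` and `log_pred` individually would show whether the combination reduces error on a given data set, as the abstract reports for the power plant data.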
Purpose: In recent years, mental health problems have become among the most serious social problems worldwide. Past studies have proposed that links exist between sunlight and mental health; however, studies examining populations with low sunlight exposure are lacking. We conducted a study among a group of operating room nurses (ORNs) who work long hours in operating rooms and have limited sunlight exposure. We aim to extend and refine previous research on the association between mental health and sunlight exposure in community populations. Patients and Methods: A total of 787 ORNs were interviewed and analyzed. Mental health, sunlight exposure duration, sociodemographic and work-related variables, and chronic diseases were evaluated. The Kessler 10 scale (K10) was used to assess participants’ mental health status, and sunlight exposure duration was assessed through self-reports. Multiple linear regression analysis was adopted to examine the association between sunlight exposure and mental health. Results: The average K10 score of ORNs was 25.41, indicating poorer mental health than in other populations. Poor mental health was negatively associated with greater sunlight exposure hours per day (β=−0.378) and sleep regularity (β=−3.341), and positively associated with chronic disease (β=3.514). Conclusion: This study indicates a positive association between sunlight exposure and mental health. Appropriately increasing sunlight exposure may benefit mental health. Hospitals, related organizations, and individuals should pay greater attention to ORNs’ mental health and sunlight exposure conditions, and further policy and building-design recommendations should be proposed. Keywords: sunlight exposure, mental health, operating room nurses, China
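The multiple linear regression step can be sketched with ordinary least squares on synthetic data. The variable names, coding, sample, and coefficients below are invented for illustration and only mimic the signs reported in the abstract; they are not the study's data.

```python
import numpy as np

# Synthetic stand-ins for the survey variables (illustrative only).
rng = np.random.default_rng(42)
n = 200
sunlight = rng.uniform(0, 4, n)        # self-reported sunlight hours/day
sleep_reg = rng.integers(0, 2, n)      # regular sleep: 1 = yes, 0 = no
chronic = rng.integers(0, 2, n)        # chronic disease: 1 = yes, 0 = no

# Synthetic K10 scores (higher = poorer mental health), built so that
# the true signs match those reported in the abstract.
k10 = (28 - 0.4 * sunlight - 3.3 * sleep_reg + 3.5 * chronic
       + rng.normal(0, 1.5, n))

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n), sunlight, sleep_reg, chronic])
beta, *_ = np.linalg.lstsq(X, k10, rcond=None)
# beta[1] < 0: more daily sunlight associates with a lower (better) K10 score.
```

In practice a statistics package would also report standard errors and p-values for each β, which plain `lstsq` does not provide.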
Scene text image super-resolution (STISR) has been regarded as an important pre-processing task for text recognition from low-resolution scene text images. Most recent approaches use the recognizer's feedback as a clue to guide super-resolution. However, directly using the recognition clue has two problems: 1) Compatibility: it takes the form of a probability distribution, which has an obvious modality gap with STISR, a pixel-level task; 2) Inaccuracy: it usually contains wrong information that misleads the main task and degrades super-resolution performance. In this paper, we present a novel method, C3-STISR, that jointly exploits the recognizer's feedback, visual, and linguistic information as clues to guide super-resolution. Here, the visual clue comes from the images of the texts predicted by the recognizer, which is informative and more compatible with the STISR task, while the linguistic clue is generated by a pre-trained character-level language model, which is able to correct the predicted texts. We design effective extraction and fusion mechanisms for the triple cross-modal clues to generate comprehensive and unified guidance for super-resolution. Extensive experiments on TextZoom show that C3-STISR outperforms the SOTA methods in fidelity and recognition performance. Code is available at https://github.com/zhaominyiz/C3-STISR.
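The fusion of the triple clues can be pictured, in highly simplified form, as a gated weighted sum over three clue tensors. The shapes, the softmax gate, and the fixed gate logits below are illustrative assumptions, not the paper's actual fusion mechanism, in which the gating would be learned.

```python
import numpy as np

# Hypothetical: three cross-modal clue tensors (recognition, visual,
# linguistic), each already projected to a common (C, H, W) shape.
C, H, W = 4, 8, 8
rng = np.random.default_rng(1)
clues = rng.standard_normal((3, C, H, W))

# Placeholder gate logits standing in for learned parameters.
gate_logits = np.array([0.5, 1.2, 0.3])
gates = np.exp(gate_logits) / np.exp(gate_logits).sum()   # softmax

# Weighted sum over the clue axis -> one unified guidance map.
guidance = np.tensordot(gates, clues, axes=1)
print(guidance.shape)  # (4, 8, 8)
```

A spatially varying gate (one weight per clue per location) would be a natural refinement of this scalar-gate sketch.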