The English East India Company came to India solely for trade. Only after the Battle of Plassey in 1757 did the British begin to implement the ideology of imperialism in India, gradually expanding their political influence and making the country their colony. With this, the British sought to reshape the socio-economic structure of Indian society. To meet the home charges and the expenses of the Company and its officials, they devised and implemented various policies to extract as much revenue as possible. Since agriculture was the primary source of revenue, the burden of increased taxes fell on the shoulders of the Indian peasants. It was hardly unnatural that major peasant uprisings occurred in the wake of these impositions. Thus the history of India during the 200 years of British rule is marked by numerous peasant uprisings, which mobilised thousands of peasants and affected the entire society.
Automatically identifying violence in videos is critical, and combining visual and audio cues, which provide complementary information, is often the most effective approach for violence detection. However, existing research on fusing these cues is limited and computationally demanding. To address this, we propose a novel fused vision-based graph neural network (FV-GNN) for violence detection using audiovisual information. This approach combines local and global features from both audio and video, leveraging a residual learning strategy to extract the most informative cues. Furthermore, FV-GNN utilizes dynamic graph filtering to analyze the inherent relationships between audio and video samples, enhancing violence recognition. The network consists of three branches: integrated, specialized, and scoring. The integrated branch captures long-range dependencies based on feature similarity, while the specialized branch focuses on local positional relationships. Finally, the scoring branch assesses the predicted violence likelihood against ground truth. We extensively explored the use of graphs for modeling temporal context in videos and found FV-GNN to be particularly well suited for real-time violence detection. Our experiments demonstrate that FV-GNN outperforms current state-of-the-art methods on the XD-Violence dataset.
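To make the two graph branches concrete, the sketch below shows one plausible reading of the abstract: an "integrated" branch whose adjacency comes from pairwise feature similarity (long-range dependencies) and a "specialized" branch whose adjacency comes from temporal proximity (local positional relationships). This is a minimal illustration, not the authors' implementation; the class names and the final per-snippet scoring head are hypothetical.

```python
# Hypothetical sketch of similarity- and position-based graph branches
# over fused audio-visual snippet features (not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimilarityGraphBranch(nn.Module):
    """Soft adjacency from pairwise feature similarity ('integrated')."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                      # x: (batch, T, dim)
        adj = torch.softmax(x @ x.transpose(1, 2) / x.size(-1) ** 0.5, dim=-1)
        return F.relu(self.proj(adj @ x)) + x  # residual connection

class PositionalGraphBranch(nn.Module):
    """Adjacency from temporal proximity ('specialized')."""
    def __init__(self, dim, sigma=1.0):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.sigma = sigma

    def forward(self, x):
        t = torch.arange(x.size(1), device=x.device, dtype=x.dtype)
        adj = torch.softmax(-(t[:, None] - t[None, :]).abs() / self.sigma, dim=-1)
        return F.relu(self.proj(adj @ x)) + x

# Usage: fused audio-visual features for 32 snippets of dimension 256.
feats = torch.randn(4, 32, 256)
out = PositionalGraphBranch(256)(SimilarityGraphBranch(256)(feats))
scores = torch.sigmoid(nn.Linear(256, 1)(out))   # per-snippet violence score
```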
Detection of violence and weaponized violence in closed-circuit television (CCTV) footage requires a comprehensive approach. In this work, we introduce the Smart-City CCTV Violence Detection (SCVD) dataset, specifically designed to facilitate the learning of weapon distribution in surveillance videos. To tackle the complexities of analyzing 3D surveillance video for violence recognition tasks, we propose a novel technique called SSIVD-Net (Salient-Super-Image for Violence Detection). Our method reduces 3D video data complexity, dimensionality, and information loss while improving inference, performance, and explainability through the use of Salient-Super-Image representations. Considering the scalability and sustainability requirements of futuristic smart cities, we also introduce the Salient-Classifier, a novel architecture combining a kernelized approach with a residual learning strategy. We evaluate variations of SSIVD-Net and the Salient-Classifier on our SCVD dataset and benchmark them against state-of-the-art (SOTA) models commonly employed in violence detection. Our approach exhibits significant improvements in detecting both weaponized and non-weaponized violence instances. By advancing the SOTA in violence detection, our work offers a practical and scalable solution suitable for real-world applications. The proposed methodology not only addresses the challenges of violence detection in CCTV footage but also contributes to the understanding of weapon distribution in smart surveillance. Ultimately, our research findings should enable smarter and more secure cities and enhance public safety measures.
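The core dimensionality-reduction idea, tiling sampled frames of a 3D clip into a single 2D "super image" that an ordinary 2D classifier can consume, can be sketched as follows. This is illustrative only: it uses uniform frame sampling and does not reproduce the salience-based frame selection or ordering of SSIVD-Net.

```python
# Illustrative "super image" construction: tile sampled video frames
# into a 2D grid so a 2D CNN can classify a 3D clip.
import torch

def make_super_image(clip, grid=3):
    """clip: (T, C, H, W) video tensor; returns (C, grid*H, grid*W)."""
    idx = torch.linspace(0, clip.size(0) - 1, grid * grid).long()  # uniform sampling
    frames = clip[idx]                                 # (grid*grid, C, H, W)
    rows = [torch.cat(list(frames[r * grid:(r + 1) * grid]), dim=-1)  # tile columns
            for r in range(grid)]
    return torch.cat(rows, dim=-2)                     # stack rows vertically

clip = torch.randn(64, 3, 112, 112)                    # e.g. a 64-frame CCTV clip
super_img = make_super_image(clip)                     # -> (3, 336, 336)
print(super_img.shape)
```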
Detecting violence in surveillance videos is crucial in activity recognition, with wide-ranging applications in unmanned aerial vehicles (UAVs), internet video filtering, and related domains. This study proposes a highly effective deep learning architecture that employs a two-stream approach, combining a 3D convolution network with a merging module for violence detection. One stream analyzes RGB frames with suppressed background, while the other focuses on the optical flow between corresponding frames. These inputs are valuable in identifying violent actions, which are often characterized by distinctive body movements. To ensure robust long-range feature extraction with fewer parameters, we replace the conventional 3D convolution operation at each layer with 3D depth-wise convolution. Our model outperforms existing methods on challenging datasets such as RWF2000, Real-Life Violence Situations (RLVS), and Movie Fight, securing state-of-the-art results. Our experiments demonstrate that the proposed model is well suited for edge devices, offering computational efficiency and precise detection capabilities.
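The parameter saving from the depth-wise substitution is easy to see in code. The block below is an assumed, generic depth-wise separable 3D convolution (one spatio-temporal filter per channel followed by a 1x1x1 pointwise mix), not the authors' exact layer configuration.

```python
# Minimal sketch of a 3D depth-wise separable convolution block of the
# kind the abstract describes (assumed structure, not the paper's code).
import torch
import torch.nn as nn

class DepthwiseConv3d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel=3):
        super().__init__()
        # groups=in_ch applies one spatio-temporal filter per channel
        self.depthwise = nn.Conv3d(in_ch, in_ch, kernel, padding=kernel // 2,
                                   groups=in_ch, bias=False)
        # 1x1x1 pointwise conv mixes channels
        self.pointwise = nn.Conv3d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm3d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):                 # x: (N, C, T, H, W)
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# A standard 3x3x3 Conv3d from 64 to 128 channels has 64*128*27 = 221,184
# weights; the separable version uses 64*27 + 64*128 = 9,920.
block = DepthwiseConv3d(64, 128)
out = block(torch.randn(2, 64, 16, 56, 56))   # e.g. an RGB-stream feature map
```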
According to the World Health Organization, approximately 6 in 100 people suffer from post-traumatic stress disorder (PTSD) at some point in their lives. PTSD is a mental disorder with a set of symptoms observed in a person following exposure to a life-threatening event or witnessing a death. The primary characteristic of PTSD is the persistence over time of certain symptoms: flashbacks, traumatic nightmares, acute stress, and symptoms of depression. In a stressful situation, the whole body goes into tension. This has an energy cost and affects the voice, breathing, daily gestures, and the face, which are referred to in this survey paper as external symptoms.

The diagnosis of PTSD is based on the Diagnostic and Statistical Manual of Mental Disorders (DSM) classification, using standard questionnaires in which the patient self-reports their condition based on their symptoms. However, these questionnaires have several limitations and give an imprecise diagnosis.

Sensor- and wearable-based technologies can play a key role in improving the diagnosis, prognosis, and assistance of PTSD. Recently, Computer-Aided Diagnosis (CAD) systems have been proposed for PTSD detection based on external symptoms and brain activity using video and EEG sensors. To the best of our knowledge, this is the first survey paper that gives a literature overview of machine learning based approaches for PTSD diagnosis using video and EEG sensors. In addition, a comparison between existing approaches and a discussion of potential avenues for future PTSD research are provided.
In this work, we introduce a novel methodology, "SpotCrack," designed to precisely segment road surfaces and infrastructure cracks. We employ lightweight modules to effectively identify various types of potential cracks in infrastructure. The effectiveness of this approach is systematically evaluated on a diverse set of crack images, substantiating its capacity to enhance operational efficiency in road and infrastructure management tasks. Notably, SpotCrack achieves real-time processing at an impressive rate of 130 frames per second (FPS) with an approximate inference time of 8 milliseconds, enabling versatile applications in various real-world scenarios. This accelerated processing capability positions SpotCrack for diverse deployments, including integration with Unmanned Aerial Vehicles (UAVs) and road maintenance fleets. Significantly, SpotCrack's applicability extends to individual mobility, as its high FPS empowers its integration within private vehicles, marking a transformative step in road safety and benefiting both self-driving and conventional vehicles alike. These findings underscore SpotCrack's pivotal role in delivering precise crack segmentation, driving advancements in road management practices, and reinforcing road safety endeavors.
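For readers wanting to reproduce the latency figure, the harness below shows how such a claim is typically measured; the placeholder network merely stands in for a SpotCrack-like model, and the input size is an assumption. Note that 8 ms per frame corresponds to 1 / 0.008 = 125 FPS, consistent with the reported "approximately 130 FPS."

```python
# Generic timing harness for verifying a "~8 ms / ~130 FPS" claim
# (the model below is a placeholder, not SpotCrack itself).
import time
import torch
import torch.nn as nn

model = nn.Sequential(                       # placeholder lightweight net
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 1))                     # 1-channel crack mask
model.eval()

x = torch.randn(1, 3, 512, 512)              # assumed input resolution
with torch.no_grad():
    for _ in range(10):                      # warm-up iterations
        model(x)
    start = time.perf_counter()
    n = 100
    for _ in range(n):
        model(x)
    latency = (time.perf_counter() - start) / n

print(f"{latency * 1e3:.1f} ms/frame = {1 / latency:.0f} FPS")
```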
Around the world, agriculture is one of the most important sectors of human life in terms of food, business, and employment opportunities. Wheat is the most widely farmed crop, but every year its ultimate production is badly affected by various diseases. Early and precise recognition of wheat plant diseases can decrease damage, resulting in a greater yield. Researchers have used conventional and Machine Learning (ML)-based techniques for crop disease recognition and classification. However, these techniques are often inaccurate and time-consuming due to the unavailability of quality data, inefficient preprocessing techniques, and poor criteria for selecting an efficient model. Therefore, a smart and intelligent system is needed that can accurately identify crop diseases. In this paper, we propose an efficient ML-based framework for wheat disease recognition and classification that automatically identifies brown- and yellow-rust diseases in wheat crops. Our method consists of multiple steps. Firstly, the dataset is collected from different fields in Pakistan with consideration of the illumination and orientation parameters of the capturing device. Secondly, to accurately preprocess the data, specific segmentation and resizing methods are used to distinguish between healthy and affected areas. Finally, ML models are trained on the preprocessed data. For comparative analysis of the models, various performance metrics, including overall accuracy, precision, recall, and F1-score, are calculated. As a result, the proposed framework achieves the highest accuracy of 99.8%, outperforming existing ML techniques.
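The evaluation step maps directly onto standard scikit-learn calls. The snippet below is a minimal sketch of computing the four reported metrics for a three-class setup (healthy, brown rust, yellow rust); the label encoding and the toy predictions are assumptions for illustration only.

```python
# Sketch of the metric computation (accuracy, precision, recall, F1)
# for an assumed 3-class wheat-disease problem:
# 0 = healthy, 1 = brown rust, 2 = yellow rust.
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

y_true = [0, 1, 2, 1, 0, 2, 1, 0]   # toy ground-truth labels
y_pred = [0, 1, 2, 1, 0, 2, 2, 0]   # toy model predictions

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred, average="macro"))
print("recall   :", recall_score(y_true, y_pred, average="macro"))
print("f1-score :", f1_score(y_true, y_pred, average="macro"))
```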
Variability in staining protocols, such as different slide preparation techniques, chemicals, and scanner configurations, can result in a diverse set of whole slide images (WSIs). This distribution shift can negatively impact the performance of deep learning models on unseen samples, presenting a significant challenge for developing new computational pathology applications. In this study, we propose a method for improving the generalizability of convolutional neural networks (CNNs) to stain changes in a single-source setting for semantic segmentation. Recent studies indicate that style features mainly exist as covariances in earlier network layers. Based on these findings, we design a channel attention mechanism that detects stain-specific features, and we modify the previously proposed stain-invariant training scheme: we reweight the outputs of earlier layers and pass them to the stain-adversarial training branch. We evaluate our method on multi-center, multi-stain datasets and demonstrate its effectiveness through interpretability analysis. Our approach achieves substantial improvements over baselines and competitive performance compared to other methods, as measured by various evaluation metrics. We also show that combining our method with stain augmentation leads to mutually beneficial results and outperforms other techniques. Overall, our study makes significant contributions to the field of computational pathology.
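One common way to realize such channel reweighting is a squeeze-and-excitation style module that scores each channel of an early-layer feature map before it is forwarded to the adversarial branch. The block below is a minimal sketch under that assumption, not the paper's exact design.

```python
# Illustrative squeeze-and-excitation style channel attention for
# reweighting early-layer feature maps (assumed design, not the paper's).
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                        # x: (N, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))          # squeeze: global average pool
        return x * w[:, :, None, None]           # excite: per-channel weights

feat = torch.randn(2, 64, 56, 56)                # early-layer feature map
reweighted = ChannelAttention(64)(feat)
# In a stain-adversarial setup, 'reweighted' would feed both the
# segmentation head and the stain-classifier branch during training.
```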