Mechanical site preparation by mounding is often used by the forest industry to provide optimal growth conditions for tree seedlings. Prior to planting, an essential step consists in estimating the number of mounds in each planting block, since these mounds serve as planting microsites. This task often requires long and costly field surveys, in which several forestry workers perform a manual counting procedure. This paper addresses the problem of automating the counting process using computer vision and UAV imagery. We present a supervised detection-based counting framework for estimating the number of planting microsites on a mechanically prepared block. The system is trained offline to learn feature representations from semi-automatically annotated images. Mound detection and counting are then performed on multispectral UAV images captured at an altitude of 100 m. Our detection framework proceeds by generating region proposals based on local binary pattern (LBP) features extracted from near-infrared (NIR) patches. A convolutional neural network (CNN) is then used to classify candidate regions using multispectral image data. To train and evaluate the proposed method, we constructed a new dataset by capturing aerial images of different planting blocks. The results demonstrate the efficiency and validity of the proposed method under challenging experimental conditions. The methods and results presented in this paper form a promising foundation for developing advanced decision support systems for planning planting operations.
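The sketch below illustrates the two-stage pipeline described in this abstract: LBP-based region proposals computed on the NIR band, followed by a small CNN that classifies multispectral candidate patches. The patch size, LBP parameters, scoring rule, and network architecture are assumptions made for illustration only; they are not specified in the abstract.

```python
# Minimal sketch of LBP-based region proposal + CNN patch classification.
# All hyperparameters and the scoring rule are hypothetical placeholders.
import numpy as np
import torch
import torch.nn as nn
from skimage.feature import local_binary_pattern

def propose_regions(nir, patch=64, stride=32, p=8, r=1.0, score_thr=0.15):
    """Slide over the NIR band and keep patches whose LBP histogram suggests
    a mound-like texture (toy scoring rule, for illustration only)."""
    lbp = local_binary_pattern(nir, p, r, method="uniform")
    proposals = []
    for y in range(0, nir.shape[0] - patch, stride):
        for x in range(0, nir.shape[1] - patch, stride):
            hist, _ = np.histogram(lbp[y:y + patch, x:x + patch],
                                   bins=p + 2, range=(0, p + 2), density=True)
            if hist[:p // 2].sum() > score_thr:  # crude texture score
                proposals.append((y, x))
    return proposals

class MoundCNN(nn.Module):
    """Small multispectral patch classifier (architecture is an assumption)."""
    def __init__(self, bands=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(bands, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(32, 2)  # mound / not mound

    def forward(self, x):
        return self.head(self.features(x).flatten(1))
```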
This paper presents a novel computer vision method to measure the breathing pattern in an intensive care environment. The proposed system uses depth information captured by two RGB-D cameras to reconstruct a 3D surface of a patient's torso with high spatial coverage. Optimal sensor positioning is a key step in performing an accurate 3D reconstruction without interfering with patient care. In this context, our hardware setup meets the clinical requirements while allowing accurate estimation of respiratory parameters, including respiratory rate, tidal volume, and inspiratory time. Our system provides motion information not only for the top of the torso surface but also for both of its lateral sides. Our method was tested in an environment designed for critically ill children, where it was compared to the gold standard method currently used in intensive care units. The experiments yielded high accuracy and showed significant agreement with the gold standard method.
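As a rough illustration of how respiratory parameters can be derived from a reconstructed torso-volume signal, the toy function below estimates respiratory rate and tidal volume by peak detection. The frame rate, filtering, and peak-detection settings are assumptions and do not reflect the clinical setup described above.

```python
# Toy derivation of respiratory parameters from a torso-volume time series.
# Sampling rate and peak-detection parameters are illustrative assumptions.
import numpy as np
from scipy.signal import find_peaks

def respiratory_parameters(volume_ml, fs=30.0):
    """volume_ml: torso volume per depth frame (mL); fs: frame rate (Hz)."""
    v = np.asarray(volume_ml) - np.mean(volume_ml)
    peaks, _ = find_peaks(v, distance=int(fs))      # end-of-inspiration points
    troughs, _ = find_peaks(-v, distance=int(fs))   # end-of-expiration points
    if len(peaks) == 0 or len(troughs) == 0:
        return 0.0, 0.0
    rate_bpm = len(peaks) / (len(v) / fs / 60.0)    # breaths per minute
    tidal_volume = float(np.mean(v[peaks]) - np.mean(v[troughs]))
    return rate_bpm, tidal_volume
```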
Planting by mounding is a commonly used forestry technique that improves soil quality and ensures optimal tree growth conditions. During planting operations, one of the main planning steps is to estimate the number of mechanically created mounds in each planting block. Traditional counting methods involve manual field surveys or human photo-interpretation of UAV images, which are generally error-prone and time-consuming. In this work, we propose a new approach to count mounds on UAV orthomosaics. Our framework is designed to estimate the required number of seedlings for a given planting block, based on a visual detection approach and a global estimation module. First, a deep local detection model is applied to local patches to recognize and count visible mounds. Then, an estimation model based on global features is used to predict the final number of plant seedlings required for a given plantation block. To evaluate the proposed framework in real-world conditions, we constructed a large UAV dataset consisting of 18 UAV orthomosaics comprising 111,000 mounds. We conducted extensive experiments on our dataset, including a comparison with state-of-the-art counting methods, as well as an analysis of Human-Level Performance (HLP) in identifying and annotating mounds. The experimental results show that our model achieves the best performance in terms of MAE and MSE compared to state-of-the-art automatic counting methods.
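To make the two-stage idea concrete, here is a minimal sketch of local patch-level counting followed by a global regression that predicts the final seedling count. The tiling, the summary features, the `detector` callable, and the choice of regressor are placeholders, not the models described in the abstract.

```python
# Sketch of local detection-based counting + global estimation.
# The detector, feature set, and regressor are hypothetical placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def block_features(orthomosaic, detector, tile=512):
    """Tile the orthomosaic, count detections per tile, and summarize."""
    counts = []
    for y in range(0, orthomosaic.shape[0] - tile + 1, tile):
        for x in range(0, orthomosaic.shape[1] - tile + 1, tile):
            counts.append(len(detector(orthomosaic[y:y + tile, x:x + tile])))
    counts = np.array(counts)
    # global features: total visible count plus simple distribution statistics
    return np.array([counts.sum(), counts.mean(), counts.std(), counts.max()])

# estimation model mapping block-level features to the final seedling count
estimator = GradientBoostingRegressor()
# estimator.fit(X_blocks, y_counts)  # one feature vector per training block
# predicted = estimator.predict(block_features(new_block, detector)[None, :])
```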
Site preparation by mounding is a commonly used silvicultural treatment that improves tree growth conditions by mechanically creating planting microsites called mounds. Following site preparation, the next critical step is to count the number of mounds, which provides forest managers with a precise estimate of the number of seedlings required for a given plantation block. Counting the number of mounds is generally conducted through manual field surveys by forestry workers, which is costly and prone to errors, especially for large areas. To address this issue, we present a novel framework exploiting advances in Unmanned Aerial Vehicle (UAV) imaging and computer vision to accurately estimate the number of mounds on a planting block. The proposed framework comprises two main components. First, we exploit a visual recognition method based on a deep learning algorithm for multiple object detection by pixel-based segmentation. This enables a preliminary count of visible mounds, as well as other frequently seen objects (e.g., trees, debris, accumulation of water), to be used to characterize the planting block. Second, since visual recognition could be limited by several perturbation factors (e.g., mound erosion, occlusion), we employ a machine learning estimation function that predicts the final number of mounds based on the local block properties extracted in the first stage. We evaluate the proposed framework on a new UAV dataset representing numerous planting blocks with varying features. The proposed method outperformed manual counting methods in terms of relative counting precision, indicating that it has the potential to be advantageous and efficient in difficult situations.
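The preliminary counting step described above can be illustrated by turning a per-pixel segmentation mask into per-class object counts via connected components. The class ids and class set below are illustrative only; the actual segmentation model and label scheme are not specified in the abstract.

```python
# Minimal sketch: per-class object counts from a semantic segmentation mask.
# Class ids are hypothetical.
import numpy as np
from scipy import ndimage

CLASSES = {1: "mound", 2: "tree", 3: "debris", 4: "water"}

def count_objects(mask):
    """mask: 2-D array of per-pixel class ids predicted by the segmentation model."""
    counts = {}
    for class_id, name in CLASSES.items():
        _, n = ndimage.label(mask == class_id)  # connected components per class
        counts[name] = n
    return counts
```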
Tracking with a Pan-Tilt-Zoom (PTZ) camera has been a research topic in computer vision for many years. However, it is very difficult to assess the progress that has been made on this topic because there is no standard evaluation methodology. The difficulty in evaluating PTZ tracking algorithms arises from their dynamic nature. In contrast to other forms of tracking, PTZ tracking involves both locating the target in the image and controlling the motors of the camera to aim it so that the target stays in its field of view. This type of tracking can only be performed online. In this paper, we propose a new evaluation framework based on a virtual PTZ camera. With this framework, tracking scenarios do not change between experiments, and we are able to replicate online PTZ camera control and behavior, including camera positioning delays, tracker processing delays, and numerical zoom. We tested our evaluation framework with the Camshift tracker to show its viability and to establish baseline results.
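A toy virtual PTZ camera can be sketched as follows: the current view is rendered by cropping and rescaling a wide panoramic frame given the pan/tilt/zoom state, and issued commands only take effect after a fixed delay. The flat-crop geometry and the simple delay queue are simplifying assumptions, not the exact design of the evaluation framework.

```python
# Toy virtual PTZ camera with a fixed command delay and numerical zoom.
# Geometry and delay model are simplified assumptions.
import collections
import cv2

class VirtualPTZ:
    def __init__(self, pano_shape, view=(480, 640), delay_frames=3):
        self.h, self.w = pano_shape[:2]
        self.view = view
        self.pan, self.tilt, self.zoom = 0.5, 0.5, 1.0   # normalized state
        self.pending = collections.deque(maxlen=delay_frames)

    def command(self, pan, tilt, zoom):
        self.pending.append((pan, tilt, zoom))           # queued, applied late

    def render(self, panorama):
        if len(self.pending) == self.pending.maxlen:
            self.pan, self.tilt, self.zoom = self.pending.popleft()
        vh, vw = int(self.view[0] / self.zoom), int(self.view[1] / self.zoom)
        cy, cx = int(self.tilt * self.h), int(self.pan * self.w)
        y0, x0 = max(0, cy - vh // 2), max(0, cx - vw // 2)
        crop = panorama[y0:y0 + vh, x0:x0 + vw]
        return cv2.resize(crop, (self.view[1], self.view[0]))  # numerical zoom
```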
Urban traffic environments present unique challenges for object detection, particularly with the increasing presence of micromobility vehicles such as e-scooters and bikes. To address this object detection problem, this work introduces an adapted detection model that combines the accuracy and speed of single-frame object detection with the richer features offered by video object detection frameworks. This is done by feeding the YOLOX architecture aggregated feature maps from consecutive frames, aligned through motion flow. This fusion brings a temporal perspective to YOLOX's detection abilities, allowing for a better understanding of urban mobility patterns and substantially improving detection reliability. Tested on a custom dataset curated for urban micromobility scenarios, our model shows substantial improvement over existing state-of-the-art methods, demonstrating the need to consider spatio-temporal information for detecting such small and thin objects. Our approach enhances detection in challenging conditions, including occlusions, while ensuring temporal consistency and effectively mitigating motion blur.
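The temporal aggregation idea can be sketched as warping the previous frame's feature map with an optical-flow field and averaging it with the current one before the detection head. The tensor layout, the bilinear warping via `grid_sample`, and the simple averaging fusion are assumptions for illustration; the paper's exact aggregation scheme may differ.

```python
# Sketch of flow-guided feature aggregation across consecutive frames.
# Shapes and the fusion rule (simple average) are illustrative assumptions.
import torch
import torch.nn.functional as F

def warp_features(feat_prev, flow):
    """feat_prev: (B, C, H, W); flow: (B, 2, H, W) in pixel units."""
    b, _, h, w = feat_prev.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).float().to(feat_prev.device)   # (H, W, 2)
    grid = base.unsqueeze(0) + flow.permute(0, 2, 3, 1)                 # add flow
    gx = 2.0 * grid[..., 0] / (w - 1) - 1.0                             # normalize
    gy = 2.0 * grid[..., 1] / (h - 1) - 1.0                             # to [-1, 1]
    return F.grid_sample(feat_prev, torch.stack((gx, gy), dim=-1),
                         align_corners=True)

def aggregate(feat_curr, feat_prev, flow):
    # fused features would then be passed to the YOLOX detection head
    return 0.5 * (feat_curr + warp_features(feat_prev, flow))
```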
This paper presents a novel approach for visible-thermal infrared stereoscopy, focusing on the estimation of disparities of human silhouettes. Visible-thermal infrared stereo poses several challenges, including occlusions and differently textured matching regions in both spectra. Finding matches between two spectra with varying colors, textures, and shapes adds further complexity to the task. To address these challenges, this paper proposes a novel approach in which a high-resolution convolutional neural network is used to better capture relationships between the two spectra. To do so, a modified HRNet backbone is used for feature extraction. This HRNet backbone is capable of capturing fine details and textures, as it extracts features at multiple scales, thereby enabling the use of both local and global information. To match visible and thermal infrared regions, our method extracts features from each patch using two modified HRNet streams. Features from the two streams are then combined by concatenation and correlation to predict the disparities. Results on public datasets demonstrate the effectiveness of the proposed approach, which improves the $\leq$ 1 pixel error by approximately 18 percentage points, highlighting its potential for improving accuracy in this task. The code of VisiTherS is available on GitHub at https://github.com/philippeDG/VisiTherS.
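The sketch below shows one way to combine two feature streams by concatenation and correlation for disparity prediction. The backbones are passed in as opaque modules, and the disparity range, channel counts, and prediction head are placeholders; the actual VisiTherS architecture (based on modified HRNet streams) is described in the paper and code linked above.

```python
# Minimal sketch of a two-stream matcher combining features by concatenation
# and by correlation over candidate disparities. Parameters are placeholders.
import torch
import torch.nn as nn

class TwoStreamMatcher(nn.Module):
    def __init__(self, backbone_vis, backbone_ir, channels=64, max_disp=32):
        super().__init__()
        self.vis, self.ir = backbone_vis, backbone_ir   # feature extractors
        self.max_disp = max_disp
        self.head = nn.Conv2d(2 * channels + max_disp, max_disp, 3, padding=1)

    def correlation(self, f_vis, f_ir):
        # cost volume: dot product between f_vis and f_ir shifted by d pixels
        vols = []
        for d in range(self.max_disp):
            shifted = torch.roll(f_ir, shifts=d, dims=-1)
            vols.append((f_vis * shifted).sum(dim=1, keepdim=True))
        return torch.cat(vols, dim=1)                   # (B, max_disp, H, W)

    def forward(self, vis_patch, ir_patch):
        f_vis, f_ir = self.vis(vis_patch), self.ir(ir_patch)
        fused = torch.cat([f_vis, f_ir, self.correlation(f_vis, f_ir)], dim=1)
        return self.head(fused)                         # per-pixel disparity logits
```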
Site preparation by mounding is a commonly used silvicultural treatment that improves tree growth conditions by mechanically creating planting microsites called mounds. Following site preparation, the next critical step is to count the number of mounds, which provides forest managers with a precise estimate of the number of seedlings required for a given plantation block. Counting the number of mounds is generally conducted through manual field surveys by forestry workers, which is costly and prone to errors, especially for large areas. To address this issue, we present a novel framework exploiting advances in Unmanned Aerial Vehicle (UAV) imaging and computer vision to accurately estimate the number of mounds on a planting block. The proposed framework comprises two main components. First, we exploit a visual recognition method based on a deep learning algorithm for multiple object detection by segmentation. This enables a preliminary count of visible mounds, as well as other frequently seen objects (e.g., trees, debris, accumulation of water), to be used to characterize the planting block. Second, since visual recognition could be limited by several perturbation factors (e.g., mound erosion, occlusion), we employ a machine learning estimation function to predict the final number of mounds based on the local block properties extracted in the first stage. We evaluate the proposed framework on a new UAV dataset representing numerous planting blocks with varying features. The proposed method outperformed manual counting methods in terms of relative counting precision, indicating that it has the potential to be advantageous and efficient in challenging situations.
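For reference, a relative counting precision score can be defined as one minus the relative counting error, as in the toy function below. This is one common formulation; the exact metric used in the evaluation is not given in the abstract.

```python
# Illustrative relative counting precision between a predicted and a reference
# mound count for one block (one common formulation, assumed here).
def relative_counting_precision(predicted, actual):
    """Returns 1 - relative error, clipped to [0, 1]."""
    if actual == 0:
        return 0.0
    return max(0.0, 1.0 - abs(predicted - actual) / actual)

# example: 1,050 predicted mounds vs. 1,000 counted in the field -> 0.95
print(relative_counting_precision(1050, 1000))
```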