Algorithms for estimating epipolar geometry from a pair of images have been very successful in dealing with challenging wide-baseline images. In this paper, the problem of scenes with repeated structures is addressed, dealing with the common case where the overlap between the images consists mainly of the facades of a building. These facades may contain many repeated structures that cannot be matched locally, causing state-of-the-art algorithms to fail. Assuming that the repeated structures lie on a planar surface in an ordered fashion, the goal is to match them. Our algorithm first rectifies the images so that the facade is fronto-parallel. It then clusters similar features in each of the two images and matches the clusters. From these matched clusters, a set of hypothesized homographies of the facade is generated using local groups of features. For each homography, the epipole is recovered, yielding a fundamental matrix. For the best solution, the algorithm then decides whether the fundamental matrix has been recovered reliably and, if not, returns only the homography. The algorithm has been tested on a large number of challenging image pairs of buildings from the benchmark ZuBuD database, outperforming several state-of-the-art algorithms.
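For illustration, the plane-plus-parallax relation behind the final step can be sketched directly: given the facade homography H mapping image 1 to image 2 and the epipole e' in the second image, the fundamental matrix is F = [e']× H. The minimal sketch below assumes NumPy and omits the paper's actual pipeline (rectification, clustering, and hypothesis scoring).

```python
import numpy as np

def skew(e):
    """Cross-product matrix [e]_x, so that skew(e) @ v == np.cross(e, v)."""
    return np.array([[0.0, -e[2], e[1]],
                     [e[2], 0.0, -e[0]],
                     [-e[1], e[0], 0.0]])

def fundamental_from_homography(H, epipole2):
    """Plane-plus-parallax: F = [e']_x H, where H is the homography induced
    by the facade plane (image 1 -> image 2) and e' is the epipole in
    image 2, in homogeneous coordinates. F is returned up to scale."""
    F = skew(np.asarray(epipole2, dtype=float)) @ H
    return F / np.linalg.norm(F)
```

Note that points on the facade automatically satisfy the epipolar constraint under this F, since x2 ~ H x1 implies x2ᵀ[e']×x2 = 0; the epipole carries the remaining off-plane (parallax) information.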
Affective computing for animals is a rapidly expanding research area that goes deeper than automated movement tracking to address animal internal states such as pain and emotions. In mammals, facial expressions can serve to communicate information about these states. However, unlike in human-related studies, there is a significant shortage of datasets that would enable the automated analysis of animal facial expressions. Inspired by the recently introduced Cat Facial Landmarks in the Wild dataset, which presents cat faces annotated with 48 facial anatomy-based landmarks, we develop an analogous dataset containing 3,274 annotated images of dogs, based on a scheme of 46 facial anatomy-based landmarks. The DogFLW dataset is available from the corresponding author upon reasonable request.
This paper discusses the problem of inserting 3D models into a single image. The main focus is the accurate recovery of the camera's parameters, so that 3D models can be inserted at the “correct” position and orientation. The paper addresses two issues. The first is the automatic extraction of the principal vanishing points from an image. The second is a theoretical and experimental analysis of the errors. To test the concept, a system that “plants” virtual 3D objects in the image was implemented and tested on many indoor augmented-reality scenes. Our analysis and experiments show that the errors in the placement of the objects are unnoticeable.
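As a hedged illustration of how vanishing points constrain the camera, the standard closed form below recovers the focal length from two orthogonal vanishing points under the usual simplifying assumptions (square pixels, zero skew, known principal point); the paper's own estimator and its error analysis may differ.

```python
import numpy as np

def focal_from_vanishing_points(v1, v2, principal_point):
    """Focal length from two finite vanishing points of orthogonal
    directions. With K = diag(f, f, 1) and principal point p, the
    orthogonality constraint v1^T (K K^T)^-1 v2 = 0 reduces to
    (v1 - p) . (v2 - p) + f^2 = 0. Inputs are 2D pixel coordinates."""
    p = np.asarray(principal_point, dtype=float)
    d = np.dot(np.asarray(v1, dtype=float) - p, np.asarray(v2, dtype=float) - p)
    if d >= 0:
        raise ValueError("vanishing points inconsistent with orthogonal directions")
    return np.sqrt(-d)
```

A third orthogonal vanishing point, when available, additionally pins down the principal point as the orthocenter of the vanishing-point triangle.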
In this work, we recover the 3D shape of mirrors, sunglasses, and stainless-steel implements. A computer monitor displays several images of parallel stripes, each image at a different angle, and the reflections of these stripes in the mirroring surface are captured by the camera. For every image point, the direction of the displayed stripes and that of their reflections in the image are related by a 1D homography matrix, computed with a robust version of the statistically accurate heteroscedastic approach. By focusing on a sparse set of image points for which monitor-image correspondence is computed, the depth and the local shape can be estimated from these homographies. The depth estimation relies on statistically correct minimization and provides accurate, reliable results. Even for the image points where the depth estimation process is inherently unstable, we are able to characterize this instability and develop an algorithm to detect and correct it. After correcting the instability, dense surface recovery of the mirroring objects is performed using constrained interpolation, which does not simply interpolate the surface depth values but also uses the locally computed 1D homographies to solve for the depth, the correspondence, and the local surface shape. The method was implemented, and the shapes of several objects were densely recovered at submillimeter accuracy.
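A minimal sketch of the 1D-homography fit is given below: stripe directions are encoded as homogeneous 1D points, e.g. (tan θ, 1), and a 2×2 homography is estimated by DLT from at least three direction correspondences. This plain least-squares version merely stands in for the robust heteroscedastic estimator actually used in the paper.

```python
import numpy as np

def fit_1d_homography(src, dst):
    """DLT fit of a 2x2 1D homography H (up to scale) with dst[i] ~ H @ src[i].
    src, dst: sequences of homogeneous 1D points (x, w) encoding directions,
    e.g. (tan(theta), 1). The 1D 'cross product' constraint per pair is
    w'*(h11*x + h12*w) - x'*(h21*x + h22*w) = 0. Needs >= 3 correspondences."""
    A = []
    for (x, w), (xp, wp) in zip(src, dst):
        A.append([wp * x, wp * w, -xp * x, -xp * w])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(2, 2)  # null-space vector, reshaped to 2x2
```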
Image matching is an essential task in many computer vision applications, and thorough utilization of all available information is critical to the success of matching algorithms. However, most popular matching methods do not effectively incorporate photometric data. Some algorithms are based on geometric, color-invariant features, thus completely neglecting the available photometric information. Others assume that color does not differ significantly between the two images; that assumption may be wrong when the images are not taken at the same time, for example when a recently taken image is compared with a database. This paper introduces a method for using color information in image matching tasks. The images are first segmented using an off-the-shelf segmentation process (EDISON); no assumptions are made about the quality of the segmentation. The algorithm then employs a model of natural illumination change to define the probability that two segments originate from the same surface. When additional information is supplied (for example, suspected corresponding point features in both images), the probabilities are updated. We show that these probabilities can easily be utilized in any existing image matching system, and we propose a technique for using them in a SIFT-based algorithm. The technique's capabilities are demonstrated on real images, where it significantly improves the percentage of correct matches found relative to the original SIFT results.
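As a loose illustration only (the paper's illumination model and probability definition are not reproduced here), the snippet below scores a pair of segments under a hypothetical diagonal (von Kries) illumination change: the per-channel gains between mean segment colors are computed, and their spread in log space is converted into a match score with an assumed tolerance sigma.

```python
import numpy as np

def same_surface_score(c1, c2, sigma=0.1):
    """Illustrative stand-in for an illumination-change compatibility test.
    c1, c2: mean RGB colors of two segments, components in (0, 1].
    A diagonal (von Kries) illumination change scales each channel by a
    gain; if both segments show the same surface, the three log-gains
    should be consistent. sigma is a hypothetical tolerance, not a value
    from the paper."""
    gains = np.log(np.asarray(c2, float) / np.asarray(c1, float))
    spread = np.std(gains)  # zero for an exact diagonal illumination change
    return float(np.exp(-0.5 * (spread / sigma) ** 2))
```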
We address the problem of rating or comparing navigation algorithms or, more generally, navigation packages. For a given environment, a navigation package consists of a motion planner and a sensor to be used during navigation. The ability to rate or measure a navigation package is important for addressing issues such as customizing a sensor for an environment and choosing a motion planner for an environment. We develop a framework under which a given navigation package can be rated. Based on the navigation package, a partially observable Markov decision process (POMDP) is defined. Next, an optimal policy for this POMDP is sought. The performance achieved under the resulting policy serves to measure the navigation package. The paper presents the motivation for solving the problem, the model we use, and the framework we have developed.
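To make the construction concrete, here is a minimal, illustrative POMDP container together with the standard Bayes belief update b'(s') ∝ O(o | s', a) Σ_s T(s' | s, a) b(s). The tensor shapes and names are assumptions for the sketch, not the paper's formulation, and the policy search itself is omitted.

```python
import numpy as np

class POMDP:
    """Tabular POMDP sketch: the motion planner shapes the transition
    model T, the sensor shapes the observation model O."""
    def __init__(self, T, O, R):
        self.T = T  # T[a, s, s2] = P(s2 | s, a)
        self.O = O  # O[a, s2, o] = P(o | s2, a)
        self.R = R  # R[a, s]     = expected immediate reward

def belief_update(pomdp, b, a, o):
    """Bayes filter over states: b'(s') ∝ O(o | s', a) * sum_s T(s'|s,a) b(s)."""
    predicted = pomdp.T[a].T @ b             # prediction step: sum over s
    updated = pomdp.O[a, :, o] * predicted   # correction by observation likelihood
    return updated / updated.sum()
```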
Puzzle solving is a difficult problem in its own right, even when the pieces are all square and assemble into a natural image. But what if these ideal conditions do not hold? One such application domain is archaeology, where restoring an artifact from its fragments is highly important. From the point of view of computer vision, archaeological puzzle solving is very challenging due to three additional difficulties: the fragments are of general shape; they are abraded, especially at the boundaries (where the strongest cues for matching should exist); and the domain of valid transformations between the pieces is continuous. The key contribution of this paper is a fully automatic and general algorithm that addresses puzzle solving in this intriguing domain. We show that our state-of-the-art approach correctly reassembles dozens of broken artifacts and frescoes.