We present an example-based surface reconstruction method for scanned point sets. Our approach uses a database of local shape priors built from a set of context models chosen to match the scan. Local neighborhoods of the input scan are matched with enriched patches of these models at multiple scales. Hence, instead of using a single prior for reconstruction, our method allows each region of the scan to match the prior that fits it best. Such high-confidence matches carry relevant information from the prior models to the scan, including normal data and feature classification, and are used to augment the input point set. This allows us to resolve many ambiguities and difficulties that come up during reconstruction, e.g., distinguishing between signal and noise or between gaps in the data and boundaries of the model. We demonstrate how our algorithm, given suitable prior models, successfully handles noisy and under-sampled point sets, faithfully reconstructing smooth regions as well as sharp features.
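A minimal sketch of this matching step, assuming a fixed neighborhood size and a simple PCA-aligned point descriptor in place of the paper's enriched multi-scale patches; `patch_descriptor` and `transfer_normals` are hypothetical names introduced here for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

def patch_descriptor(points):
    """Center a neighborhood and align it to its principal axes,
    returning the flattened coordinates as a crude shape descriptor."""
    c = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(c, full_matrices=False)
    return (c @ vt.T).ravel()

def transfer_normals(scan, prior_patches, prior_normals, k=16):
    """Assign each scan point the normal of the best-fitting prior patch.
    Every prior patch is assumed to contain exactly k points so that
    descriptors are comparable."""
    tree = cKDTree(scan)
    db = np.stack([patch_descriptor(q) for q in prior_patches])
    normals = np.zeros_like(scan)
    for i, pt in enumerate(scan):
        _, idx = tree.query(pt, k=k)                   # local neighborhood
        d = patch_descriptor(scan[idx])
        j = np.argmin(np.linalg.norm(db - d, axis=1))  # nearest prior patch
        normals[i] = prior_normals[j]
    return normals
```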
Harmonic colors are sets of colors that are aesthetically pleasing in terms of human visual perception. In this paper, we present a method that enhances the harmony among the colors of a given photograph or of a general image, while remaining as faithful as possible to the original colors. Given a color image, our method finds the best harmonic scheme for the image colors. It then gracefully shifts hue values to fit the harmonic scheme, using an optimization that enforces spatial coherence among the colors of neighboring pixels. The results demonstrate that our method is capable of automatically enhancing the color "look-and-feel" of an ordinary image. In particular, we show the results of harmonizing the background image to accommodate the colors of a foreground image, or the foreground with respect to the background, in a cut-and-paste setting. Our color harmonization technique proves to be useful in adjusting the colors of an image composed of several parts taken from different sources.
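A minimal sketch of such a hue-shifting step, assuming an HSV hue channel in [0, 1) and a harmonic template given as a set of sector centers; the simple iterative relaxation below stands in for the paper's optimization, and the arithmetic neighbor average is only a rough proxy for a proper circular mean.

```python
import numpy as np

def circ_diff(a, b):
    """Signed shortest circular difference a - b for hues in [0, 1)."""
    return (a - b + 0.5) % 1.0 - 0.5

def harmonize_hues(H, centers, lam=0.5, iters=50):
    """Pull each pixel's hue toward the nearest template sector center
    while keeping it close to the average hue of its 4-neighborhood."""
    H = H.copy()
    centers = np.asarray(centers)
    for _ in range(iters):
        # Data term: nearest harmonic sector center per pixel.
        d = np.stack([np.abs(circ_diff(H, c)) for c in centers])
        target = centers[d.argmin(axis=0)]
        # Smoothness term: neighborhood average (crude circular proxy).
        nb = (np.roll(H, 1, 0) + np.roll(H, -1, 0) +
              np.roll(H, 1, 1) + np.roll(H, -1, 1)) / 4.0
        H = (H + (1 - lam) * circ_diff(target, H)
               + lam * circ_diff(nb, H)) % 1.0
    return H
```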
Creating a layout for an augmented reality (AR) application that embeds virtual objects in a physical environment is difficult, as the layout must adapt to any physical space. We propose a rule-based framework for generating object layouts for AR applications. Under our framework, the developer of an AR application specifies a set of rules (constraints) that enforce self-consistency (rules regarding the inter-relationships of application components) and scene-consistency (application components are consistent with the physical environment they are placed in). When a user enters a new environment, we create, in real time, a layout for the application that satisfies the defined constraints as much as possible. We find the optimal configuration for each object by solving a constraint-satisfaction problem. Our stochastic move-making algorithm is domain-aware and allows us to efficiently converge to a solution for most rule sets. We demonstrate several augmented reality applications that automatically adapt to different rooms and to changing circumstances in each room.
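A minimal sketch of such a stochastic search, assuming rules are penalty functions over a candidate layout and moves reposition one object at a time; the simulated-annealing acceptance rule here is a generic stand-in for the paper's domain-aware move making.

```python
import math
import random

def solve_layout(objects, spots, rules, iters=5000, T0=1.0):
    """Minimize the total rule penalty over object placements."""
    layout = {o: random.choice(spots) for o in objects}
    cost = sum(r(layout) for r in rules)
    for t in range(iters):
        T = T0 * (1 - t / iters) + 1e-6       # cooling schedule
        obj = random.choice(objects)          # move one object
        proposal = dict(layout)
        proposal[obj] = random.choice(spots)
        c = sum(r(proposal) for r in rules)
        # Always accept improvements; accept worse moves with
        # Boltzmann probability to escape local minima.
        if c < cost or random.random() < math.exp((cost - c) / T):
            layout, cost = proposal, c
    return layout, cost
```

For example, a self-consistency rule could be `lambda L: sum(1.0 for a in L for b in L if a != b and L[a] == L[b])`, penalizing two application components placed at the same spot.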
We present an approach to synthesize highly photorealistic images of 3D object models, which we use to train a convolutional neural network for detecting the objects in real images. The proposed approach has three key ingredients: (1) 3D object models are rendered in 3D models of complete scenes with realistic materials and lighting, (2) plausible geometric configurations of objects and cameras in a scene are generated using physics simulations, and (3) high photorealism of the synthesized images is achieved by physically based rendering. When trained on images synthesized by the proposed approach, the Faster R-CNN object detector achieves a 24% absolute improvement of mAP@.75IoU on the Rutgers APC dataset and 11% on LineMod-Occluded, compared to a baseline where the training images are synthesized by rendering object models on top of random photographs. This work is a step towards being able to effectively train object detectors without capturing or annotating any real images. A dataset of 600K synthetic images with ground-truth annotations for various computer vision tasks will be released on the project website: thodan.github.io/objectsynth.
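Ingredient (2) can be sketched with an off-the-shelf physics engine such as PyBullet: drop the object models into the scene and let them settle into physically plausible resting poses before rendering. The URDF file names below are placeholders, not assets from the paper.

```python
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                           # headless simulation
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")                      # supporting surface
bodies = [p.loadURDF("object_%d.urdf" % i,    # placeholder object models
                     basePosition=[0, 0, 1 + i])
          for i in range(5)]
for _ in range(240 * 4):                      # ~4 s at 240 Hz
    p.stepSimulation()
poses = [p.getBasePositionAndOrientation(b) for b in bodies]
p.disconnect()                                # resting poses feed the renderer
```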
We present a new approach for rule-based design layout problems, examples of which include automated graphic design, furniture arrangement, and the synthesis of virtual worlds. We formulate this as a weighted constraint-satisfaction problem that is high-dimensional and requires non-convex optimization. We use a hypergraph to represent a set of objects and the design rules applied to them. Although inference in graphs with higher-order edges is computationally expensive, we present a hybrid solution-space exploration algorithm that adaptively represents higher-order relations compactly. We exploit the fact that for many types of rules, satisfying assignments can be found efficiently; in other words, these rules are locally satisfiable, so we can sample from the known partial probability distribution function of each rule. Experimental results demonstrate that this sampling technique reduces the number of samples required by other algorithms by orders of magnitude. We demonstrate usage via different layout examples, such as superimposing textual and visual elements on photographs. Our adaptive search algorithm is applicable to other optimization problems and generalizes many previously proposed local-search-based algorithms such as graph cuts and iterated conditional modes.
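A minimal sketch of sampling from locally satisfiable rules, assuming each rule exposes a sampler that draws assignments satisfying it and `score` evaluates the full weighted objective; all names are illustrative.

```python
import random

def explore(rule_samplers, score, n_samples=1000):
    """Draw proposals from individual rules' partial distributions and
    keep the one with the lowest joint weighted cost."""
    best, best_cost = None, float("inf")
    for _ in range(n_samples):
        proposal = random.choice(rule_samplers)()  # locally satisfiable draw
        c = score(proposal)                        # joint weighted objective
        if c < best_cost:
            best, best_cost = proposal, c
    return best, best_cost
```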
We present an automatic method to recover high-resolution texture over an object by mapping detailed photographs onto its surface. Such high-resolution detail often reveals inaccuracies in geometry and registration, as well as lighting variations and surface reflections. Simple image projection results in visible seams on the surface. We minimize such seams using a global optimization that assigns compatible texture to adjacent triangles. The key idea is to search not only combinatorially over the source images, but also over a set of local image transformations that compensate for geometric misalignment. This broad search space is traversed using a discrete labeling algorithm, aided by a coarse-to-fine strategy. Our approach significantly improves resilience to acquisition errors, thereby allowing simple and easy creation of textured models for use in computer graphics.
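A minimal sketch of the labeling problem, where each face picks a label combining a source image and a local shift, and adjacent faces pay a seam cost when their labels disagree. Iterated conditional modes is used here as a simple stand-in for the paper's discrete labeling algorithm and coarse-to-fine strategy; `data_cost` and `seam_cost` are assumed callbacks.

```python
import itertools

def icm_texture_labels(faces, adjacency, n_images, shifts,
                       data_cost, seam_cost, sweeps=5):
    """Greedy per-face relabeling: label = (image index, local shift)."""
    labels = {f: (0, shifts[0]) for f in faces}
    for _ in range(sweeps):
        for f in faces:
            best, best_e = labels[f], float("inf")
            for lab in itertools.product(range(n_images), shifts):
                e = data_cost(f, lab) + sum(
                    seam_cost(f, g, lab, labels[g])
                    for g in adjacency[f])        # seam penalties
                if e < best_e:
                    best, best_e = lab, e
            labels[f] = best
    return labels
```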
Simulation is increasingly being used for generating large labelled datasets in many machine learning problems. Recent methods have focused on adjusting simulator parameters with the goal of maximising accuracy on a validation task, usually relying on REINFORCE-like gradient estimators. However, these approaches are very expensive, as they treat the entire data-generation, model-training, and validation pipeline as a black box and require multiple costly objective evaluations at each iteration. We propose an efficient alternative for optimal synthetic data generation, based on a novel differentiable approximation of the objective. This allows us to optimize the simulator, which may be non-differentiable, requiring only one objective evaluation at each iteration with little overhead. We demonstrate on a state-of-the-art photorealistic renderer that the proposed method finds the optimal data distribution faster (up to $50\times$), with significantly reduced training data generation (up to $30\times$) and better accuracy ($+8.7\%$) on real-world test datasets than previous methods.
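The abstract does not spell out the differentiable approximation, so the sketch below substitutes a generic local surrogate: fit a linear model to recent (parameters, validation loss) pairs and step along its gradient, with exactly one real objective evaluation per iteration. This is an illustrative stand-in, not the paper's method; `evaluate` is assumed to return a validation loss for simulator parameters `theta` (a NumPy array).

```python
import numpy as np

def optimize_sim_params(evaluate, theta, steps=50, lr=0.1, hist=8):
    """One real evaluation per step; gradient from a linear surrogate."""
    history = []
    for _ in range(steps):
        history.append((theta.copy(), evaluate(theta)))   # single evaluation
        if len(history) >= 2:
            X = np.stack([h[0] for h in history[-hist:]])
            y = np.array([h[1] for h in history[-hist:]])
            A = np.hstack([X, np.ones((len(X), 1))])      # affine fit
            g = np.linalg.lstsq(A, y, rcond=None)[0][:-1]
            theta = theta - lr * g         # descend surrogate gradient
        else:
            theta = theta + lr * np.random.randn(*theta.shape)
    return min(history, key=lambda h: h[1])[0]            # best parameters
```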
Man-made objects are largely dominated by a few typical features that carry special characteristics and engineered meanings. State-of-the-art deformation tools fall short of preserving such characteristic features and global structure. We introduce iWIRES, a novel approach based on the argument that man-made models can be distilled using a few special 1D wires and their mutual relations. We hypothesize that maintaining the properties of such a small number of wires allows preserving the defining characteristics of the entire object. We introduce an analyze-and-edit approach, where prior to editing, we perform a lightweight analysis of the input shape to extract a descriptive set of wires. Analyzing the individual and mutual properties of the wires, and augmenting them with geometric attributes, makes them intelligent and ready to be manipulated. Editing the object by modifying the intelligent wires leads to a powerful editing framework that retains the original design intent and object characteristics. We show numerous results of manipulation of man-made shapes using our editing technique.
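A minimal sketch of the analysis step, using the common dihedral-angle heuristic to flag candidate feature ("wire") edges on a triangle mesh; iWIRES' actual wire extraction and attribute augmentation are richer than this.

```python
import numpy as np

def wire_edges(verts, faces, angle_thresh_deg=40.0):
    """Return mesh edges whose dihedral angle exceeds the threshold."""
    def face_normal(f):
        a, b, c = (verts[i] for i in f)
        n = np.cross(b - a, c - a)
        return n / (np.linalg.norm(n) + 1e-12)
    edge_faces = {}
    for fi, f in enumerate(faces):
        for e in ((f[0], f[1]), (f[1], f[2]), (f[2], f[0])):
            edge_faces.setdefault(tuple(sorted(e)), []).append(fi)
    wires = []
    for e, fs in edge_faces.items():
        if len(fs) == 2:                      # interior edge
            cosang = np.clip(np.dot(face_normal(faces[fs[0]]),
                                    face_normal(faces[fs[1]])), -1, 1)
            if np.degrees(np.arccos(cosang)) > angle_thresh_deg:
                wires.append(e)
    return wires
```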