This article presents a new natural user interface for controlling and manipulating a 3D animation using the Kinect. The researchers design a set of gestures that allow the user to play, pause, forward, rewind, scale, and rotate the 3D animation. They also implement a traditional cursor-based interface and compare it with the natural user interface. Both interfaces are extensively evaluated in a user study in terms of both usability and user experience. Through quantitative and qualitative evaluation, they show that a gesture-based natural user interface is the preferred method for controlling a 3D animation compared to a cursor-based interface. The natural user interface not only proved more efficient but also resulted in a more engaging and enjoyable user experience.
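The paper's gesture set maps naturally onto a small animation state machine. The following sketch is purely illustrative: the gesture names and the controller class are assumptions, not the authors' implementation, and real gesture recognition from Kinect skeleton data is out of scope here.

```python
# Hypothetical sketch: dispatching recognized Kinect gestures to
# animation-control commands. Gesture names are illustrative only.

class AnimationController:
    """Minimal animation state machine driven by gesture events."""

    def __init__(self):
        self.playing = False
        self.speed = 1.0      # > 0 plays forward, < 0 rewinds
        self.scale = 1.0
        self.rotation = 0.0   # degrees about the vertical axis

    def handle(self, gesture):
        if gesture == "swipe_right":      # play / forward
            self.playing, self.speed = True, abs(self.speed)
        elif gesture == "swipe_left":     # rewind
            self.playing, self.speed = True, -abs(self.speed)
        elif gesture == "push":           # pause
            self.playing = False
        elif gesture == "spread_hands":   # scale up
            self.scale *= 1.1
        elif gesture == "close_hands":    # scale down
            self.scale /= 1.1
        elif gesture == "circle":         # rotate
            self.rotation = (self.rotation + 15.0) % 360.0
        return self.playing, self.speed, self.scale, self.rotation

ctrl = AnimationController()
ctrl.handle("swipe_right")   # start playback
ctrl.handle("spread_hands")  # enlarge the model
```

A cursor-based equivalent would route button clicks to the same `handle` method, which is what makes the two interfaces directly comparable in a study.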
We present a new comprehensive RGB-D dynamic facial dataset capturing system that can be used for face recognition, emotion recognition, or visual speech processing. Our facial dataset uses an RGB-D (Kinect) camera to record 20 individuals saying 20 common English words or phrases. Using Kinect facial tracking, we record not only the facial feature points but also the facial outline, RGB data, depth data, the mapping between RGB and depth data, facial animation units, facial shape units, and finally 2D and 3D representations of the face along with the 3D head orientation. The captured RGB-D dynamic facial dataset can be employed in several applications. We demonstrate its effectiveness by presenting a new visual speech recognition system that employs three-dimensional spatial and temporal data of different facial feature points. The results demonstrate that our RGB-D dynamic facial dataset can be effectively employed in a visual speech recognition system.
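The per-frame channels listed above suggest a record layout like the following. This is a minimal sketch, assuming illustrative field names; the dataset's actual file format and field names are not specified here.

```python
# Illustrative per-frame record for an RGB-D facial capture, mirroring
# the channels listed above. Field names are assumptions, not the
# dataset's actual schema.

from dataclasses import dataclass, field

@dataclass
class FacialFrame:
    rgb: bytes                # raw RGB image
    depth: bytes              # raw depth map
    rgb_to_depth_map: list    # per-pixel RGB-to-depth correspondence
    feature_points_2d: list   # (x, y) facial feature points
    feature_points_3d: list   # (x, y, z) facial feature points
    animation_units: list     # Kinect facial animation units (AUs)
    shape_units: list         # Kinect facial shape units (SUs)
    head_orientation: tuple   # (pitch, yaw, roll) in degrees

@dataclass
class Utterance:
    speaker_id: int
    phrase: str               # one of the 20 recorded words/phrases
    frames: list = field(default_factory=list)

u = Utterance(speaker_id=3, phrase="hello")
u.frames.append(FacialFrame(b"", b"", [], [], [], [], [], (0.0, 0.0, 0.0)))
```

A visual speech recognizer would then consume the sequence of `feature_points_3d` across `frames` as its spatio-temporal input.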
In multi-user virtual environments, real-world people interact via digital avatars. To make the step from the real world onto the virtual stage convincing, the digital equivalent of the user has to be personalized: it should reflect the shape and proportions, the kinematic properties, and the textural appearance of its real-world counterpart. In this paper, we present a novel spatio-temporal approach to create a personalized avatar from multi-view video data of a moving person. The avatar's geometry is generated by shape-adapting a template human body model. A consistent surface texture for the model is assembled from multi-view video frames taken from different camera views and showing arbitrary different body poses. With our proposed method, photo-realistic human avatars can be robustly generated.
The creation of high-quality animations of real-world human actors has long been a challenging problem in computer graphics. It involves modeling the shape of the virtual actors, creating their motion, and reproducing very fine dynamic details. In order to render the actor under arbitrary lighting, reflectance properties must be modeled for each point on the surface. These steps, which are usually performed manually by professional modelers, are time-consuming and cumbersome.
In this thesis, we show that algorithmic solutions for some of the problems that arise in the creation of high-quality animations of real-world people are possible using multi-view video data. First, we present a novel spatio-temporal approach to create a personalized avatar from multi-view video data of a moving person. Thereafter, we propose two enhancements to a method that captures the human shape, motion, and reflectance properties of a moving human using eight multi-view video streams. Afterwards, we extend this work: in order to add very fine dynamic details, such as wrinkles and folds in the clothing, to the geometric models, we make use of the multi-view video recordings and present a statistical method that can passively capture the fine-grain details of time-varying scene geometry. Finally, in order to reconstruct structured shape and animation of the subject from video, we present a dense 3D correspondence finding method that enables spatio-temporally coherent reconstruction of surface animations directly from multi-view video data.
These algorithmic solutions can be combined to constitute a complete animation pipeline for the acquisition, reconstruction, and rendering of high-quality virtual actors from multi-view video data. They can also be used individually in systems that require the solution of a specific algorithmic sub-problem. The results demonstrate that, using multi-view video data, it is possible to find a model description that enables realistic appearance of animated virtual actors under different lighting conditions and exhibits high-quality dynamic details in the geometry.
The creation of high-quality animations of human actors has long been a difficult problem in computer graphics. It involves modeling a three-dimensional representation of the actor and their motion, and reproducing very fine dynamic details. To render the actor under arbitrary lighting, the reflectance properties of every single surface point must also be modeled. These steps, which are usually carried out manually by professional modelers, are time-consuming and cumbersome.
In this thesis, we propose algorithmic solutions for some of the problems that arise in the creation of such high-quality animations. First, we present a novel spatio-temporal approach to create an avatar from multi-view video data of a moving person. Thereafter, we describe a video-based modeling approach using an animated template of a human body. With the help of a handful of synchronized video recordings, we compute the three-dimensional representation, its motion, and the reflectance properties of the surface. To add very fine dynamic details, such as wrinkles and folds in the clothing, to the geometric models, we present a statistical method that can passively capture the fine details of time-varying scene geometry. Finally, we present a method that finds dense 3D correspondences in order to extract the structured shape and the associated motion from video. This enables a spatio-temporally coherent reconstruction of surface animations directly from multi-view video data.
These algorithmic solutions can be combined into an animation pipeline for the acquisition, reconstruction, and rendering of high-quality animations from multi-view video data. They can also be used individually in a system that requires the solution of a specific algorithmic sub-problem. The result is a model description that enables a realistic appearance of animated virtual actors with high-quality dynamic details under different lighting conditions.
There has been a recent proliferation in wireless infrastructure network deployments. In a typical deployment, an installer uses either a one-time site survey or rules of thumb to place wireless access points and to assign channels and power levels to them. Because the access point location problem is inherently complex and requires tradeoffs among competing requirements, these approaches can result in either dead spots or significant unintended interference among wireless access points. This degrades network performance for end clients, with throughput reductions by factors of up to 4x found in field measurements. In this paper, we take a first step towards improving client performance by coordinating the choices of channels and power levels at wireless access points using a successive refinement approach. Our contributions are two-fold: first, we develop a mathematical model that crisply defines the solution space and identifies the characteristics of an optimal channel and power-level configuration. Second, we present heuristics that, under some simplifying assumptions, yield near-optimal configurations. We use Monte Carlo simulations to evaluate the performance of our heuristics. We find that the choice of heuristics for transmit power control impacts performance more than the channel allocation strategy, especially at high densities. Also, surprisingly, randomly assigning channels to access points appears to be an effective strategy at higher deployment densities. Taken together, we believe that this study paves the way to designing rapidly deployable real-world infrastructure networks that also have good performance.
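The random channel-assignment strategy discussed above can be illustrated with a toy Monte Carlo experiment: place access points at random, assign channels uniformly at random, and count co-channel pairs within interference range, averaged over trials. This is a hedged sketch with made-up parameters; the paper's actual model and heuristics are more elaborate.

```python
# Toy Monte Carlo sketch: random channel assignment for randomly placed
# APs, counting co-channel pairs within interference range. All parameter
# values (area size, range, counts) are illustrative assumptions.

import random

def cochannel_conflicts(n_aps, n_channels, area=100.0,
                        interference_range=30.0, seed=0):
    rng = random.Random(seed)
    # Each AP: (x, y, randomly assigned channel)
    aps = [(rng.uniform(0, area), rng.uniform(0, area),
            rng.randrange(n_channels)) for _ in range(n_aps)]
    conflicts = 0
    for i in range(n_aps):
        for j in range(i + 1, n_aps):
            (x1, y1, c1), (x2, y2, c2) = aps[i], aps[j]
            same_channel = (c1 == c2)
            close = ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5 <= interference_range
            if same_channel and close:
                conflicts += 1
    return conflicts

# Monte Carlo estimate: average conflict count over many random trials.
trials = [cochannel_conflicts(n_aps=20, n_channels=3, seed=s) for s in range(100)]
avg = sum(trials) / len(trials)
```

Sweeping `n_aps` in such a simulation is one way to see how deployment density changes the relative merit of assignment strategies.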
Nowadays, almost every application on the Internet relies on some type of database to store client data on the server side. These databases often store session identifiers, financial details, medical records, and other privacy-critical data. Consequently, the security of these databases is critical, but it cannot be guaranteed given modern software complexity and the ease of launching cyber-attacks. If an intruder gains access to the database, he can alter user data to the detriment of the data owner's interests and safety. In this paper, we propose a solution to this problem and demonstrate it in the form of a proof of concept. Our solution provides a cryptographic proof with each database query, which lets the end client verify the result in real time with minimal overhead, even in a fully compromised server-side environment. A successful verification means that the result of the database query is correct, complete, and fresh. Our proposed solution consists of three query verification components: (i) a query verification client, (ii) a query verification server, and (iii) a blockchain. We have implemented the proof of concept in Node.js, and evaluation results show that it works in real time with minimal overhead while guaranteeing the freshness, completeness, and correctness of data.
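The core verification idea can be sketched as follows. This is a minimal illustration, not the authors' actual protocol: the data owner anchors a digest of the data on an immutable store (standing in for the blockchain), and the client recomputes the digest over the returned rows plus a freshness nonce. All names and structures here are assumptions.

```python
# Minimal sketch of client-side query-result verification against an
# immutable commitment. NOT the paper's protocol; names are illustrative.

import hashlib
import json

def result_digest(rows, nonce):
    """Hash the result set together with a freshness nonce."""
    payload = json.dumps({"rows": rows, "nonce": nonce}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

# Stand-in for the blockchain: an append-only map of commitments that a
# compromised server cannot rewrite.
chain = {}

def owner_commit(query_id, rows, nonce):
    """Data owner anchors a commitment at write time."""
    chain[query_id] = result_digest(rows, nonce)

def client_verify(query_id, rows, nonce):
    """Client recomputes the digest and checks it against the chain."""
    return chain.get(query_id) == result_digest(rows, nonce)

rows = [{"id": 1, "balance": 100}]
owner_commit("q1", rows, nonce=42)
assert client_verify("q1", rows, nonce=42)                             # untampered
assert not client_verify("q1", [{"id": 1, "balance": 999}], nonce=42)  # tampered
```

Crucially, the commitment must come from the data owner (or another trusted writer), not from the server being verified; otherwise a compromised server could simply commit to tampered data.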
Security and dependability are crucial for designing trustworthy systems. The approach of "security as an add-on" is not satisfactory, yet the integration of security into the development process is still an open problem. In particular, a common framework for specifying dependability and security is much needed. There are many pressing challenges; here, we address some of them. Firstly, security for dependable systems is a broad concept, and the traditional view of security, e.g., in terms of confidentiality, integrity, and availability, does not suffice. Secondly, a clear definition of security in the dependability context is not agreed upon. Thirdly, security attacks cannot be modeled as a stochastic process, because the adversary's strategy is often carefully planned. In this chapter, we explore these challenges and provide some directions toward their solutions.
This paper proposes a new marker-less approach to capturing human performances from multi-view video. Our algorithm can jointly reconstruct the spatio-temporally coherent geometry, motion, and textural surface appearance of actors performing complex and rapid moves. Furthermore, since our algorithm is purely mesh-based and makes as few prior assumptions as possible about the type of subject being tracked, it can even capture performances of people wearing wide apparel, such as a dancer wearing a skirt. To serve this purpose, our method efficiently and effectively combines the power of surface- and volume-based shape deformation techniques with a new mesh-based analysis-through-synthesis framework. This framework extracts motion constraints from video and makes the laser scan of the tracked subject mimic the recorded performance. Also, small-scale time-varying shape detail is recovered by applying model-guided multi-view stereo to refine the model surface. Our method delivers captured performance data at a high level of detail, is highly versatile, and is applicable to many complex types of scenes that could not be handled by alternative marker-based or marker-free recording techniques.
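The overall analysis-through-synthesis loop can be sketched as follows. This is a high-level illustration only: the real method uses surface- and volume-based mesh deformation and model-guided multi-view stereo, which are reduced here to trivial stand-in functions so the control flow is runnable.

```python
# Illustrative control flow of an analysis-through-synthesis capture loop.
# The three helpers are stand-ins, not the paper's actual algorithms.

def extract_motion_constraints(frame, mesh):
    # Stand-in: real constraints are extracted from multi-view video.
    return frame["translation"]

def deform_mesh(mesh, t):
    # Stand-in for surface-/volume-based shape deformation: here just a
    # rigid translation of every vertex.
    return [(x + t[0], y + t[1], z + t[2]) for (x, y, z) in mesh]

def refine_with_stereo(mesh, frame):
    # Stand-in for model-guided multi-view stereo surface refinement.
    return mesh

def capture_performance(laser_scan, video_frames):
    """Make the laser-scanned template mimic the recorded performance."""
    mesh, sequence = laser_scan, []
    for frame in video_frames:
        constraints = extract_motion_constraints(frame, mesh)
        mesh = deform_mesh(mesh, constraints)    # coarse pose/shape update
        mesh = refine_with_stereo(mesh, frame)   # fine time-varying detail
        sequence.append(mesh)
    return sequence

seq = capture_performance([(0.0, 0.0, 0.0)],
                          [{"translation": (1.0, 0.0, 0.0)}])
```

The key property the loop preserves is that the same template mesh is deformed frame after frame, which is what yields spatio-temporally coherent geometry.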