Stefan Miesbach

AbstractThe basic idea of linear principal component analysis (PCA) involves decorrelating coordinates by an orthogonal linear transformation. In this paper we generalize this idea to the nonlinear case. Simultaneously we shall drop the usual restriction to Gaussian distributions. The linearity and orthogonality condition of linear PCA is replaced by the condition of volume conservation in order to avoid spurious information generated by the nonlinear transformation. This leads us to another very general class of nonlinear transformations, called symplectic maps. Later, instead of minimizing the correlation, we minimize the redundancy measured at the output coordinates. This generalizes second-order statistics, being only valid for Gaussian output distributions, to higher-order statistics. The proposed paradigm implements Barlow's redundancy-reduction principle for unsupervised feature extraction. The resulting factorial representation of the joint probability distribution presumably facilitates density estimation and is applied in particular to novelty detection.

Linear map

Higher-order statistics

10.1088/0954-898x_6_1_004

Cite

Citations (37)

Optimal Temperature Control of Semibatch Polymerization Reactors

Springer eBooks (1996)

H. Hinsberger Stefan Miesbach Hans Josef Pesch

Set point

Constant (computer programming)

10.1007/978-3-642-80149-5_9

Cite

Citations (9)

CONCEPTS AND METHODS IN NEURAL CONTROL

Elsevier eBooks (1993)

Stefan Miesbach Bernd Schürmann

Relevance

Neural system

10.1016/b978-0-444-89958-3.50014-0

Cite

Citations (0)

Redundancy reduction with information-preserving nonlinear maps

Network Computation in Neural Systems (1995)

Lucas C. Parra Gustavo Deco Stefan Miesbach

AbstractThe basic idea of linear principal component analysis (PCA) involves decorrelating coordinates by an orthogonal linear transformation. In this paper we generalize this idea to the nonlinear case. Simultaneously we shall drop the usual restriction to Gaussian distributions. The linearity and orthogonality condition of linear PCA is replaced by the condition of volume conservation in order to avoid spurious information generated by the nonlinear transformation. This leads us to another very general class of nonlinear transformations, called symplectic maps. Later, instead of minimizing the correlation, we minimize the redundancy measured at the output coordinates. This generalizes second-order statistics, being only valid for Gaussian output distributions, to higher-order statistics. The proposed paradigm implements Barlow's redundancy-reduction principle for unsupervised feature extraction. The resulting factorial representation of the joint probability distribution presumably facilitates density estimation and is applied in particular to novelty detection.

Linear map

Higher-order statistics

10.1088/0954-898x/6/1/004

Cite

Citations (12)

LIST OF CONTRIBUTORS (in alphabetical order)

Elsevier eBooks (1993)

Mazuo Aizawa Hans-Werner Bothe Eric R. Cosman Andreas Dress Rolf Eckmiller

10.1016/b978-0-444-89958-3.50026-7

Cite

Citations (0)

Statistical Independence and Novelty Detection with Information Preserving Nonlinear Maps

Neural Computation (1996)

Lucas C. Parra Gustavo Deco Stefan Miesbach

According to Barlow (1989), feature extraction can be understood as finding a statistically independent representation of the probability distribution underlying the measured signals. The search for a statistically independent representation can be formulated by the criterion of minimal mutual information, which reduces to decorrelation in the case of gaussian distributions. If nongaussian distributions are to be considered, minimal mutual information is the appropriate generalization of decorrelation as used in linear Principal Component Analyses (PCA). We also generalize to nonlinear transformations by only demanding perfect transmission of information. This leads to a general class of nonlinear transformations, namely symplectic maps. Conservation of information allows us to consider only the statistics of single coordinates. The resulting factorial representation of the joint probability distribution gives a density estimation. We apply this concept to the real world problem of electrical motor fault detection treated as a novelty detection task.

Decorrelation

Representation

Independence

Novelty Detection

10.1162/neco.1996.8.2.260

Cite

Citations (145)

Efficient gradient computation for continuous and discrete time-dependent neural networks

Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005. (1991)

Stefan Miesbach

The author provides calculus-of-variations techniques for the construction of backpropagation-through-time (BTT) algorithms for arbitrary time-dependent recurrent neural networks with both continuous and discrete dynamics. The backpropagated error signals are essentially Lagrange multipliers. The techniques are easy to handle because they can be embedded into the Hamiltonian formalism widely used in optimal control theory. Three examples of important extensions to the standard BTT-algorithm provide proof of the power of the method. An implementation of the BTT-algorithms which overcomes the storage drawbacks is suggested.< >

Backpropagation

Expressive power

Formalism (music)

10.1109/ijcnn.1991.170737

Cite

Citations (5)

Synthesis of Optimal Strategies for Differential Games by Neural Networks

Birkhäuser Boston eBooks (1995)

Hans Josef Pesch I. Gabler Stefan Miesbach Michael H. Breitner

The paper deals with the numerical approximation of optimal strategies for two-person zero-sum differential games of pursuit-evasion type by neural networks. Thereby, the feedback strategies can be computed in real-time after the training of appropriate neural networks. For this purpose, sufficiently many optimal trajectories and their associated open-loop representations of the optimal feedback strategies must be computed, to provide data for training and cross-validation of the neural networks. All the precomputations can be carried through in a highly parallel way. This approach turns out to be applicable for differential games of more general type.The method is demonstrated for a modified cornered rat game where a pursuing cat and an evading rat, both moving in simple motion, are constrained to a rectangular arena. Two holes in the walls surrounding the arena enable the rat to evade once and for all, if the rat is not too far from these holes. The optimal trajectories in the escape zone can be computed analytically. In the capture zone, a game of degree is employed with terminal time as payoff. To compute optimal trajectories for this secondary game, the time evolution of the survival region for the rat is determined via a sequence of discretized games.The combination of these methods permits the computation of more than a thousand trajectories leading to some ten thousand sample patterns which relate the state variables to the values of the optimal strategies. These data exhibit characteristic properties of the optimal strategies. It is shown that these properties can be extracted from the data by use of neural networks. By means of the trained networks, about 200 trajectories arc finally simulated. The pursuer as well as the evader acts according to the controls proposed by the neural networks. Despite the simple structure of the neural networks used in this study, the strategics based upon them show a reasonable, close to optimal performance in a large variety of simulations of the pursuit-evasion game under consideration.

Stochastic game

Differential game

Sequence (biology)

10.1007/978-1-4612-4274-1_6

Cite

Citations (21)

Symplectic phase flow approximation for the numerical integration of canonical systems

Numerische Mathematik (1992)

Stefan Miesbach Hans Josef Pesch

Ansatz

Runge–Kutta methods

Symplectic integrator