Information theoretic derivation of network architecture and learning algorithms

1991 
Using variational techniques, the authors derive a feedforward network architecture that minimizes a least-squares cost function subject to the soft constraint that the mutual information between input and output is maximized, permitting optimum generalization for a given accuracy. The architecture resembles local radial basis function networks with two important modifications: a normalization that greatly reduces the data requirements, and an extra set of gradient-style weights that improves interpolation. The linear weights are learned by linear Kalman filtering, and gradient descent on the composite cost function yields a learning rule that adjusts the basis function widths for good generalization. The resulting network and learning algorithms are tested on a set of problems emphasizing time series prediction.
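A minimal sketch, assuming Gaussian basis functions, of the normalized RBF architecture with per-unit "gradient style" (slope) weights described above, together with a scalar-observation linear Kalman filter step for the linear parameters. All names and the toy data are illustrative assumptions, not the paper's exact equations.

```python
import numpy as np

def features(x, centers, widths):
    """Return the regressor h(x) for y = h(x) . theta, where theta stacks the
    constant weights w_i and the slope (gradient style) weights g_i."""
    diff = x - centers                                       # (n, d)
    phi = np.exp(-np.sum(diff**2, axis=1) / (2.0 * widths**2))
    phi_norm = phi / np.sum(phi)                             # normalization step
    # Constant terms followed by the locally linear terms g_i . (x - c_i).
    return np.concatenate([phi_norm, (phi_norm[:, None] * diff).ravel()])

def kalman_step(theta, P, h, y_target, r=1e-2):
    """One linear Kalman filter (recursive least squares) update of theta,
    with measurement noise variance r and parameter covariance P."""
    Ph = P @ h
    k = Ph / (h @ Ph + r)                                    # Kalman gain
    theta = theta + k * (y_target - h @ theta)
    P = P - np.outer(k, Ph)
    return theta, P

# Toy usage: fit a 1-D function online with 5 normalized Gaussian units.
rng = np.random.default_rng(0)
centers = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
widths = np.full(5, 0.2)
theta = np.zeros(5 + 5 * 1)                                  # w_i and g_i stacked
P = np.eye(theta.size) * 10.0                                # prior covariance

for _ in range(200):
    x = rng.uniform(0.0, 1.0, size=1)
    y = np.sin(2.0 * np.pi * x[0])                           # target function
    theta, P = kalman_step(theta, P, features(x, centers, widths), y)

x_test = np.array([0.25])
print(features(x_test, centers, widths) @ theta, np.sin(2.0 * np.pi * 0.25))
```

The width adaptation described in the abstract (gradient descent on the composite cost) is omitted here; the widths are held fixed in this sketch.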