Chaos and learning in recurrent neural networks
Corwin, Edward M.
Recurrent neural networks have received a great deal of attention recently because of the variety of dynamic behaviors produced by such networks. However, attempts to train recurrent networks to produce chaotic behavior have met with great difficulty. This work examines a fundamental problem with training recurrent neural networks to produce discrete chaotic sequences and proposes a training approach which addresses the difficulties outlined. A major result of this work is a proof that a continuous model, with a bounded derivative, for a discrete chaotic system does not exist. This result motivates a training algorithm derived using discrete mathematics; other algorithms assume the existence of a continuous model, which has now been shown not to exist. Networks were trained on several data sets using both the traditional continuous methods and the discrete algorithm. The discrete rule improved training accuracy for all networks on the given data sets. The discrete training rule, for which one-time-step and multiple-time-step variations are presented, produced better results than Logar's training rule for the Aihara-style network and better results than Pearlmutter's algorithm for his network. A simplified proof of Hayashi's training rule, based on discrete mathematics, is also presented; it has the advantage of being extendable to a recursive multiple-time-step training algorithm. A new hybrid network, and its associated learning rules, is also presented, in which coupled oscillators are positioned inside a feedforward network. This approach alleviates some of the difficulties inherent in training a pure oscillator network and greatly improved training accuracy. Extensions to the weight projection algorithm are also presented.
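As a minimal illustration of the kind of discrete chaotic sequence discussed above (this example is not taken from the thesis), the logistic map at parameter r = 4 is a canonical discrete chaotic system: nearby initial conditions diverge exponentially, which is part of what makes such sequences difficult training targets.

```python
# Illustration (not from the thesis): the logistic map, a canonical
# discrete chaotic system, showing sensitivity to initial conditions.

def logistic_map(x0, r=4.0, steps=50):
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and return the trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_map(0.4)
b = logistic_map(0.4 + 1e-9)  # tiny perturbation of the initial condition
# After ~50 steps the two trajectories are no longer close: the initial
# 1e-9 separation has been amplified enormously, the hallmark of chaos.
divergence = abs(a[-1] - b[-1])
```

A smooth, bounded-derivative model of such a sequence cannot track it for long, which is the intuition behind the nonexistence result stated in the abstract.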
The main results are that a quadratic approximation is the most effective choice for fitting a curve to the weight trajectory, that the extrapolation distance can be doubled if points generated later in the trajectory are weighted more heavily than those generated earlier, and that a goodness-of-fit test can detect a poor projection before the extrapolation is made.
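The three ingredients above can be sketched as follows. This is a hedged reconstruction, not the thesis's exact algorithm: the weighting scheme, the R² goodness-of-fit statistic, and the threshold value are all illustrative assumptions.

```python
# Sketch (assumed details, not the thesis's exact algorithm): fit a
# quadratic to a weight's recent trajectory, weighting later points
# more heavily, then extrapolate, vetoing projections that fit poorly.
import numpy as np

def project_weight(history, ahead, r2_threshold=0.99):
    """Extrapolate a single weight's trajectory `ahead` steps forward.

    Returns the projected value, or None if the quadratic fits poorly.
    """
    t = np.arange(len(history), dtype=float)
    w = np.asarray(history, dtype=float)
    # Linearly increasing sample weights favor the most recent points,
    # mirroring the "weight later points more heavily" result.
    sample_weights = np.linspace(1.0, 3.0, len(history))
    coeffs = np.polyfit(t, w, deg=2, w=sample_weights)
    fitted = np.polyval(coeffs, t)
    # A simple R^2 statistic stands in for the goodness-of-fit test.
    ss_res = float(np.sum((w - fitted) ** 2))
    ss_tot = float(np.sum((w - w.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    if r2 < r2_threshold:
        return None  # poor projection detected before extrapolating
    return float(np.polyval(coeffs, len(history) - 1 + ahead))

# A trajectory that really is quadratic projects accurately:
hist = [0.25 * k**2 - 0.1 * k for k in range(10)]
proj = project_weight(hist, ahead=5)
```

The veto step matters in practice: extrapolating a badly fit curve moves the weights in an arbitrary direction, so skipping the projection is the safe default.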