The Rockefeller University » Hidden Traveling Waves in Artificial Recurrent Neural Networks Encode Working Memory

Event Details

Type: Center for Studies in Physics and Biology Seminars
Speaker(s): Arjun Karuvally, Ph.D. candidate, University of Massachusetts Amherst
Abstract: Traveling waves are integral to brain function and are hypothesized to be crucial for short-term information storage. This study introduces a theoretical model based on traveling wave dynamics within a lattice structure to simulate neural working memory. We theoretically analyze the model's capacity to represent state and temporal information, which is vital for encoding recent history in history-dependent dynamical systems. In addition to enabling robust short-term memory storage, our analysis reveals that these dynamics can alleviate the vanishing gradient problem, which poses a significant challenge in the practical training of recurrent neural architectures. We explore the model's application under two boundary conditions: linear and non-linear, the latter driven by self-attention mechanisms. Experimental findings show that both randomly initialized and backpropagation-trained Recurrent Neural Networks (RNNs) naturally exhibit linear traveling wave dynamics, suggesting a potential working memory mechanism within these networks. This mechanism remains concealed within the high-dimensional state space of the RNN and becomes apparent only through a specific basis transformation proposed by our model. In contrast, the non-linear case aligns with the autoregressive loops in attention-based transformers, which underpin many recent advances in AI. The results highlight the relevance of traveling waves to artificial intelligence, improving our understanding of existing black-box neural computation and offering a foundational theory for future enhancements in neural network design.
Open to: Tri-Institutional