The basic theory driving my work isn't really all that complicated, though I can easily get bogged down in the details, an urge I'll have to try to resist tomorrow.
In a nutshell, I'm proposing a model of how we learn and process sequences. Virtually everything you do is a result of either learning new sequences, recognizing ones you've already learned, or generating them. So it's crucial that we try to understand how this works. I was inspired a great deal by Jeff Hawkins' ideas, but I saw a large gap between his theory and how it might actually be implemented based on what we know about the brain. My model is an attempt to try to fill in some of that gap.
First of all, the brain is hierarchical, which just means that some areas sit "above" others in the processing stream while others sit "below" them. So broadly speaking, there are three kinds of connectivity:
- Feedforward: from lower to higher areas
- Lateral: between neighbors in the same area
- Feedback: from higher to lower areas
I'm proposing that each of these types of connectivity plays a distinct role in learning and processing sequences.
Feedforward connectivity allows you to "chunk" sequential information. For example, when you learn a phone number, you typically learn it as a chunk of 3 numbers combined with a chunk of 4 numbers. If the whole system is arranged hierarchically, then we can group smaller chunks into larger chunks, up and up the hierarchy, so that we can efficiently store very long sequences.
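The chunking idea can be sketched in a few lines of Python. This is purely my own toy illustration (fixed pairwise grouping, ordinary lists), not the actual model: each level combines adjacent chunks from the level below, so the number of stored units shrinks as you go up the hierarchy.

```python
def chunk_pairs(sequence):
    """Combine adjacent items into pairwise chunks (one level up)."""
    return [tuple(sequence[i:i + 2]) for i in range(0, len(sequence), 2)]

def build_hierarchy(sequence):
    """Repeatedly chunk until the whole sequence is one top-level chunk."""
    levels = [list(sequence)]
    while len(levels[-1]) > 1:
        levels.append(chunk_pairs(levels[-1]))
    return levels

levels = build_hierarchy("abcdefgh")  # 8 -> 4 -> 2 -> 1 chunks per level
```

The point of the sketch is the storage pattern: a long sequence ends up represented by a handful of nested chunks rather than one flat list, which is what makes this representation efficient and scalable.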
Lateral connectivity allows you to learn pairwise sequences, for example, what comes after "g" in the alphabet. I'm hypothesizing that the type of learning that occurs between neighbors allows for a kind of domino effect. You hear "a" and it's like knocking over the domino for "b" and then "c" and so on, in a cascading effect. This type of representation is directional (i.e. it's difficult to say the alphabet backwards) and content-addressable (which means that you can reproduce the sequence simply by being given some small part of it, like the first few notes of a song).
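The domino effect can be mimicked with a simple forward map, where each element points only to its successor. This toy dict is my stand-in for learned lateral connections; it isn't how the model stores them, but it captures both properties: recall is content-addressable (cue anywhere, replay forward) and directional (there is no backward map).

```python
# Learn pairwise successors: "a" knocks over "b", "b" knocks over "c", ...
alphabet = "abcdefghijklmnopqrstuvwxyz"
next_of = {cur, nxt} if False else {c: n for c, n in zip(alphabet, alphabet[1:])}

def cascade(cue, steps):
    """Replay the sequence forward from any cue (content-addressable)."""
    out = [cue]
    for _ in range(steps):
        if out[-1] not in next_of:
            break  # end of the learned sequence
        out.append(next_of[out[-1]])
    return "".join(out)

cascade("g", 3)  # cueing "g" cascades forward to "ghij"
```

Going backwards would require a second, separately learned map, which mirrors why reciting the alphabet in reverse feels like learning a new sequence.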
You can see how these two types of representations might complement one another. The first is efficient and scalable, but isn't content-addressable. The second is content-addressable, but doesn't scale well.
Finally, feedback connectivity has been hypothesized to push predictions back down the hierarchy, which helps when we're confronted with noisy input (e.g. a cell phone conversation). If there are gaps or noise in what we're sensing, the higher-level nodes transmit, via feedback connectivity, what they expect to experience next, and that expectation helps fill in the missing pieces, making the whole system robust and reliable.
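Here is a hedged sketch of that gap-filling idea, under my own simplifying assumptions (a higher level that has already learned a set of chunks, and missing input marked as `None`). The names and matching rule are illustrative, not part of the actual model.

```python
KNOWN_CHUNKS = {"hello", "world"}  # sequences the higher level has learned

def fill_gaps(noisy):
    """Use a matching learned chunk (top-down expectation) to fill gaps."""
    for chunk in KNOWN_CHUNKS:
        if len(chunk) == len(noisy) and all(
            s is None or s == c for s, c in zip(noisy, chunk)
        ):
            return chunk  # feedback: the expectation fills the missing pieces
    return noisy  # no matching expectation; pass the raw input through

fill_gaps(["h", None, "l", "l", None])  # recovers "hello"
```

Even with two of five elements missing, the higher level's expectation uniquely identifies the chunk, which is the sense in which top-down prediction makes the system robust to noise.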
My plan is to implement the model incrementally, and I've already got a preliminary model using feedforward connectivity. It learns by associating the immediate past with the present. Let's say you're learning the alphabet for the first time. You hear "a", then "b". The way the model works is, it stores "a" in a kind of short-term memory, and when "b" is presented, the system binds them together, the delayed "a" and the current "b". It chunks together "ab" and "c" in the same way, storing progressively larger chunks. And it does so using spiking neuron models and learning mechanisms that have been experimentally confirmed in animals and humans.
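The binding step above can be sketched without any spiking machinery. This is a deliberately simplified, non-spiking illustration of the same logic: hold the previous input in a short-term memory slot, bind it with the current input into a chunk, and repeat the pass over the resulting chunks to get progressively larger ones. The real model does this with spiking neurons; everything here is my own toy notation.

```python
def bind_level(stream):
    """One pass: bind each delayed item with the current item into a chunk."""
    memory = None  # short-term memory holding the immediate past
    out = []
    for item in stream:
        if memory is None:
            memory = item              # store the new item, wait for the next
        else:
            out.append(memory + item)  # bind delayed + current into one chunk
            memory = None
    if memory is not None:
        out.append(memory)             # an unpaired item carries up unchanged
    return out

bind_level(["a", "b", "c"])              # -> ["ab", "c"]
bind_level(bind_level(["a", "b", "c"]))  # second pass -> ["abc"]
```

Running the same pass over its own output is what turns "a" and "b" into "ab", and then "ab" and "c" into "abc", storing progressively larger chunks exactly as described.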
So that's it in a nutshell. There are many more details I'm leaving out, but that's the gist. Hopefully everything will go smoothly, and by tomorrow evening I'll only have one more hurdle to jump before getting my PhD.