Connecting Biological Detail with Neural Computation
In this article I am going to talk about our recent research efforts to connect low-level biological detail with high-level neural computation. In the best case, this talk will inspire others to apply the methods we present here to their own cognitive models. So, without further ado, let’s get started!
We know that cognitive processes are ultimately grounded in neurobiology, and, today, more than ever before, we have access to a vast amount of data on individual brain microcircuits. We know how cells tend to be connected, what their firing rate distributions are, and so on.
And yet, as cognitive scientists, we often implement models at the algorithmic and computational levels and tend to gloss over the biological details a little bit. I would argue there are at least two reasons why, in some cases, it may be a good idea to think about biology a bit more deeply.
First, taking neurobiology into account helps us to validate cognitive theories. If we are unable to map our algorithm onto brain circuitry, then maybe our model is incorrect. Second, when exploring brain circuitry, we may get ideas as to which algorithms could match that circuitry well; so we generate hypotheses about brain function. Now, the elephant in the room, of course, is that accounting for low-level biology while at the same time building high-level cognitive models is incredibly difficult!
One proposed solution to this problem is to use methods such as the Neural Engineering Framework (NEF) and the related Semantic Pointer Architecture. The idea behind the NEF is to describe cognitive theories in terms of dynamical systems together with a set of biological constraints. We then take all of this and “compile” it, using software such as Nengo, into a corresponding spiking neural network.
Our goal in this work is to extend the NEF to support additional biological constraints, particularly some that we often encountered in the neuroscience literature. The NEF was originally designed to take neuron counts, population-level connectivity, firing rates, neural tuning properties and synaptic filters into account. Here, we extend the NEF, as well as Nengo, to additionally support purely excitatory and inhibitory neurons, spatial connectivity constraints, and convergence and divergence numbers.
Of course, we wanted to test how these biological constraints affect the function that we are trying to implement. The system we chose for this test is a model of eyeblink conditioning in the cerebellum. Without going into much detail, in eyeblink conditioning, a subject learns to produce an eyeblink at a precise time after some stimulus, such as a tone, occurs. What makes this an interesting task is that it is still unclear how exactly the cerebellum manages to generate these precise timings.
One hypothesis, and this is the one we focus on here, is that the Granule-Golgi microcircuit transforms incoming signals into some kind of temporal basis space. The cerebellum then learns to decode arbitrary delays from this temporal basis space by recombining the granule cell activities in the granule-to-Purkinje projections. Unfortunately, we do not know how input signals could be translated into this “temporal basis space”.
When we started to look more deeply into the cerebellum about a year ago, we were immediately reminded of the Delay Network, developed by Aaron Voelker, a former PhD student in our lab. The Delay Network is a recurrent neural network that, well, translates an input signal into a temporal basis space. So, we decided to try mapping the Delay Network onto the recurrent Granule-Golgi circuit.
The idea behind the Delay Network is that we want to construct a dynamical system that “remembers” its input history over a time window of length theta. Perhaps surprisingly, the best way to build such a dynamical system is to implement a delay of length theta, because then the system must somehow remember in its internal state everything that happened between now and theta seconds ago.
Mathematically, we can write a delay like this: we want a function u hat that returns the state of our input u as it was at time t minus theta. To turn this into a dynamical system, we can look at the Laplace transform of a delay, which is an exponential, e to the power of minus theta s. So, with a little bit of math, and half a PhD later, we can optimally approximate this transfer function as a q-dimensional, numerically stable linear dynamical system that compresses the history of the input u into the state vector m.
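Written out, the quantities mentioned above look roughly as follows. This is only a notational sketch: the symbols \hat{U}, U, F, A and B are introduced here for illustration, and the entries of A and B (which depend on q and theta) are not spelled out.

```latex
\begin{align}
  \hat{u}(t) &= u(t - \theta)
    && \text{(ideal delay of length } \theta \text{)} \\
  F(s) &= \frac{\hat{U}(s)}{U(s)} = e^{-\theta s}
    && \text{(its transfer function)} \\
  \dot{m}(t) &= A\, m(t) + B\, u(t), \quad m(t) \in \mathbb{R}^{q}
    && \text{(} q \text{-dimensional approximation)}
\end{align}
```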
Just as an example, here is a rectangle pulse as an input u, and these are the dynamics we get for the internal state vector m. Now, remember from earlier that the NEF was designed to translate dynamical systems into spiking neural networks. So, we can really just use the NEF to translate the above equations into a recurrent neural network, and what we get is a spiking neural network that represents the input history over a time window theta in its momentary neural activities.
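To make the rectangle-pulse example a bit more concrete, here is a minimal, non-spiking sketch of such a system in Python. It assumes the Legendre form of the delay network state-space matrices as published in Voelker's later work; the talk itself does not spell the matrices out, so treat this choice, the function names and the parameter values (q = 6, theta = 1 s, forward-Euler integration) as assumptions made for illustration, not as the implementation used in the experiments.

```python
import numpy as np
from numpy.polynomial import legendre


def delay_network(q, theta):
    """Return (A, B) so that dm/dt = A m + B u approximates a delay of theta.

    Assumption: Legendre form of the delay network matrices.
    """
    i = np.arange(q)[:, None]  # row index
    j = np.arange(q)[None, :]  # column index
    A = np.where(i < j, -1.0, (-1.0) ** (i - j + 1)) * (2 * i + 1) / theta
    B = (-1.0) ** np.arange(q) * (2 * np.arange(q) + 1) / theta
    return A, B


def delay_decoder(q, r):
    """Readout weights for u(t - r * theta), with 0 <= r <= 1.

    Uses shifted Legendre polynomials, i.e. P_k(2 r - 1).
    """
    x = 2.0 * r - 1.0
    return np.array([legendre.legval(x, np.eye(q)[k]) for k in range(q)])


def simulate(A, B, u, dt):
    """Forward-Euler integration of the state m for an input sequence u."""
    m = np.zeros((len(u), len(B)))
    for n in range(1, len(u)):
        m[n] = m[n - 1] + dt * (A @ m[n - 1] + B * u[n - 1])
    return m


# Hypothetical parameters chosen for illustration only.
q, theta, dt = 6, 1.0, 1e-3
A, B = delay_network(q, theta)

t = np.arange(0.0, 3.0, dt)
u = ((t >= 0.5) & (t < 1.0)).astype(float)  # rectangle pulse input

m = simulate(A, B, u, dt)               # state trajectories, shape (len(t), q)
u_delayed = m @ delay_decoder(q, 1.0)   # smoothed approximation of u(t - theta)
```

Plotting the columns of `m` against `t` shows the state dynamics for the pulse, and `u_delayed` is a smoothed, delayed copy of the input. In the spiking version, the NEF replaces the explicit integration with recurrently connected neurons.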
Crucially, this means that we can decode any function over time from the neural activities, including, but not limited to, delays.
Coming back to our extensions of the NEF, the first thing that we account for is that granule cells are purely excitatory. Put differently, they can only evoke positive post-synaptic currents. Conversely, Golgi cells are purely inhibitory and can only evoke negative currents. This is an example of Dale’s principle: in general, individual neurons are either purely excitatory or purely inhibitory. To incorporate this into the NEF, we ended up with this non-negative least squares optimization problem.
Essentially, we try to find non-negative weights that scale the positive currents of excitatory pre-neurons and the negative currents of inhibitory pre-neurons, such that some desired target current is reached. The second extension we implemented is support for convergence numbers, as well as spatial connectivity constraints. The convergence number of a neuron specifies the number of incoming synaptic connections.
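The non-negative least-squares idea behind the Dale's principle extension can be sketched with SciPy's `scipy.optimize.nnls`. This is a toy illustration, not the nengo-bio solver: the rectified-linear tuning curves, the population sizes and the target current are all made up for the example.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
n_samples, n_exc, n_inh = 200, 30, 10

# Hypothetical represented value x and rectified-linear pre-neuron activities.
x = np.linspace(-1.0, 1.0, n_samples)


def tuning_curves(n_neurons):
    """Toy non-negative activities for n_neurons as a function of x."""
    encoders = rng.choice([-1.0, 1.0], size=n_neurons)
    gains = rng.uniform(0.5, 2.0, size=n_neurons)
    biases = rng.uniform(0.1, 1.0, size=n_neurons)
    return np.maximum(0.0, gains * encoders * x[:, None] + biases)


A_exc = tuning_curves(n_exc)  # excitatory pre-neurons, shape (200, 30)
A_inh = tuning_curves(n_inh)  # inhibitory pre-neurons, shape (200, 10)

# Hypothetical target current for one post-neuron (takes both signs).
J_target = 2.0 * x

# Excitatory neurons can only add current, inhibitory ones only remove it:
# negate the inhibitory columns and constrain all weights to be >= 0.
A_signed = np.hstack([A_exc, -A_inh])
w, residual = nnls(A_signed, J_target, maxiter=1000)

w_exc, w_inh = w[:n_exc], w[n_exc:]
J_decoded = A_exc @ w_exc - A_inh @ w_inh
```

Because the sign of each pre-neuron is fixed in `A_signed`, every weight the solver returns automatically respects Dale's principle. Weights that the solver sets to exactly zero correspond to synapses that are effectively absent.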
Normally, in the NEF, we assume all-to-all connectivity. But for granule cells, the number of input synapses is actually really small! Each granule cell receives input from just two to five pre-neurons. Furthermore, granule cells only connect to Golgi cells that are close by. We implemented these constraints by randomly selecting the set of possible pre-neurons for each post-neuron.
The probability of selecting a pre-neuron depends on how close the neurons are in 2D space. So we systematically assign a location to each neuron in our model, and compute this probability matrix, from which we sample a limited number of pre-neurons for each post-neuron. When we were working on this model, we wanted to systematically explore how individual mechanistic constraints affect the function we are implementing, namely being able to decode delays from the neural activities of the granule cells. So we started with a direct implementation of the mathematical model.
This model receives some spiking input, filters it, and then uses a perfect integrator to solve the delay network dynamics. We then used standard NEF methods to replace the perfect integrator with synaptic filters and recurrently connected spiking neurons. The next step was to split the recurrent neuron population into a set of “granule” cells and a set of “Golgi” cells. Note that Golgi cells are less numerous than granule cells, but the ratio of one to ten we use here is off by one order of magnitude, so that we can keep the network small and simple for now.
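The distance-dependent sampling of pre-neurons described above can be sketched as follows. The Gaussian fall-off with distance, the width `sigma`, the unit-square layout and the population sizes are assumptions made for this example; the talk only states that the selection probability depends on proximity in 2D space.

```python
import numpy as np


def sample_connectivity(pos_pre, pos_post, n_conv, sigma, rng):
    """Return a boolean mask[post, pre] with n_conv pre-neurons per post-neuron.

    Pre-neurons are drawn without replacement, with a probability that
    falls off with distance (assumed Gaussian profile of width sigma).
    """
    # Pairwise squared distances between post- and pre-neuron locations.
    diff = pos_post[:, None, :] - pos_pre[None, :, :]
    dist_sq = np.sum(diff ** 2, axis=-1)

    # Probability matrix: each row is normalised to sum to one.
    p = np.exp(-dist_sq / sigma ** 2)
    p /= p.sum(axis=1, keepdims=True)

    mask = np.zeros(p.shape, dtype=bool)
    for i in range(p.shape[0]):
        chosen = rng.choice(p.shape[1], size=n_conv, replace=False, p=p[i])
        mask[i, chosen] = True
    return mask


rng = np.random.default_rng(42)
n_golgi, n_granule = 20, 200  # hypothetical population sizes

# Assign each neuron a random location in the unit square.
pos_golgi = rng.uniform(0.0, 1.0, size=(n_golgi, 2))
pos_granule = rng.uniform(0.0, 1.0, size=(n_granule, 2))

# Each granule cell may receive input from at most five nearby Golgi cells.
mask = sample_connectivity(pos_golgi, pos_granule, n_conv=5, sigma=0.25, rng=rng)
```

A weight solver would then only be allowed to assign non-zero weights where `mask` is true, so the sampled number acts as an upper bound on the convergence of each post-neuron.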
In the next step, we account for Dale’s principle. And in this last step we account for spatial constraints, convergence numbers, and we bring the Golgi to granule ratio to one to one hundred. To evaluate how well the system works, we feed some test inputs into the system, and record the neural activities in the granular layer.
We then use least-squares to decode various delays from the neural activities. One of our test inputs is a series of rectangle pulses, since this is what researchers often use in experimental settings. We later systematically vary the pulse width across experiments to get some variety in the inputs. As you can see, when doing this kind of experiment for the most detailed network we mentioned before, we can, for example, decode these delays.
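The least-squares decoding step can be sketched in a few lines. The "activities" below are only a stand-in (a bank of low-pass filters with random time constants), not the recorded granule cell activities, and the regularisation constant, pulse train and delay are hypothetical values chosen for the example.

```python
import numpy as np


def decode_delay(activities, u, delay_steps, reg=1e-2):
    """Regularised least-squares decoders for u delayed by delay_steps samples."""
    # Target signal: the input shifted by the desired delay, zero-padded.
    target = np.zeros_like(u)
    target[delay_steps:] = u[:len(u) - delay_steps]

    # Solve (A^T A + lambda I) d = A^T target for the decoders d.
    n_neurons = activities.shape[1]
    gram = activities.T @ activities + reg * len(u) * np.eye(n_neurons)
    decoders = np.linalg.solve(gram, activities.T @ target)
    return decoders, target


rng = np.random.default_rng(1)
dt = 1e-3
t = np.arange(0.0, 5.0, dt)
u = ((t % 1.0) < 0.2).astype(float)  # train of rectangle pulses

# Stand-in activities: first-order low-pass filters of u (NOT the model).
taus = rng.uniform(0.02, 0.5, size=50)
activities = np.zeros((len(t), len(taus)))
for n in range(1, len(t)):
    activities[n] = activities[n - 1] + dt / taus * (u[n - 1] - activities[n - 1])

# Decode the input as it was 100 ms ago and measure the error.
decoders, target = decode_delay(activities, u, delay_steps=100)
decoded = activities @ decoders
rmse = np.sqrt(np.mean((decoded - target) ** 2))
```

Repeating the last three lines for several values of `delay_steps` gives one error value per delay, which is the kind of number the comparison across networks is based on.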
This works quite well, though the rectangle pulses are smoothed out, because the delay network compresses time into a low-dimensional state vector. The other type of input is band-limited white noise. And again, you can see that we are able to decode different delays from the neural activities, though the errors are a little higher, especially for larger delays.
Here we have exactly the same experiment as the one I have just shown you, but systematically evaluated for all five networks, different delays, and different pulse widths or bandwidths. As we would expect, the direct implementation has the smallest overall error. As we keep adding biological detail, like spiking neurons, separate Golgi and granule populations, and Dale’s principle, the error increases, but qualitatively speaking, we are still able to decode delays from the neural activities, except for the most extreme parameter combinations.
When increasing the neuron count, and accounting for the Golgi to granule ratio and spatial connectivity, the error actually goes down slightly, especially for some of the pulse experiments here. One interesting experiment that we can do with models like these is to vary some basic biological parameters, which is something researchers can’t easily do in biology!
For example, we can ask what the impact of the synaptic time constant is in the granule to Golgi connections. Changing this number over multiple experiments suggests that the error is minimal for time constants of about 50 to 60 ms, close to the 70 ms observed in nature. Another interesting example is the granule cell convergence number. Remember that each granule cell only has up to five dendritic synapses. Here we change the maximum number of pre-neurons the optimizer can choose from.
When we do this, the corresponding error remains relatively constant, even for small convergence numbers. We can do the same thing for the Golgi to granule convergence number and here the picture is a little different, and the error actually does go down more significantly. Still, keep in mind that these convergence numbers are upper bounds. The optimizer may still choose to set some of the connection weights to zero.
This diagram shows the actually measured convergences in our final network for different desired convergence numbers. Even for large desired convergence numbers, the actual distribution of convergence numbers stays centred at about one to three. Experiments like these might hint at why some parameters are as observed in nature; for example, the time constants being in this particular range, or the convergence numbers being as small as they are. So, this is a kind of hypothesis generation.
These experiments serve as a kind of model validation as well. If our model works best for parameters we do not find in nature, then our model may be wrong or incomplete. So, to summarize: we extended the NEF to include more low-level constraints, such as Dale’s principle and spatial connectivity constraints. We then mapped the Delay Network, which is a recurrent neural network capable of compressing a window of time into neural activities, onto the Granule-Golgi microcircuit.
We demonstrated that adding more biological detail impacts the performance of the system in interesting ways. And, as we just saw, we can use our detailed model to perform parameter sweeps, which helps us to validate and generate hypotheses about brain function. And to conclude, these methods can in principle be applied to cognitive models as well, in particular in conjunction with the Semantic Pointer Architecture that I did not talk about here.
We really hope that cognitive modellers may find our methods helpful and may consider using them in their own models. You can try out the software library nengo-bio that we developed to conduct these experiments! You can find the library at this URL. And with that, I’d like to thank you for your attention and the organizers and reviewers for giving me the opportunity to give this talk at ICCM.