Inhibition

I implemented more sophisticated mechanisms for inhibition… Now an inhibitory neuron is identical in behavior to a regular neuron.

So I implemented 2 mechanisms: 1) the inhibition acts at the neuron body and 2) the inhibition acts directly on synapses. Neither of them makes any sense. When inhibiting the body (1), synapses are not protected from LTD/LTP effects => the net effect is a decrease in the firing rate, and there is no permanent inhibition for any pattern. When inhibition hits, a couple of activation cycles are skipped, so you get patches of decreased firing frequency alternating with no firing. When inhibition is at synapse level (2), I end up only with skipped activation cycles; the firing frequency remains the same since synapses are protected from LTP/LTD effects.

In both cases I see no use for that delay in activation, because it is not a single pattern that is delayed, all of them are. The separation between patterns is not great (a couple of cycles apart); the problem also comes from the inhibitory neurons firing with very low frequency. Their activation depends on the excitatory neurons, which themselves have decreased activation as you go up in layer number.

I sort of defined my “objective” function for a synapse:

I also defined LTD and LTP. I initially believed they were just the fitting event for the objective function, but it seems they have multiple roles.

I made a lot of progress but still no “significant” progress..

Long term depression / potentiation

I use these terms very loosely to mean the increase (LTP) or decrease (LTD) of the potential delivered by a synapse to the neuron body. I used some forms of LTP/LTD in my previous versions but they were never meant to approach the biological equivalents. Now I spent some time reading what is known in biology about LTP and LTD, and while there are tons of papers on the subject, I could not find anything that explores the need for such mechanisms. They are used in “learning”, which is a very vague statement. I’ve been thinking and I cannot find a use case for them. I found a use for an LTD event during depolarization of the postsynaptic neuron, but no uses for LTD/LTP associated with low/high frequency inputs. What is low/high frequency? They seem arbitrary to me. I can use whatever frequency in my code, I should be able to link these terms with something… but to what?

However, I believe the main reason for having an altered synaptic potential is to change the firing frequency of the postsynaptic neuron… This conclusion troubles me, firing rate is crucial in selecting / separating events (inputs), any alteration (or missing alteration) can be picked up by the inhibitory neuron and amplified, resulting in vastly different results even when the initial change in synaptic output was extremely small. So if I don’t add them now and don’t understand them, they may come back to haunt me…. yet, I don’t need them..

My plan is to explore the implications of LTP and LTD on various other variables, but with no specific goal in mind I find that both boring and difficult. Is there a paper showing they are actually “long” term changes? I found a paper saying that very few last up to a week and most alterations vanish within hours. I don’t consider that long term…

Does LTP/LTD ever stop? With age perhaps? For certain layers maybe? Do they become less frequent? So many questions… Given my difficulties in transporting signal through layers, it is still feasible that some LTP/LTD events would be extremely difficult to change in deeper layers, so in the end they could be viewed as long term and part of the learning mechanism. So they are long term because they are hard to change… speculations…

Inter-layer transmission

If a neuron in layer 1 (L1) requires 10 inputs to fire, and those 10 inputs are delivered in 10 cycles, another neuron in layer 2 (L2), also requiring 10 inputs for activation, is activated in 100 cycles by the neuron in L1… In the third layer, the number of cycles required for activation is 1000… So it cannot work like this.
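The arithmetic above can be checked with a few lines of Python. This is only a sketch under the assumptions stated in the paragraph: every neuron needs the same number of inputs, L1 receives one input per cycle, and each deeper neuron gets one input every time its presynaptic neuron fires. The function name `cycles_to_fire` and its parameters are hypothetical, chosen for this illustration.

```python
def cycles_to_fire(layer: int, inputs_needed: int = 10, cycles_per_input: int = 1) -> int:
    """Cycles before a neuron in `layer` (1-based) fires for the first time.

    Assumes every neuron needs `inputs_needed` inputs and that a neuron in
    layer n receives one input each time its presynaptic neuron in layer
    n-1 fires.
    """
    if layer < 1:
        raise ValueError("layer must be >= 1")
    return cycles_per_input * inputs_needed ** layer


for layer in (1, 2, 3):
    print(f"L{layer}: {cycles_to_fire(layer)} cycles")
```

The delay grows geometrically with depth, which is why a few extra layers are enough to make the transmission impractically slow.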

I was aware of this issue from the beginning, but I had hoped I could solve it by increasing synaptic efficiency, so basically a neuron from L2 would require not 10 inputs from L1 but, say, just 1… That would have been acceptable… The problem with this approach became apparent very late: by increasing synapse efficiency, the selectivity of the post-synaptic neuron decreases. So the solution I envisioned proved to be a dead end. Now I’m considering other approaches to deal with this slow transmission from layer to layer.

  1. Have multiple synapses between a neuron from L1 and a neuron from L2. This does not look very promising for various reasons, but maybe in combination with other ideas it could work… not necessarily 10 synapses, even 2 synapses would significantly reduce the delay.
  2. Have many more neurons in L2 than in L1. Those extra neurons would serve as some sort of amplifier: they would bind among themselves and excite each other in a bizarre loop. I have played with such loops in the past but they resulted in continuous excitation. Maybe they could be used to store more patterns too… I was planning to add more neurons in L2 anyway, so I’m more inclined to start with this approach.
  3. Accept a serious reduction of signal in L2… Basically 10 neurons from L1 could link to a single neuron in L2, and that neuron would fire immediately after the 10 neurons from L1 fired, because it receives 10 inputs. This could be part of the solution, but I don’t see it as acceptable (this is what happens right now by default, when there are multiple bindings from L1 to L2).
  4. Something else that is unknown now…
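The effect of option 1 can be estimated with simple counting. The sketch below assumes that every presynaptic firing delivers one input per synapse on the link and that every neuron needs 10 inputs; the function name and the parameter `synapses_per_link` are hypothetical, and the real kinetics are certainly not this clean.

```python
import math


def cycles_with_multiple_synapses(layer: int, synapses_per_link: int,
                                  inputs_needed: int = 10) -> int:
    """Cycles until a neuron in `layer` (1-based) first fires.

    L1 receives one input per cycle. In deeper layers each presynaptic
    firing delivers `synapses_per_link` inputs at once.
    """
    cycles = inputs_needed  # layer 1: one input per cycle
    firings_needed = math.ceil(inputs_needed / synapses_per_link)
    for _ in range(layer - 1):
        cycles *= firings_needed
    return cycles


for s in (1, 2, 10):
    print(f"{s} synapse(s) per link: L2 fires after "
          f"{cycles_with_multiple_synapses(2, s)} cycles")
```

Under these assumptions, going from 1 to 2 synapses per link halves the delay in L2 (100 cycles down to 50), while 10 synapses per link would remove the extra delay entirely.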

I’m also not happy with the inhibitory neurons… By acting fast (requiring just 1 input to go active) and being 100% efficient, they remove some of the learning rules I have envisioned. They are not my immediate focus but they are bothering me.

The new synapse kinetics work extremely well, beyond my expectations.

Adding time #2

It seems that by adding kinetics to synapses I also added time to the algorithm. But time has always been an elusive variable. Time is the rate of change for some events. So time is not really correlated with the outside arbitrary unit of time and will depend on the computer power. It is very possible to correlate this internal time with the outside time, but for now that would serve no purpose. However, time is now embedded in multiple processes. What can be learned is now indirectly linked to time. The time component will determine what is correlated and what is “important”. Time also seems to determine how many patterns a synapse can learn without internal changes. Slower kinetics would allow more patterns to be learned. In a way, that would increase precision or selectivity. Increased precision requires more processing cycles.

On the update side.

I implemented the new kinetics at synapse level but I need some sort of kinetics at neuronal level. Without it I cannot decide when a neuron was active. Inhibitory neurons still work on the old simpler mechanism so inhibition is instantaneous and inhibits 100%. I may have to change that in the future.

Dendritic Growth 3

I have programmed in the first step, linking synapses on multiple directional dendrites, but no branching yet. However, when running this model I discovered that my kinetics for synapse potential don’t work well. They were good enough for the previous model, but in essence they were a kluge job that worked for the wrong reasons. Basically I’m not converging well to the firing potential of the neuron: I’m overshooting, and the correction, which is not good either, messes up the timing of the firing event. The result is devious and cascades into the following layers, eventually resulting in a wrong learning pattern. I added ordinary differential equations, but it did not help. The problem comes from dP/dC, where P is the potential and C is the cycle number… The cycle number is an integer and I can’t do anything about it => the convergence is still poor => W (AMPA receptors) fluctuates from pattern to pattern => a delay in forming a stable pattern in the next layer => that pattern is completely inhibited if it competes with another pattern.
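To illustrate how an integer cycle step can produce overshoot, here is a minimal sketch. As an illustration only, it assumes the potential relaxes toward a target with first-order kinetics, dP/dC = k (target - P). That equation, the rate constant `k` and all function names are assumptions made for this sketch, not the actual set of equations in the model. It compares an explicit Euler update with a step of one cycle against the closed-form solution over the same cycle.

```python
import math


def euler_step(p: float, target: float, k: float) -> float:
    """One explicit Euler update with a step of exactly one cycle."""
    return p + k * (target - p)


def exact_step(p: float, target: float, k: float) -> float:
    """Closed-form solution of dP/dC = k * (target - P) over one cycle."""
    return target + (p - target) * math.exp(-k)


def simulate(step, k: float, cycles: int, p0: float = 0.0, target: float = 1.0):
    """Return the potential after each cycle, starting from p0."""
    trace, p = [], p0
    for _ in range(cycles):
        p = step(p, target, k)
        trace.append(p)
    return trace


k = 1.5  # hypothetical rate constant, large relative to the one-cycle step
euler = simulate(euler_step, k, cycles=5)
exact = simulate(exact_step, k, cycles=5)
print("euler:", [round(p, 3) for p in euler])
print("exact:", [round(p, 3) for p in exact])
```

With k = 1.5 the Euler trace jumps above the target and then oscillates around it, while the closed-form update approaches the target from below without ever crossing it. Whether a similar closed form exists for the coupled equations of the real model is an open question.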

Synaptic potential: left, my data; right, data from this paper

I don’t want to reproduce real biological data in my simulation, but when I get stuck I look for inspiration in real data :). For now I’m only interested in the upward trend but I still wonder why the downward side looks so …. not symmetric .. Why does it take so long to go back to the initial state ? How long does it take though ? Can a neighboring neuron fire twice in the amount of time it takes for this synapse to regenerate ? Can this neuron fire again while this particular synapse is regenerating ?

I extracted many equations from the code hoping to solve them mathematically… but no luck there either, I don’t know how to solve so many linked simple equations.. but maybe someone does ..

Dendritic Growth 2

As usual, this is much more complicated than I thought. I can’t really decide when to branch, how much to branch and how long to grow a dendrite… Look at the picture below: I want to connect A with C, but there is no dendrite growing directly in that direction, so it has to branch to reach C. But branching can be done from various points, as shown in RED. Random branching, from whatever point on the GREEN lines (default dendrites), is out of the question. The branching has to be done from precise points, where the dendrite connected with a different neuron, so I’m left with 4 branching points… Should I link C with 4 synapses from A’s dendrites? Just one? What if that breaks?

green – default dendrite, red – possible dendritic branching

I don’t have enough information to decide on a course of action, so I’m left with trial and error, like I did for the synapse kinetics (about 30 models that did not work). So far I didn’t do much coding-wise: I only changed the code to accept directionality for dendrites and decided to go with 8 default directions, basically 8 dendrites that form only when they are needed, but unless the neuron is at the edge of the matrix, all 8 are needed. I also decided to remove vacancies (locations previously occupied by a synapse) from the places available for new synaptic binding. Such a place will remain empty, presumably separating two patterns…
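The 8 default directions can be sketched as offsets on the neuron matrix. In this sketch a direction counts as "needed" when there is a matrix position next to the neuron in that direction, which is an assumption made for illustration; the names `DIRECTIONS` and `needed_dendrites` are hypothetical as well.

```python
# The 8 default dendrite directions as (row, column) offsets on the matrix.
DIRECTIONS = {
    "N": (-1, 0), "NE": (-1, 1), "E": (0, 1), "SE": (1, 1),
    "S": (1, 0), "SW": (1, -1), "W": (0, -1), "NW": (-1, -1),
}


def needed_dendrites(row: int, col: int, rows: int, cols: int) -> list[str]:
    """Directions in which a neuron at (row, col) has a neighbor inside the matrix."""
    return [
        name
        for name, (dr, dc) in DIRECTIONS.items()
        if 0 <= row + dr < rows and 0 <= col + dc < cols
    ]


print(needed_dendrites(0, 0, 5, 5))  # corner neuron
print(needed_dendrites(0, 2, 5, 5))  # edge neuron
print(needed_dendrites(2, 2, 5, 5))  # interior neuron
```

On a 5 by 5 matrix, a corner neuron ends up with 3 directions, an edge neuron with 5, and every interior neuron with all 8.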

Dendrite Growth

So far I worked with a simplified model with a single dendrite. But moving the signal to layer two, with more inter-neuronal connections allowed, led to obvious errors. Why is this important? In my model distance matters: synapses further away from the neuronal body contribute less to the overall neuronal potential. The dendrites, with their growth, provide that variable distance. I could not find a model in the literature that fits my requirements, so I came up with two very different models. Eventually I decided to start with the one that seems easier because it does not require any calculations (as in geometrical calculations). This model should lead to structures close enough to what is observed in the literature, but regardless of whether it’s close or not, it should clearly link synapses based on distance. It will allow for branching, which is very important since branching allows for synapses at the same distance from the neuronal body.

However, there are still many unanswered questions. What happens when a synapse is removed from a dendrite? Does it bind to a position further away? Is it removed forever? Does it bind to a different dendrite? Should I allow multiple synapses between 2 neurons? What happens with the vacancy left by the removed synapse? Does it remain empty? Is it occupied by other synapses? Does a synapse from further away take its place? What should I do about far away synapses (away from the neuron)? Their contribution to the overall potential is insignificant even with a linear decrease in contribution with distance, and I now have an exponential decrease, so it’s even worse. Sure, in some cases the synaptic strength (AMPA receptors equivalent) increases and the contribution is a bit bigger, but it is still small. Why so many direct connections with such small contributions? The signal would still reach a target neuron through its neighbors, more like in a GNN… that would make more sense to me.
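To put numbers on how much worse the exponential decrease is for distant synapses, here is a small comparison. The dendrite length of 10 positions, the length constant of 2 and both function names are hypothetical values picked for this sketch, not the parameters used in the model.

```python
import math


def linear_contribution(distance: float, max_distance: float) -> float:
    """Contribution falls linearly from 1 at the body to 0 at max_distance."""
    return max(0.0, 1.0 - distance / max_distance)


def exponential_contribution(distance: float, length_constant: float) -> float:
    """Contribution falls as exp(-distance / length_constant)."""
    return math.exp(-distance / length_constant)


# Hypothetical values: a dendrite 10 positions long, length constant of 2.
for d in (0, 2, 5, 8):
    lin = linear_contribution(d, max_distance=10)
    exp = exponential_contribution(d, length_constant=2)
    print(f"distance {d}: linear {lin:.3f}, exponential {exp:.3f}")
```

With these values, a synapse at distance 8 contributes roughly a tenth of what it would under the linear rule, so its strength W would have to grow about tenfold just to match the linear case.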

As far as I can tell, I now have a good model for :

  1. Glutamatergic synapse (kinetics of glutamate and of AMPA)
  2. GABA synapses (half baked… it is acting on the axon, resulting in 100% inhibition, but it is also affecting active synapses… so it’s a half man half bear kind of a situation… maybe half pig as well)

What is similar ?

I’ve been getting many unexpected results in my quest for invariance. I thought it was because of a bug in the code or a bad theory. But no, everything seems to be in order, there are no errors that I’m aware of. So I was left with the improbable: I don’t get identical results for identical patterns, because “similar” is not what I assumed it to be. Something is similar (or identical) not only when it is formed from identical components, it also needs to have the same history. When I was thinking of “context” I was usually thinking only about the “stuff” around; I did not think that I need the whole history behind that event (history in context is: the sequence of patterns in time).

In the meantime I have some explanations for the lack of synchronization I now encounter on a regular basis.

  1. I don’t have horizontal cells or amacrine cells. In my code this results in an out-of-phase state that cannot be corrected (my input cells don’t fire every frame but at a certain frequency, currently set to every other frame). I have a button that brings them all to frame 1 when this happens, but this is just a cheap, easy fix.
  2. The input cells are not out of phase, but patterns within the same visual field fire at different frequencies. Not sure what to do about this one; it could be normal and be somewhat “fixed” within the next layer. I was thinking of linking inhibitory neurons among themselves, so if one is activated it will inhibit the inhibitory neurons around it (meaning it will inhibit the inhibition they were providing).

Is learning guaranteed ?

I’ve been spending a lot of time trying to reach a state in which I’d get partial invariance… I observed that there are at least 2 pathways that would lead to different results.

  1. The frequency and sequence of patterns – only certain sequences would lead to the desired result. There is no way to guarantee a certain result. The only way to guarantee a result is to have a precise training set, which is not what I want, but so far I could not think of a way to correct for an imbalanced training set.
  2. Timing. Learning depends on time, meaning at time t1 we can have result R1 and at time t2 (where t2>>t1) we have a result R2. There is a time limit after which there is no change, but again, there is no guarantee that I get a certain result and no way of assessing when something learned will no longer change with time.

Both to me seem reasonable but very annoying… I find it very hard to set up an objective function with so many unknowns..

Invariance to nowhere

I’ve already done many simulations with the new model. There’s no invariance in sight. I also have no theories that would predict this elusive invariance, so much so that I’m not so sure anymore that it is obtained at the neuronal level. So I’ve prepared plan B: even without invariance, learn as many patterns as possible, very much like the regular ML used these days. Well, that also failed. When active, neurons send a call to other neurons looking for binding partners. But how do the neurons become active in the first place? To solve this problem I linked, from the beginning, neuron ij from L1 to the same ij neuron from L2. That is proving to be a limiting factor now, since in L2 I need to be able to form more patterns than in L1. Even if L2 has more neurons, they never get activated and they never connect to anything. In my previous post I showed a 2 by 2 matrix learning a 2 pixel pattern, but the cross lines were missing as a pattern because I did not have enough neurons in L2 to learn those patterns.
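The limitation of the ij-to-ij linking can be shown with a short count. The sketch assumes both layers are square matrices and that the one-to-one link is the only initial connection; the 3 by 3 size for L2 and the function name are hypothetical.

```python
def unlinked_l2_neurons(l1_size: int, l2_size: int) -> list[tuple[int, int]]:
    """L2 positions with no inbound link when neuron (i, j) in L1 is linked
    only to neuron (i, j) in L2. Both layers are square matrices."""
    return [
        (i, j)
        for i in range(l2_size)
        for j in range(l2_size)
        if i >= l1_size or j >= l1_size
    ]


orphans = unlinked_l2_neurons(l1_size=2, l2_size=3)
print(len(orphans), "of", 3 * 3, "L2 neurons never receive input:", orphans)
```

With a 2 by 2 L1 and a 3 by 3 L2, 5 of the 9 neurons in L2 have no input at all, so they can never become active and never send a call for binding partners.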

Anyway, I need to make neurons activate “spontaneously”, at least until they make their first connection, and see if that solves my problem. Why not make fully connected layers? Theoretically that should work too, but it is a bit impractical because training takes a lot of time. Also, this random activation is not as simple as it seems, because it will very quickly result in a fully connected network.