Long-term vs. short-term memory

I’m still in the process of changing the code to allow a variable size for each layer. Very tedious; it seems all functions are affected by this change.

Anyway, I still have time to work on theoretical aspects. So I’ve been thinking: even if my algorithms are not there yet, they have already let me see some troubling results. It seems to me that changes made during the “learning” process are “too irreversible”. All changes I make within the neuron are event driven; a single process is an equilibrium process, which is to say that the forward rate is event driven but the reverse rate is time (cycle) driven. So while the changes are not irreversible, they can only be reverted by an event, and if that event is not present the change will remain. This leads to situations where the AI will see things that are not there. I observed this process in previous versions, where I mentioned that the current result depends “too much” on the previous patterns, but the context seemed different at the time (see my previous posts). While I did not solve the synchronization issue from my previous posts, I’m now running “empty” cycles to bring neurons to a neutral state. This way I can be sure that the learning process is the one causing the ghosting and not the lack of synchronization.
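
To make the “empty” cycles concrete, here is a minimal Python sketch of what I mean (the names and the relaxation rule are purely illustrative, not the actual code): cycles with no input during which each neuron’s potential relaxes toward a resting value, so any ghosting that survives them has to come from the learned, event-driven changes rather than from leftover activity.

```python
RESTING = 0.0   # illustrative neutral potential
DECAY = 0.2     # illustrative per-cycle relaxation rate

class Neuron:
    def __init__(self):
        self.potential = RESTING

    def cycle(self, input_signal=0.0):
        # normal cycle: accumulate input, then relax a bit toward resting
        self.potential += input_signal
        self.potential -= DECAY * (self.potential - RESTING)

def run_empty_cycles(neurons, n_cycles):
    """Cycles with no input: only the relaxation term acts, pulling every
    neuron back toward a neutral state before the next pattern is shown."""
    for _ in range(n_cycles):
        for n in neurons:
            n.cycle(0.0)
```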

So the question is now: when, where, and how to store long-term memory, and how to retrieve it in such a way that it will still make my AI see things that are not there 🙂 … as it does now… Where? In some other deep layers (hippocampus?). When? That is unknown. How? Similar to how they are stored now, but it depends on When. How to retrieve it? I need another variable in my synapse definition, so at some point down the line I’ll change that variable based on inputs from layers storing long-term memory… I think…

On a different topic, my learning mechanism is, for all intents and purposes, a fitting function… Here there is also a problem… much like the Pauli exclusion principle for electrons, it seems that synapses cannot occupy the same space… so some “energy” values should be excluded or limited… That I hope to clarify when the code is up and running and I can do some real simulations. But it looks… complicated… It may not seem so, but it is linked to the previous topic 🙂

Synaptic strengths and updates

I worked hard in the past month to implement the new theory, but I made little to no progress. Every time I think I understand something and solve some problems, I find that things are much more complicated than I previously believed them to be. New complexities that I did not think of… at all… So for the first time I’m actually pessimistic…

Synaptic strength, in my model, is defined by 3 main variables; the synapse is defined in total by 5 variables. At some point I realized that there is an interplay between learning new things and remembering old things, and the new theory should have solved that issue. The mathematical model is limited to 2 synapses, and I cannot actually predict what would really happen when the neuron is inserted into the network; but in theory it should solve the learning/remembering issue. I inserted the neuron into the network immediately, because in fact the code no longer supports testing a single neuron by itself. But once inserted into the network, the network is at the same time too simple to test the learning mechanism, and too complex when I add more than 4 neurons… So I need to get back to a simpler version where I can test only a single neuron, but with complex patterns.
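
A rough illustration of the bookkeeping involved (the field names below are hypothetical placeholders; the actual five variables belong to the model I’m still working out): three variables that together define the strength, plus two that describe the synapse itself.

```python
from dataclasses import dataclass

@dataclass
class Synapse:
    # Hypothetical stand-ins for the three variables that define strength.
    strength_a: float = 0.0
    strength_b: float = 0.0
    strength_c: float = 0.0
    # Hypothetical stand-ins for the remaining two defining variables.
    source_id: int = -1     # presynaptic neuron this synapse listens to
    dendrite_id: int = -1   # dendrite this synapse sits on

    def effective_strength(self) -> float:
        # Purely illustrative way of combining the three strength variables.
        return self.strength_a * self.strength_b + self.strength_c
```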

I concluded that while simple and complex cells may seem very similar in their biology, from a functional standpoint they must be very different. Simple cells precisely select certain patterns, while complex cells differentiate patterns through firing rates. I believe the simple cells are not that dependent on firing rates, but in the model I’m using this behavior cannot really be ruled out.

I may have found a way to selectively group signals based on their complexity, meaning complex signals will converge toward the center of the surface, while the simpler patterns will stay on the periphery. But, at this time, I only have two areas and I can’t be sure more areas will form, or how they should form. Also, an area (in the current simulation a 4 by 4 matrix) is somewhat hard to define in a general way; right now I’m defining it as the edge where the inhibitory neurons overlap, allowing for more signal but at the same time being more controlled (it activates less because it is more often inhibited).
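
A sketch of how such an area could be computed on the current 4 by 4 surface (representing each inhibitory neuron’s reach as a boolean mask is just one possible encoding, not how my code actually stores it): a cell belongs to the area if it is covered by more than one inhibitory neuron.

```python
import numpy as np

SIZE = 4  # the current simulation uses a 4 by 4 surface

def area_mask(inhibitory_fields):
    """inhibitory_fields: list of SIZE x SIZE boolean arrays, one per
    inhibitory neuron, True where that neuron can inhibit. The 'area' is
    the overlap: cells inhibited by more than one neuron, which receive
    more signal but are also more often inhibited."""
    coverage = np.zeros((SIZE, SIZE), dtype=int)
    for field in inhibitory_fields:
        coverage += field.astype(int)
    return coverage > 1

# Example: two inhibitory neurons covering the left and right portions,
# overlapping on the middle columns.
left = np.zeros((SIZE, SIZE), dtype=bool);  left[:, :3] = True
right = np.zeros((SIZE, SIZE), dtype=bool); right[:, 1:] = True
print(area_mask([left, right]))  # True only where the two fields overlap
```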

Inhibition works, somewhat… but it’s unpredictable because of the firing rates. Firing rates lead to a general unpredictability because I don’t know when to synchronize the neurons. Synchronization is possible, but I don’t know the right triggering event. Sometimes they should synchronize, other times they shouldn’t, because I don’t obtain the “correct” answer. The “correct” answer is also poorly defined.

I’ve also concluded that the network cannot be precisely corrected with specific feedback. Any feedback (a back-propagation equivalent, in a way) has to be nonspecific, meaning that it may lead to a desired outcome, but it will also lead to unpredictable changes secondary to the primary desired outcome, or, sometimes, the undesired outcome will be the primary one. This conclusion came from trying to implement a “calling function”, where the neuron sends a signal within an area, signaling that it is ready to accept new connections.
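
A toy sketch of why this kind of feedback ends up nonspecific (the class, the probability, and the “recently active” condition are all made up for illustration): the calling neuron broadcasts within its area, and any neuron in that area that happens to be in the right state may respond, so the desired connection is only one of several possible outcomes.

```python
import random

class Neuron:
    def __init__(self, name, recently_active=False):
        self.name = name
        self.recently_active = recently_active

def broadcast_call(caller, area_neurons, p_connect=0.3):
    """The caller signals within its area that it is ready to accept new
    connections. Any recently active neighbor may respond, not only the
    one the feedback was 'aimed' at -- that is the nonspecific part."""
    responders = []
    for n in area_neurons:
        if n is caller:
            continue
        if n.recently_active and random.random() < p_connect:
            responders.append(n)
    return responders
```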

In conclusion, I have very little good news. Learning seems to be working, but I can’t be very sure. Firing rates may be separating patterns, but I can’t be sure of that either. Signal grouping is limited to only 2 areas, where I believe there should be many areas…

What’s next? Next I’m gonna go backwards… I need to create a branch where I can test the learning mechanism with a single neuron, but with complex input. I’ll see from there…

New use case for AI: Colon Polyp Detector for medical images as a free Android app

While I was busy with this new AI concept, our team worked on more down-to-earth apps using current state-of-the-art AI algorithms.

xLandC, in collaboration with the RNASA-IMEDIR group at the University of A Coruña (Spain), implemented a new proof-of-concept mobile tool to detect colon polyps in colonoscopy images: the Polyp detector app, available on the Google Play store at https://play.google.com/store/apps/details?id=com.xlandc.polypdetect. A deep learning classifier trained on a free dataset has been implemented as a TensorFlow Lite model in a free Android app built with Flutter (all tools from Google).

For now, the model detects only colon polyps in medical images, and we will improve it with future updates. It’s free of ads, and no user data is stored, tracked, or used in any other way by xLandC. All predictions are evaluated locally on the user’s device, and the AI model is also stored locally. There is no central server. With each update we can replace the model with a better one and improve the functionality of the application.

Just browse for a picture from your colonoscopy and run a prediction. Alternatively, the camera can be used to take a picture.

More failures, but a new learning theory

While I was focusing on fixing the synchronization issue, I lost sight of another serious issue. Once I introduced the semantics of dendrites, I lost the learning mechanism… Not only that, but I also lost the inhibition mechanism… Inhibition could have been fixed somehow, but I realized that without the current mechanism of inhibition it is not possible to synchronize neuronal activity. Maybe I need 2 inhibitions to get back to the previous state.

Anyway, once synchronization was somewhat fixed, I realized that there was no learning anymore. Of course, I had not thought through the changes I introduced with the semantics of dendrites… I read more neuroscience articles, watched some online lectures… got disappointed by the lack of clarity, but eventually I came up with a new theory for neuronal learning, inspired by what I learned. I ran multiple simulation scenarios in Excel sheets and it seems to work.

The new theory is unfortunately much more complicated, meaning many things could go wrong, but it has some clear advantages and is much more in line with what’s known (or assumed) in biology (a rough sketch follows the list below):

  • learning now integrates firing rates (which I despise, because they make understanding more difficult)
  • multiple synapses on multiple dendrites can now activate together to generate an activation potential.
  • there is a new mechanism for “dendritic growth”, which is to say that I now have a rule, based on activity, for when a dendrite can accept connections. The model does not tell me when to seek new connections though..
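
Here is the promised rough sketch of the last two points (the thresholds and the activity-based growth rule are placeholders, not the actual values from the model): synapses spread over several dendrites now sum toward one activation potential, and a dendrite is willing to accept new connections only once its recent activity passes a threshold.

```python
ACTIVATION_THRESHOLD = 1.0   # placeholder
GROWTH_THRESHOLD = 5         # placeholder: recent activations needed

class Dendrite:
    def __init__(self, synapse_weights):
        self.synapse_weights = synapse_weights   # strength per synapse
        self.recent_activity = 0

    def potential(self, active_synapses):
        # contribution of this dendrite for the current cycle
        return sum(self.synapse_weights[i] for i in active_synapses)

    def can_accept_connections(self):
        # activity-based rule for when the dendrite may "grow"; the model
        # still does not say when to go looking for new connections
        return self.recent_activity >= GROWTH_THRESHOLD

class Neuron:
    def __init__(self, dendrites):
        self.dendrites = dendrites

    def fires(self, active_by_dendrite):
        # synapses on multiple dendrites now combine into one activation
        total = sum(d.potential(act)
                    for d, act in zip(self.dendrites, active_by_dendrite))
        return total >= ACTIVATION_THRESHOLD
```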

The drawback? Firing rates… I still use the concept as defined in my previous post, so I’m using cycles rather than time to calculate a firing rate. I’m still hoping that I won’t have to use actual time for firing rates. Also, there are still many unknowns; LTD is not so clear anymore, so I may still end up with yet another failure. In terms of coding, it should not be difficult to implement, but it will take some time to understand whether something is “right” or “wrong”.

Firing Rates and Neuronal Synchronization

I’m defining firing rate as:

the inverse of the number of cycles needed by a single presynaptic neuron to activate a postsynaptic neuron.

Assume a single presynaptic neuron could activate a single postsynaptic neuron. If the presynaptic neuron has to activate 3 times before it can activate the postsynaptic neuron, then the firing rate is 1/3. The firing rate is then a function of how much potential a single synapse can bring per activation. So synapses close (proximal) to the neuron body would generate higher firing rates than distal synapses. The firing rate is therefore not entirely dependent on the firing rate of the presynaptic neuron.
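
In code, the example above looks roughly like this (the potential-per-activation values are made up): with cycles rather than wall-clock time, the firing rate is simply one over the number of presynaptic activations needed to reach the postsynaptic threshold.

```python
import math

def firing_rate(potential_per_activation, activation_threshold=1.0):
    """Firing rate as defined above: 1 / (number of presynaptic activations
    needed to bring the postsynaptic neuron to threshold)."""
    cycles_needed = math.ceil(activation_threshold / potential_per_activation)
    return 1.0 / cycles_needed

# A proximal synapse brings more potential per activation than a distal one,
# so it yields a higher firing rate (values are illustrative).
print(firing_rate(0.34))  # 3 activations needed -> firing rate 1/3
print(firing_rate(0.60))  # 2 activations needed -> firing rate 1/2
```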

Another definition is for the “semantics of dendrites”.

The semantics describes the relationship between two dendrites. If we define a dendrite as a set of synapses (each with a value of ON or OFF), then the semantics between dendrite A and dendrite B is the percentage of synapses from dendrite B that are ON while dendrite A is generating an activation potential. I’m still working on this definition.
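
As a working computation, this is roughly what the definition amounts to (representing a dendrite as a set of synapse ids, with the ON ones in a separate set, is just one possible encoding):

```python
def semantics(dendrite_b_synapses, dendrite_b_on, dendrite_a_activated):
    """Semantics between dendrite A and dendrite B: the percentage of B's
    synapses that are ON at the moment A generates an activation potential.
    Returns None when A did not activate (the measure is undefined then)."""
    if not dendrite_a_activated:
        return None
    if not dendrite_b_synapses:
        return 0.0
    on_count = sum(1 for s in dendrite_b_synapses if s in dendrite_b_on)
    return 100.0 * on_count / len(dendrite_b_synapses)

# Example: B has 4 synapses, 3 of them ON while A activates -> 75%
print(semantics({"s1", "s2", "s3", "s4"}, {"s1", "s2", "s3"}, True))
```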

I hit a dead end with the current development branch. I calculated the theoretical dendrites and then tried to actually obtain all of them within my simulation through precise training. It did not take long to realize that most of the dendrites could not actually form, because they were being blocked one way or another. So I decided to switch to a more advanced model in which I included the semantics of dendrites. Until now, dendrites were independent. The model is actually what I started with, but it was too complex to handle and it had a theoretical drawback: the AI would go “blind”. I implemented most of the changes, and indeed the blindness problem is there and I don’t yet have a solution. What do I mean by blindness? It is a sort of over-training which leads to synapses being definitively removed from neurons.

But more concerning than blindness is still neuronal synchronization. I also implemented a form of “firing rates”, but that immediately added more chaotic behavior. Running the same pattern would end up with different responses on different cycles: first the most probable response would show up, then the neurons would go through a cooling cycle, and the next most probable response (if any is possible) would show up…

Firing rates bring more chaos to an already chaotic model

Neuronal refractory period

The firing rate of biological neurons is limited by the refractory period, a brief period after activation during which a second activation is not possible.

I managed to sync neurons firing within the same pattern… But that does not seem to be enough… Neighboring neurons, not part of the current pattern, remain in an unknown state. That is problematic because the current pattern can also activate nearby neurons that happen to be in a close-to-activation state… Inhibitory neurons cannot inhibit neurons that activate at the same time… So I figured I need a refractory period: after a firing event, synaptic strength should be decreased to a defined level. This in turn would create chaos within the next layer, because some cycles would bring zero signal while the previous layer is in a refractory cycle… Maybe the next layer should work on a different frequency… should update more slowly… Well, I’m sure other problems will show up, and I don’t like having to deal with the firing rate; that adds additional complexity, making things even harder to understand.
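
The refractory rule I have in mind looks roughly like this (the reset level and the recovery step are placeholders): after a firing event the neuron’s effective synaptic strength drops to a defined level and then recovers over the following cycles, which is exactly why the next layer sees some near-zero-signal cycles.

```python
BASELINE = 1.0             # placeholder: normal effective strength
REFRACTORY_LEVEL = 0.1     # placeholder: strength right after firing
RECOVERY_PER_CYCLE = 0.3   # placeholder: how fast strength comes back

class RefractoryNeuron:
    def __init__(self):
        self.effective_strength = BASELINE

    def fire(self):
        # after a firing event, drop the strength to a defined level
        self.effective_strength = REFRACTORY_LEVEL

    def cycle(self):
        # recover toward baseline over the next cycles; while it is low,
        # the neuron contributes little signal to the next layer
        self.effective_strength = min(
            BASELINE, self.effective_strength + RECOVERY_PER_CYCLE)
```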

Am I sure that this is the desired behavior? Not at all, but unless I have this working perfectly, I cannot confirm or discard the hypothesis… Right now, I’m not getting stable patterns (good or bad). The response depends on the previous pattern to some extent: if the previous pattern was far, I get a (stable) response; if it was close, I get a different response.

Brain waves

Since my previous post, I focused only on understanding the source of the asymmetries in my system. I disabled all secondary dendrites, because they are very hard to predict, but no luck. Then I thought maybe the inhibitory neurons are to blame, because I’m not so sure what should happen with individual synapses when their dendrite is inhibited. Should they still be under the LTP and LTD mechanisms? Maybe, maybe not.

So after some adjustments to the inhibitory mechanism, I decided to remove inhibition entirely. Yet the asymmetries were still there. What I discovered from plotting the potentials of individual neurons was that they have a wave-like behavior and can get out of phase… So two neurons would be increasing their potential while a third would be in a decreasing phase…

Two neurons out of phase

Why does this happen? Because each neuron can be part of multiple patterns… So pattern A would activate neuron 1 but not neurons 2 and 3. When I switch to pattern B, involving all 3 neurons, neuron 1 is in a different phase from neurons 2 and 3, resulting in a non-activation event… I’m assuming this is where small basket cells or chandelier cells may come into play, to regulate this out-of-phase situation… I’m unsure what to do next… I can squash the waves easily enough when starting a new pattern, but this would only help in the short term and won’t work for a moving pattern… Most likely I will take the easy way out and see what happens later on…
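
A toy reconstruction of the out-of-phase situation (the rise/decay rule and the numbers are invented just to show the effect): a neuron’s potential climbs while it is driven and drifts down otherwise, so driving neuron 1 alone with pattern A leaves it in a different part of its “wave” than neurons 2 and 3 when pattern B arrives.

```python
RISE, DECAY, THRESHOLD = 0.4, 0.2, 1.0   # illustrative values

class Neuron:
    def __init__(self):
        self.potential = 0.0

    def cycle(self, driven):
        self.potential = max(self.potential + (RISE if driven else -DECAY), 0.0)
        if self.potential >= THRESHOLD:
            self.potential = 0.0   # fire and reset: the "wave" starts over
            return True
        return False

neurons = [Neuron() for _ in range(3)]

# Pattern A drives only neuron 1 for a few cycles...
for _ in range(4):
    for i, n in enumerate(neurons):
        n.cycle(driven=(i == 0))

# ...so when pattern B needs all three together, they sit in different
# phases and may not all reach threshold on the same cycle.
print([round(n.potential, 2) for n in neurons])  # e.g. [0.4, 0.0, 0.0]
```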

With that said, brain waves might be real after all 🙂

What is the basis for “invariance”?

When I started this project, I thought invariance comes from the property of complex cells to respond to multiple similar signals (for example, to respond to all vertical lines in their respective receptive fields)… I tried to simulate how this would work, but my final conclusion was not that final… It seemed that it would not work, but I could never do a good enough simulation to say for sure.

Another way of obtaining invariance might be the movement of the eye (focus), so the invariance would not be created in the brain but in the eye itself… This may still be the case; I have not explored this option at all.

But from my latest blunders, it seems the mechanism of invariance might be something very peculiar… It may come from inhibition… While small patterns behave more or less predictably, bigger patterns have become totally unpredictable due to multiple overlapping inhibitions… The number of neurons activating is not proportional to the input signal… One might see more firing for smaller patterns than for bigger patterns… Very strange indeed…

Stuck in training…

I don’t have a coherent theory about training… Sure, I could get a single letter to work, but the second one is not certain unless I understand what I’m looking for…

I have already tried hundreds of patterns and I’m no closer to understanding what the best output would be. I tried small patterns, big patterns… nothing really helped…

First steps in invariance for size and position

After much trial and error I’m starting to see the light… The first signs of invariance, and of how it might work, are here…

It’s a very small 4×4 matrix with only 2 layers (in fact there are 8 layers if I count the input/output and inhibitory layers). If the pattern is moved to the right side, then another set of inhibitory neurons controls the output, and the response is very different, like having an entirely different pattern. I was theorizing that this is how it should behave, but I was never very sure. I also observed clustering that I had never predicted, but in hindsight I should have. It was messing up my beautiful patterns… so I inhibited the secondary dendrites as well… all possible dendrites are now controlled by inhibitory neurons, yet I still see asymmetries that are troublesome.

Can this be coincidental? Yes and no: yes, because the patterns are not exactly as predicted; no, because it is close to what I was predicting, and the predictions show that it should work.

Anyway, some things are clear… This invariance can only work on limited fields. Theoretically, the size of the field is directly proportional to the number of hidden layers (the bigger the field, the more hidden layers). And this is where I’m going next… But there is still work to be done on this tiny set-up.