I thought this would be straightforward, but I’ve been stuck on this issue for a long time now. Assume you have a single synapse and add a second one with exactly the same properties. The signal doubles and the time to neuronal activation is cut in half (something I don’t believe to be true). I’ll refer to the time from receiving the first activation signal until neuronal activation as cycles to activation, or cta for short. Changing one of the synapses complicates things. Assume synapse S1 by itself fires the neuron at cta = x, and altering it results in cta = y. When the altered S1 (cta = y) is combined with a synapse S2 with cta = x, what should the resulting cta be ? That is not clear at all. It can’t be an average time, because then the weak synapses would have too much sway over the stronger ones.
My current model deals with this in a precise theoretical manner, but it fails when synapses are not equal: the adapting mechanism eliminates the weak synapse. I put in some conditions to stop the process of elimination, and that appeared to work well but was not 100% consistent. That did not bother me much until I started doing more complex simulations.. Then it mattered: inconsistent latency resulted in poor outcomes.
After much time spent, I concluded that there is no theoretical way of establishing what the result should be when combining cta = x with cta = y. The biological neuron seems to settle this debate through some constants: there are various ratios between Na+ and Ca2+ entering the membrane of the post-synaptic neuron, the number of AMPA / NMDA receptors, and the NMDA receptor being slower than the AMPA receptor. All this (and more) leads me to believe that there cannot be a beautiful kinetic model that rules everything.
The combination of different ctas is most likely not precise (as in, [Ca2+] is NOT linearly correlated with [AMPA receptor]), but it must be consistent, which is what’s important.
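To make the "consistent, even if not precise" requirement concrete, here is a minimal sketch of one possible combination rule. It is not the model described above: it assumes that a synapse which alone fires the neuron in `cta` cycles delivers 1/cta of the threshold per cycle, and that contributions add linearly (the very assumption doubted above). The function names are hypothetical.

```python
import math

def combined_cta(ctas):
    """Combine per-synapse cycles-to-activation by adding their rates.

    Assumption (illustration only): a synapse with a given cta delivers
    1/cta of the threshold per cycle, and synapses add linearly.
    """
    if not ctas or any(c <= 0 for c in ctas):
        raise ValueError("every cta must be a positive number of cycles")
    rate = sum(1.0 / c for c in ctas)  # fraction of threshold per cycle
    return 1.0 / rate

def average_cta(ctas):
    """Plain average, shown only for comparison."""
    return sum(ctas) / len(ctas)

x, y = 10, 40
print(combined_cta([x, x]))  # two identical synapses: half the single cta
print(combined_cta([x, y]))  # strong + weak synapse: shorter than x alone
print(average_cta([x, y]))   # average: the weak synapse dominates
```

Under this rule a weak synapse can only shorten the combined cta, never lengthen it, which avoids the "too much sway" problem of averaging. Whether the resulting numbers are the right ones is a separate question.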
Structure input or hidden layer ?
In the past I tried to structure input by inputting lines of fixed length like shown here:
but that seemed tedious and also had some hidden problems.
Now, I have to have some initial activation in all layers before any links are formed with the previous layer.. They are random now:
but maybe I can make them not random, so I obtain some structure even before the first input is seen. I have played with this idea in the past, starting a wave from a corner, but I don’t even remember why I abandoned it..
That’s because I can’t envision any way to have a structured input. In a way I need to break the symmetry so A can be bigger than B as described in my previous post. I’ll start waves from the center of the image field. Not sure if I need to do this in all layers or only in the first hidden layer. I’m thinking that if I do it in the first layer, then the second layer will already have structured input, otherwise I risk some sort of opposing training in first layer vs second layer..
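A minimal sketch of what a non-random initial activation could look like: each unit gets an onset cycle based on its distance from the center of the field. The constant wave speed, the Euclidean distance and the function name are all assumptions made for illustration.

```python
import math

def center_wave_onsets(height, width, speed=1.0):
    """Return a grid of onset cycles for a wave spreading from the center.

    Assumptions: the wave travels `speed` pixels per cycle, and a unit's
    first activation is its Euclidean distance from the center divided by
    that speed, rounded to a whole cycle.
    """
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    return [
        [round(math.hypot(r - cy, c - cx) / speed) for c in range(width)]
        for r in range(height)
    ]

onsets = center_wave_onsets(7, 7)
for row in onsets:
    print(row)
```

Units at the same distance from the center share an onset cycle, so this gives structure in the radial direction only.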
What is “input” ? – take two.
My neurons work well enough (99% of the time), doing what I wanted from the beginning, yet I still see no complex correlations. Why ? Because theoretically it is not possible to have such a random input and get order from it… I have tested multiple hypotheses: frequency, phase surface distribution, different leakage with distance, multiple models for inhibition… Nothing led me to believe that they could work.. Synapses break because the input is too different from “receptor” to “receptor”.. So I have come to the conclusion that the input must be structured somehow… Meaning some receptor fields have to respond the same, regardless of the particular wavelength, brightness or direction they receive… Maybe respond with an average response, min, max… it does not matter, but it has to be the same.. Also, it is not possible to have large differences in latency among close receptors and then neurons.. The latency is a compound variable that contains data about frequency, wavelength and direction. If the resulting latency is bigger than the time it takes for a neuron to fire, that synapse makes no contribution to the activation of that neuron, so it should be removed.. But such large latency is the norm in my system.. This also has to stop..
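The removal rule for late synapses can be sketched as below. The synapse ids and the dictionary layout are hypothetical; only the comparison between latency and firing time comes from the text.

```python
def prune_late_synapses(latencies, cycles_to_fire):
    """Keep only synapses whose input arrives before the neuron fires.

    `latencies` maps a (hypothetical) synapse id to its latency in cycles;
    `cycles_to_fire` is how many cycles the neuron takes to fire.
    """
    kept = {s: lat for s, lat in latencies.items() if lat <= cycles_to_fire}
    removed = sorted(set(latencies) - set(kept))
    return kept, removed

kept, removed = prune_late_synapses({"s1": 2, "s2": 9, "s3": 5}, cycles_to_fire=5)
print(kept)     # synapses that can still contribute
print(removed)  # synapses that arrive after the neuron has fired
```

If large latencies are the norm, a rule like this removes most synapses, which is the problem described above.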
But what kind of structure should I aim for ? Not sure at all…
Timing is everything …
I have my very first logic rule: If A > B then A =1 and B =1, if B>A then A=0 and B =1. I’m not sure what to make of it. More importantly, I cannot find any reason why A is bigger than B in the first place, what creates this initial asymmetry.. To reach that logic end I need perfect timing, and that has proven very difficult.. Even one cycle off and the whole network loses coherence. I’m not sure what to make of this particular problem.. Is our brain so precise ? It could be, because in the end the computation is done by the electrostatic potential, which is the definition of precision… But I’m not convinced: if there are correction mechanisms, what are they ? So I need inhibition to hit at a very precise time in all layers in all cycles, and it doesn’t. On the other hand, why do I have this imprecision ? When adding all synapses I get a value that is not precisely the activation potential (AP); it is slightly higher, and when trying to redistribute that error over all synapses I fail to account for their precise contribution. Everything I tried failed, so the system works 99.9% as expected, but that is not enough. I still get cascade failure where all neurons get stuck in a dying state.
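For reference, the simplest way to spread an overshoot over the synapses is proportional rescaling, sketched here. This is only a baseline assumption, not a fix for the failure described above, and the function name and example values are made up.

```python
import math

def rescale_to_threshold(contributions, activation_potential):
    """Scale every synaptic contribution by the same factor so that the
    sum lands on the activation potential (proportional redistribution)."""
    total = sum(contributions)
    if total <= 0:
        raise ValueError("total contribution must be positive")
    factor = activation_potential / total
    return [c * factor for c in contributions]

contributions = [0.5, 0.3, 0.25]  # hypothetical values, slightly above AP = 1.0
adjusted = rescale_to_threshold(contributions, 1.0)
print(adjusted, sum(adjusted))
```

With floating-point numbers the sum only matches the AP up to rounding error, so a simulation that needs cycle-exact timing may want integer or rational arithmetic instead.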
Anyway, the progress is very very slow, so now I’m not sure any more that I’ll have some significant progress by the end of the year.
Abandoning frequency
I tried everything I could think of. Once in the neural net, I could not find a way to control frequency. I cannot add, subtract, average, or take the min or max of different frequencies. So I’m abandoning this thread for now. I’m still not 100% convinced, but maybe I will never be 100% convinced of anything. The neuron can work within a broad range of frequencies, but cannot work with synapses running at different frequencies. Perhaps horizontal / amacrine / bipolar cells work together to completely eliminate the multiple frequencies generated by the receptor cells and average them to a single frequency per frame… I thought they only minimized the frequency difference among different regions, but maybe they go all the way and eliminate it entirely. It could be possible.
So once again, I’ll consider only latency.
To summarize : I made no progress.
Size and position invariance achieved (partially)
In the demo below, some things are choreographed (as in they are not entirely what they seem to be):
- Colors are fake: they are converted to shades of gray. To get “true” color, neurons specific for RGB are needed.
- Background was removed due to some unspecific interference. It should have worked and works for the majority of patterns but not for all. I’m not sure where the error is coming from.
- While position seems 100% invariant, it is not. To get direction, I programmed receptive fields as areas. This results in a heterogeneous pixel output, so I drew the lines on the upper and left sides of the receptive fields, but this is not seen in the displayed image. To get to true invariance, an additional level of abstraction is needed.
- The image is 70 x 70 pixels, but since each receptive field is a 7×7 area, what is seen now appears as a 10×10 pixel image.
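The size bookkeeping in the last point (70 x 70 pixels seen through 7×7 receptive fields gives a 10×10 image) can be sketched as below. The averaging is a stand-in assumption: the receptive fields in the demo are programmed for direction, which this sketch does not attempt. The function name is hypothetical.

```python
def pool_receptive_fields(image, field=7, reduce=None):
    """Collapse each field x field block of `image` into a single value."""
    if reduce is None:
        reduce = lambda values: sum(values) / len(values)  # average response
    height, width = len(image), len(image[0])
    if height % field or width % field:
        raise ValueError("image size must be a multiple of the field size")
    return [
        [
            reduce([image[r + dr][c + dc]
                    for dr in range(field) for dc in range(field)])
            for c in range(0, width, field)
        ]
        for r in range(0, height, field)
    ]

image = [[(r + c) % 256 for c in range(70)] for r in range(70)]  # dummy data
pooled = pool_receptive_fields(image)
print(len(pooled), len(pooled[0]))  # 10 10
```

Passing `reduce=max` or `reduce=min` swaps the average for another per-field response.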
I’ve made progress with the LTP/D algorithm; it seems perfect now. I ran into problems with the inhibitory neuron.. I’m not sure what it is supposed to accomplish. I also ran into problems with the dynamic synaptic connection, so in this video I have full connectivity.
A new year 2024 !
As predicted at the beginning of 2023, I did not make any significant progress in 2023 :(. It did not seem feasible but I was still hoping . So what did I do in 2023 ?
- I found a feasible mechanism for learning, but since learning is not a single-step process, it would be better to rephrase and say that I only have one step of the process of learning; however, I consider that to be a significant step. That would have been significant progress if I could test it properly. Identifying 2 inputs in a simplistic setup is not even proof enough that the theory works, yet I still think it does.
- I now have a theory for what LTP/D should accomplish. I spent more than 3 months on this alone, but I’m nowhere close to proving it good or bad. As far as I can tell, this is a must, because high frequency and multiple synapses are making the output unreliable.
So in terms of algorithms, that’s it… It does not seem like much; the year has gone by very fast, it seems. In terms of programming I made more progress, but programming is an empty shell. I moved to Cython and Dear PyGui and also created an additional GUI to work with very few neurons, but in a way that is easier to understand.
What’s the plan for 2024 ?
I need to clarify LTP/D.. IF I can do this, then I can do many more interesting things, even have significant progress.. Yup… I think this year it is possible to have significant progress… But I will not do anything else till I have this done. I gave up on all the other possible approaches.. no more programming, no more network. Only a single neuron with different numbers of inputs and different frequencies..
time summation … why ?
There are cases where the pre-synaptic neuron fires twice before the post-synaptic neuron fires once… There are at least 3 cases to be discussed, but the main question is why this should be acceptable behavior ? This leads to cases where the time between activation of the pre- and post-synaptic neurons is relatively big… I believe LTP (long-term potentiation) is meant to deal with precisely this case.. Then we have the opposite… summation of thousands of synapses… this should be where LTD comes to help.. I’ve done all the simulations I can with the current model and the preliminary conclusion is that the synapse is in fact undecided… I could not find any scenario where the result is certain.. If input varies wildly, all variables of a synapse (4 in total) change the synapse from “increased strength” to “decreased strength” and vice versa… I understood that a biological neuron cannot have an infinite number of synapses… but certainly a digital one would not be bound by such nonsense limitations… Not true… with the current variables, my model is limited to about 80 synapses per neuron… this number is somewhat arbitrary, but there is no theoretical way for it to be infinite… Time summation is not an option either… if the neuron cannot adjust (because the pre-synaptic neuron fires too frequently) so that we have a one-to-one pre- and post-synaptic firing relationship, that synapse breaks… which is observed both in biology and in my simulations… So I’m left with no good options… I need to find a balance and limit the input values accordingly… but so far I have failed… I was convinced that I could find a definite solution..
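To pin down what "fires twice before the post-synaptic fires once" means, here is a toy integrator that counts pre-synaptic spikes until the first post-synaptic spike. It is a deliberately crude assumption: one weight and a linear leak, not the four synapse variables of the actual model. All names and parameter values are hypothetical.

```python
def presynaptic_spikes_until_fire(weight, threshold, leak, period, max_cycles=1000):
    """Count presynaptic spikes needed before the postsynaptic neuron fires.

    Toy model: every `period` cycles the synapse adds `weight` to the
    potential; every cycle the potential decays by `leak` (never below
    zero). Returns None if the neuron never fires within `max_cycles`.
    """
    potential = 0.0
    spikes = 0
    for cycle in range(max_cycles):
        potential = max(0.0, potential - leak)
        if cycle % period == 0:
            potential += weight
            spikes += 1
        if potential >= threshold:
            return spikes
    return None

print(presynaptic_spikes_until_fire(1.0, 1.0, 0.05, 2))  # one-to-one firing
print(presynaptic_spikes_until_fire(0.6, 1.0, 0.05, 2))  # time summation
print(presynaptic_spikes_until_fire(0.1, 1.0, 0.05, 4))  # never fires
```

The three calls correspond to a one-to-one relationship, a case that needs summation over two input spikes, and an input too weak and too sparse to ever reach threshold.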
How is foreground selected ?
Running actual images where the background is a thing put all my theories on ice.. I first realized that when I have many synapses for a neuron and few for another, it modifies the activation phase …d’ooohh… and I had to bring back my old nemesis, frequency. Frequency now means that a neuron can receive multiple pre-synaptic inputs before it fires once. I had discarded this idea not only because it complicates matters, but because I thought this should not be the default mechanism, since it would be a waste of energy.. Besides, with thousands of synapses it is not very likely for a neuron to wait for a secondary activation to do a time summation for activation. Anyway, I thought that by bringing frequency back, the very bright neuron would win the inhibition battle.. Not likely, multiple synapses would still win. The consequence is that the background is selected instead of the foreground.. because it’s big. How are CNNs dealing with this issue ? I’ll have to look into it. I have thought long and hard… I see no way around this, because in a way it is the expected behavior given my algorithms… But, as with many other problems, I have a general solution.. I’m going to ignore it… I’ll choose a very low intensity background, I’ll avoid using both ON and OFF bipolar cells from the receptive fields and go with this. Sure, I can’t have real images now, but maybe along the way I’ll find a solution or a solution will present itself..
Going somewhere
All small parts seem to be working, I have gotten small invariance for position and size and now I’m working to put them all together and hopefully have some sort of object identification. Soon I hope, I’m still dealing with some bugs and some things are working for the wrong reason and perhaps I’m procrastinating too, because I’m afraid that I’m missing something and it will not work..
Why this uncertainty ? Isn’t it all math ? There is the deep issue that I don’t see how this is working… In small simulations I know for sure what I’m doing, but after adding some complexity and multiple layers it’s impossible to predict how things are going to turn out… much like the current AI algorithms. I’m sure the Google team that released the “Transformer” algorithm did not see ChatGPT in the future. Imagine Google having exclusivity on the Transformer… being the only one with generative AI…
I’ve been thinking long and hard about what to do next. Say I get to identify objects.. what next ? I have many unexplored ideas, such as linking unrelated objects as being the same, such as “A” and “a” or even the sound of “A”.. But this is not adding much value to the whole. The value should only come from an AI that can “reason”.. can plan.. I found it impossible to make the AI algorithm understand things that don’t come from outside as information. If I were to say: “Move the mouse cursor 2 pixels to the left”, the most difficult part is “Move”… move is nothing, while the others are something.. How can I convey that IT should do something ? I can of course cheat and program in some keyword, so that once detected, the word “Move” will trigger a scripted action.. But that has zero value to my mind.. So in fact I don’t see a way past data processing (image identification, sound, anything that can be converted into some arbitrary numbers)…
Can it work as a chatbot ? No… ChatGPT is an anomaly in the world of AI.. I agree that our brain also relies on statistics, but if someone says “Move the mouse cursor 2 pixels to the left”, my brain does a lot of processing.. it has to identify that I’m the one who should do the action by analyzing the environment, the context.. Am I close to a computer ? Am I the only one in the room ? Do I want to answer this request, is it in my best interest to do so ? Only after much processing will I do something… ChatGPT avoids all this hard processing and relies on what it has learned as statistics from examples to formulate an answer immediately. Still, in some respects, we as humans work like ChatGPT… we respond with learned answers without any other type of processing.. In this context “learn” is an arbitrary association between parts of information (aka symbols). Much like “A” = “a”..
So after I’m done with object recognition, I will seek help, some sort of collaboration with whomever might be interested …