Size and position invariance achieved (partially)

In the demo below, some things are choreographed (as in they are not entirely what they seem to be):

  • Colors are fake: they are converted to shades of gray. To get “true” color, neurons specific to RGB are needed.
  • The background was removed due to some nonspecific interference. It should have worked, and it does work for the majority of patterns, but not for all. I’m not sure where the error is coming from.
  • While position seems 100% invariant, it is not. To get direction, I programmed receptive-field-like areas. This results in a heterogeneous pixel output, so I drew the lines on the upper and left sides of a receptive field, but this is not seen in the displayed image. To get true invariance, an additional level of abstraction is needed.
  • The image is 70 × 70 pixels, but since each receptive field is a 7×7 area, what is seen now appears as a 10×10 pixel image (see the sketch below).
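To make the tiling arithmetic concrete, here is a minimal sketch (plain averaging as a stand-in; the real receptive-field logic does more than average a patch):

```python
import numpy as np

def tile_receptive_fields(image, field=7):
    """Tile a grayscale image into non-overlapping field x field
    receptive fields, reducing each patch to one output value.
    A 70x70 input with field=7 yields a 10x10 output."""
    h, w = image.shape
    out = np.zeros((h // field, w // field))
    for i in range(0, h - h % field, field):
        for j in range(0, w - w % field, field):
            # Placeholder reduction: the mean of the patch.
            out[i // field, j // field] = image[i:i + field, j:j + field].mean()
    return out

image = np.random.rand(70, 70)              # stand-in for the demo image
print(tile_receptive_fields(image).shape)   # (10, 10)
```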

I’ve made progress with the LTP/D algorithm; it seems perfect now. I ran into problems with the inhibitory neuron.. I’m not sure what it is supposed to accomplish. I also ran into problems with the dynamic synaptic connections, so in this video I have full connectivity.

A new year, 2024!

As predicted at the beginning of 2023, I did not make any significant progress in 2023 :(. It did not seem feasible, but I was still hoping. So what did I do in 2023?

  1. I found a feasible mechanism for learning. Since learning is not a single-step process, it would be more accurate to say that I only have one step of the learning process, but I consider that a significant step. It would have counted as significant progress if I could have tested it properly. Identifying 2 inputs in a simplistic setup is not proof enough that the theory works, yet I still think it does.
  2. I now have a theory for what LTP/D should accomplish. I spent more than 3 months on this alone, but I’m nowhere close to proving it good or bad. As far as I can tell, it is a must, because high frequency and multiple synapses make the output unreliable.

So in terms of algorithms, that’s it… It does not seem like much; the year seems to have gone by very fast. In terms of programming I made more progress, but programming by itself is an empty shell. I moved to Cython and DearPyGui and also created an additional GUI to work with very few neurons, but in an easier-to-understand way.

What’s the plan for 2024?

I need to clarify LTP/D.. If I can do this, then I can do many more interesting things, maybe even make significant progress.. Yup… I think significant progress is possible this year… But I will not do anything else till I have this done. I gave up on all the other possible approaches.. no more programming, no more network. Only a single neuron with different numbers of inputs and different frequencies..

time summation … why?

There are cases where the presynaptic neuron fires twice before the postsynaptic one fires once… There are at least 3 cases to be discussed, but the main question is: why should this be acceptable behavior? It leads to cases where the time between the activation of the pre- and postsynaptic neuron is relatively big… I believe LTP (long-term potentiation) is meant to deal with precisely this case.. Then we have the opposite: summation over thousands of synapses… this should be where LTD comes to help..

I’ve done all the simulations I can with the current model, and the preliminary conclusion is that the synapse is in fact undecided… I could not find any scenario where the result is certain.. If the input varies wildly, all the variables of a synapse (4 in total) flip it from “increased strength” to “decreased strength” and vice versa…

I understood that a biological neuron cannot have an infinite number of synapses… but surely a digital one would not be bound by such nonsensical limitations… Not true… with the current variables, my model is limited to about 80 synapses per neuron… that number is somewhat arbitrary, but there is no theoretical option for it to be infinite… Time summation is not an option either… if the neuron cannot adjust (because the presynaptic neuron fires too frequently), we end up with a one-to-one pre/postsynaptic firing relationship, and that synapse breaks… which is observed both in biology and in my simulations… So I’m left with no good options… I need to find a balance and limit the input values accordingly… but so far I have failed… I was convinced that I could find a definite solution..
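For orientation, here is a minimal pair-based STDP sketch of the timing idea above. This is the textbook rule with placeholder constants, not my 4-variable synapse model:

```python
import math

def stdp_delta_w(dt, a_plus=0.1, a_minus=0.12, tau=20.0):
    """Pair-based STDP: dt = t_post - t_pre in ms.
    Pre shortly before post -> potentiation (LTP);
    pre after post -> depression (LTD).
    Large |dt| -> a negligible change, the 'undecided' regime
    that appears when the pre fires twice before the post fires once."""
    if dt > 0:
        return a_plus * math.exp(-dt / tau)    # LTP branch
    return -a_minus * math.exp(dt / tau)       # LTD branch

for dt in (5.0, 40.0, -5.0):
    print(dt, round(stdp_delta_w(dt), 4))      # 0.0779, 0.0135, -0.0935
```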

How is foreground selected?

Running actual images where the background is a thing put all my theories on ice.. I first realized that when I have many synapses on one neuron and few on another, it modifies the activation phase… d’ooohh… and I had to bring back my old nemesis, frequency. Frequency now means that a neuron can receive multiple presynaptic inputs before it fires once. I had discarded this idea not only because it complicates matters, but because I thought it should not be the default mechanism; it would be a waste of energy.. Besides, with thousands of synapses it is not very likely for a neuron to wait for a secondary activation to do a time summation for activation.

Anyway, I thought that by bringing frequency back, the very bright neuron would win the inhibition battle.. Not likely; multiple synapses would still win. The consequence is that the background is selected instead of the foreground.. because it’s big 🙂 . How do CNNs deal with this issue? I’ll have to look into it.

I have thought long and hard… I see no way around this, because in a way it is the expected behavior given my algorithms… But, as with many other problems, I have a general solution.. I’m going to ignore it… I’ll choose a very low intensity background, avoid using both ON and OFF bipolar cells from the receptive fields, and go with that. Sure, I can’t have real images now, but maybe along the way I’ll find a solution, or a solution will present itself..
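The arithmetic behind “the background wins because it’s big” is easy to see in a toy summation (all numbers invented for illustration):

```python
# A large dim background vs. a small bright foreground.
background = [0.2] * 60   # 60 weak synapses driven by the background
foreground = [0.9] * 5    # 5 strong synapses driven by the foreground

print(round(sum(background), 2))   # 12.0 -> total background drive
print(round(sum(foreground), 2))   # 4.5  -> total foreground drive
# The background-driven neuron reaches threshold first and wins
# the inhibition battle, even though each of its inputs is dimmer.
```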

Going somewhere

All the small parts seem to be working. I have gotten some invariance for position and size, and now I’m working to put it all together and hopefully get some sort of object identification. Soon, I hope; I’m still dealing with some bugs, some things work for the wrong reason, and perhaps I’m procrastinating too, because I’m afraid that I’m missing something and it will not work..

Why this uncertainty? Isn’t it all math? There is the deeper issue that I don’t see how this works… In small simulations I know for sure what I’m doing, but after adding some complexity and multiple layers it’s impossible to predict how things are going to turn out… much like the current AI algorithms. I’m sure the Google team that released the “Transformer” algorithm did not see ChatGPT in its future. Imagine Google having exclusivity on the Transformer… being the only one with generative AI…

I’ve been thinking long and hard about what to do next; say I get to identify objects.. what next? I have many unexplored ideas, such as linking unrelated objects as being the same, like “A” and “a”, or even the sound of “A”.. But this is not adding much value to the whole. The value should only come from an AI that can “reason”.. that can plan.. I found it impossible to make the AI algorithm understand things that don’t come from outside as information. If I were to say: “Move the mouse cursor 2 pixels to the left”, the most difficult part is “Move”… “move” is nothing, while the other words are something.. How can I convey that IT should do something? I can of course cheat and program in some keyword; once the word “Move” is detected, it triggers a scripted action.. But that has zero value in my mind.. So in fact I don’t see a way past data processing (image identification, sound, anything that can be converted into some arbitrary numbers)…

Can it work as a chatbot? No… ChatGPT is an anomaly in the world of AI.. I agree that our brain also relies on statistics, but if someone says “Move the mouse cursor 2 pixels to the left”, my brain does a lot of processing.. it has to identify that I’m the one who should do the action, by analyzing the environment and the context.. Am I close to a computer? Am I the only one in the room? Do I want to answer this request; is it in my best interest to do so? Only after much processing will I do something…. ChatGPT avoids all this hard processing and relies on what it has learned as statistics from examples to formulate an answer immediately. Still, in some respects we as humans work like ChatGPT… responding with learned answers without any other type of processing.. In this context, to “learn” is to make an arbitrary association between pieces of information (aka symbols). Much like “A” = “a”..

So after I’m done with object recognition, I will seek help, some sort of collaboration with whoever might be interested…

Frequency is trouble

I coded some parts of the retina circuitry; now I have receptive fields, but still no color vision. The receptive fields are not as selective as they could be, but they should be enough for now. Images are transformed to gray because I found no easy way to add in RGB data.
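The gray conversion itself is just a weighted sum of the channels. A standard luminance formula is one way to do it (I’m not claiming these exact weights are the right ones for this purpose):

```python
def rgb_to_gray(r, g, b):
    """Standard ITU-R BT.601 luminance weights."""
    return 0.299 * r + 0.587 * g + 0.114 * b

print(rgb_to_gray(255, 0, 0))   # pure red   -> ~76.2
print(rgb_to_gray(0, 255, 0))   # pure green -> ~149.7, brighter to the eye
```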

I also did some simulations to understand the CA1 region, but I still see no point in that complex arrangement. I did find something useful: a way to reduce phase difference. My focus had been to get as much phase difference as I could, but then I realized that I also need to reduce the phase difference somehow. So after 4 layers, with the current setup, I get the same phase.

Anyway, I once again got stuck dealing with frequency. Different frequencies start from the receptor and eventually break synapses, and I found no way to have a single frequency at time t for all regions of the visual field.. unless I impose a single unchanging frequency. As far as I can tell frequencies play an important role, so there should be a way to let them in. It seems the amacrine cells correlate frequencies for the bipolar cells they modulate, but I could not confirm this in the literature. There is very little information regarding amacrine cells.

what is learning #2?

A way to receive data, a way to discriminate data, a way to store and retrieve it, and a criterion to select what should be stored for later retrieval.

Of all the above steps, as far as I can tell, I only have the discrimination algorithms, and those in fact depend on the received data and on the algorithms that select what should be stored. So it’s incomplete, or it lacks complete parametrization.

I worked on the data receiver, converting visual data (RGB pixel values) into data that can be accepted by the discrimination algorithm. For color discrimination, just converting a number into a different number through some function is enough. But to detect lines of different inclinations, colors, dimensions.. that is not enough.. So I started implementing the “receptive fields” concept from the retina, with the ON and OFF behavior and center/surround. I have yet to find relevant details for implementation, so I guess it’s trial and error again. For example, I have found no details on the size of the center vs. the surround. A small center makes it hard for light to hit it, and only it, without hitting the surround, so ON centers come out darker than expected and OFF centers brighter… But this results in good discrimination of line inclination.. There is also contradictory information about horizontal cells.
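Here is a minimal difference-of-means sketch of the center/surround idea; the 3×3 center inside a 7×7 patch is a guess, since I found no reliable numbers:

```python
import numpy as np

def center_surround(patch, center=3):
    """ON-center response = mean(center) - mean(surround);
    the OFF-center response is its negation. Sizes are guesses."""
    n = patch.shape[0]
    lo = (n - center) // 2
    hi = lo + center
    mask = np.zeros_like(patch, dtype=bool)
    mask[lo:hi, lo:hi] = True                 # the center square
    on = patch[mask].mean() - patch[~mask].mean()
    return on, -on                            # (ON-center, OFF-center)

patch = np.zeros((7, 7))
patch[2:5, 2:5] = 1.0                         # light hits only the center
print(center_surround(patch))                 # (1.0, -1.0): strong ON response
```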

I have looked at a way to store data; the CA1 region of the hippocampal circuit seems interesting because of its configuration. But even the simplest simulation shows that a neuron cannot link back into itself… Maybe I’m missing something. Anyway, it seemed like an interesting way to discriminate small similar patterns.

Data retrieval remains mysterious; it seems we retrieve data by recreating it in an altered version.

How do I select the data to be stored? This is an area where I have no idea… But for now this is the easiest part to bypass.. I tell the system when to store the data. The system automatically stores data that passes some abstract threshold, but there are trade-offs with this approach, so in fact most of the data does not reach that threshold.

Anyway, “learning” remains open for debate.

My progress has been slow. There are things that I can do and know how to do, but I lack the motivation to do them because they don’t seem important. And then there is an infinite list of mostly poorly defined tasks.

a very small update..

I now have the code and the proof that my new learning theory works. I was very anxious till the very end, because most of the time my theoretical predictions failed when put to the test. I would make a theoretical calculation using many approximations, and when everything was put into code and graphs, I would find some unpleasant surprises :).

This is the set-up used for the 2-color separation. I spent a couple of weeks adding this new GUI so I can inspect each neuron in much more detail.. abandoning for now the complex GUI used for many neurons working in a complex network.

Next I want to separate multiple colors and create a proper presentation with a couple of explanations perhaps… or maybe just a video showing the action…

I still haven’t solved the problem of “embedding” for colors… I have a function that takes an RGB value as input and converts it into a “phase” number that can be used by the AI algorithm, but it is not good enough because it does not work for all colors… Some colors translate into very high or very low phase numbers, so they cannot be used by the algorithm.. So I’m not sure yet how to show “learning” for all colors.. But more than 2 colors should be a good start…
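To illustrate, a naive stand-in for the conversion might look like this (it is not my actual function, but it fails the same way at the extremes):

```python
def rgb_to_phase(r, g, b, phase_min=0.05, phase_max=0.95):
    """Hypothetical stand-in: map luminance to [0, 1] and flag
    whether the result lands in a usable phase window. Values near
    the ends are the 'very high or very low phase numbers' problem."""
    luminance = (0.299 * r + 0.587 * g + 0.114 * b) / 255.0
    usable = phase_min <= luminance <= phase_max
    return luminance, usable

print(rgb_to_phase(250, 250, 250))   # near-white -> phase too high, unusable
print(rgb_to_phase(120, 60, 200))    # mid-range color -> usable
```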

Molecular descriptors, embedding and frequency

I was discussing how my friend used molecular descriptors and some form of backpropagation to find “similar” molecules for specific purposes. The process was similar or identical (in general terms) to the “embedding” process in AI.. “Embedding” is supposed to convert one type of information into a different type so that, in the end, apples and oranges can be compared. Restricting the output of a perceptron to 0 and 1, or -1 to 1, is also a way to make data comparable, so it is still an embedding problem.. Our brain seems to have found the ultimate embedding, transforming all data from all sources into frequency (and phase) so everything can be compared.
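Squashing a perceptron’s output into a fixed range is the simplest version of this: a sigmoid maps any activation into (0, 1), tanh into (-1, 1):

```python
import math

def sigmoid(x):
    """Squash any real activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Outputs from wildly different inputs land on one comparable scale.
for x in (-5.0, 0.0, 5.0):
    print(x, round(sigmoid(x), 3), round(math.tanh(x), 3))
```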

I’m struggling with converting image data (say, RGB values) into something that can be used by the AI algorithm. If the information is not converted “properly” (what’s proper is unknown), then the algorithm will fail to detect differences in the input data and fail to learn anything… which may actually be the reason why I spent so much time with no progress whatsoever…

I’m still far away from showing anything concrete. My plan was to create a test program that shows how two colors can be learned, but I took a very, very twisted approach and now I’m further away from my goal than when I started. Why? Not sure.. Maybe I believe this is more complex than it looks on paper? Maybe I believe that even if this works, it proves nothing? Observing my own approach, I must conclude that I’m getting ready for multiple problems which cannot be solved unless I understand very fast what the real problem is.. So I’ve been motivated only to construct more and more tools to better visualize the data and eliminate ambiguous options… and I still hesitate to take more decisive steps.

what is enough?

I’ve been making some unexpected progress: I found a learning mechanism which is both simple and reliable and integrates 100% with all of my other ideas. Now I’m again building a small test to show learning of colors with 3 input neurons. It’s taking a lot of time because, since I changed the code to incorporate more C/C++, parts of the code are not working properly or are failing completely… So perhaps another week, maybe two, till I can show my idea in practice. All the tests I’ve done so far are in a sort of manual network; I link neurons by hand, initiate them one by one, and so on.

Anyway, the problem I have now is when to stop “learning”.. The way it works now is as follows: the network learns by itself up to certain values; then, if I want to separate patterns and learn them even further apart, I have to tell it to learn.. Learning behaves like a dopamine or serotonin influx: some constants are altered by an external (or different) mechanism.. The trouble is that this learning goes on to the very limits of the input data.. Assume we have a 10-synapse pattern.. While I see it as one pattern, I could, if I decided to, go deeper (up to my visual acuity) and dissect that pattern into smaller patterns… So I may be able to discard 5 synapses (points) from that pattern and consider that they do not meet all the criteria to be part of it… And then, to the best of my sensory perception, I can’t find any other differences among the remaining 5 synapses.. My AI does just that at this point… it discards everything down to those 5 remaining points, because I’m not sure how to define a criterion that would stop it before reaching that very end.. So how do we know enough is enough? How do we decide what to learn?
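A toy version of this runaway dissection (the stopping criterion here is a made-up similarity threshold, which is exactly the arbitrary part):

```python
def dissect(points, threshold):
    """Keep discarding the point that deviates most from the mean
    until everything left is within `threshold` of it. Nothing in
    the rule itself says 'this is already a good enough pattern';
    only the arbitrary threshold stops it."""
    points = list(points)
    while len(points) > 1:
        mean = sum(points) / len(points)
        worst = max(points, key=lambda p: abs(p - mean))
        if abs(worst - mean) <= threshold:
            break
        points.remove(worst)
    return points

# A 10-point 'pattern'; a tight threshold strips it down to a core of 5.
print(dissect([1.0, 1.02, 0.98, 1.01, 0.99, 0.5, 1.6, 0.2, 1.9, 0.1], 0.05))
```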

Anyway, this is now a very, very different problem from the ones I had so far.. The important point is that under clearly defined conditions the system has a mechanism of learning. This is all very new to me, so I may get some ideas later on. Right now I’m focusing on restoring the whole system to its original functionality and building the color discrimination demo along the way.