Going somewhere

All small parts seem to be working, I have gotten small invariance for position and size and now I’m working to put them all together and hopefully have some sort of object identification. Soon I hope, I’m still dealing with some bugs and some things are working for the wrong reason and perhaps I’m procrastinating too, because I’m afraid that I’m missing something and it will not work..

Why this uncertainty ? isn’t all math ? There is the deep issue that I don’t see how this is working … In small simulations I know for sure what I’m doing but after adding some complexity and multiple layers it’s impossible to predict how thing are going to turn out… much like the current AI algorithms. I’m sure Google team that released the “Transformer” algorithm did not see ChatGPT in the future. Imagine Google having exclusivity on the Transformer … being the only one with generative AI…

I’ve been thinking long and hard of what to do next, say I get to identify objects.. what next ? I have many unexplored ideas such as linking unrelated objects as being the same, such as “A” and “a” or even the sound of “A”.. But this is not adding much value to the whole. The value should only come from an AI that can “reason”.. can plan.. I found it impossible to make the AI algorithm understand things that don’t come from outside as information. If I were to say : “Move the mouse cursor 2 pixels to the left”, most difficult is “Move”… move is nothing, while the others are something.. How can I convey that IT should do something ? I can of course cheat, program in some key word, once detected the word “Move” will trigger a scripted action.. But that is zero value to my mind.. So in fact I don’t see a way past data processing (image identification, sound, anything that can be converted in some arbitrary numbers)…

Can it work as a chatbot ? No… ChatGPT is an anomaly in the world of AI.. I agree that our brain also relies on statistics but if some one says ” Move the mouse cursor 2 pixels to the left”, my brain does a lot of processing .. it has to identify that I’m the one that should do the action by analyzing the environment, the context.. Am I close to a computer ? I’m I the only one in the room ? Do I want to answer this request, is it in my best interest to do so ? Only after many processing I will do something…. ChatGPT avoids all this hard processing and relies on what has learned as statistics from examples to formulate and answer immediately. Still in some parts, we as humans, work as ChatGPT… respond with learned answers without any other type of processing .. In this context “learn” is an arbitrary association in between parts of information (aka symbols). Much like “A” = “a” ..

So after I’m done with object recognition, I will seek help, some sort of collaboration with whomever might be interested …