Thoughts on AI

Thoughts on the future of humanity, usually posted while I am drunk.

Saturday, March 26, 2011

"The devil you know versus the devil you don't."

That's what the situation in the Middle East is a case of, or so I've heard many times on the news.

Needless to say, my ears have been having problems. The doctor says it's an infection with a wax buildup. The interesting thing is that it makes you a little dizzy; it makes you more awkward even though your eyes are fine.

Dylan Ratigan has apparently decided to have a 21st-century news program, which is great - his last show reported on energy issues and robot vision. I wanted to comment on the latter. In a post I never published, I wrote:

Modect stereoview.

Basically, you use motion events off a still background to enforce 3D motion detection zones in stereo security cameras. Because of the limited number of stereo motion events in each frame, locating them is computationally cheap using old-fashioned overlay (where one picture is inverted, given 50% alpha, and moved pixel by pixel over the other, with contiguous grey areas marking a match at that layer of depth). Simple, effective. Fun.

This was about using two cameras to get a 3D field of view, like our eyes do. The algorithm above is good because you can set it up to capture the kids only when they get on your lawn (thus triggering the "get off my lawn" senior-citizen bot to chase them off), but not when they are on the street behind, because it has 3D awareness of where things are. Since different layers of depth correspond to different overlay offsets, the detection algorithm only needs to compute a limited number of offsets to determine whether motion events are in the zones it cares about.
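The invert-and-overlay test is simple enough to sketch. This is my own minimal numpy version, not production code: invert one frame, slide it by a candidate offset, blend 50/50, and call pixels "grey" where the blend lands near mid-grey. The function and parameter names, the tolerance, and the pixel-count threshold are all assumptions for illustration.

```python
import numpy as np

def depth_layer_mask(left, right, offset, tolerance=10):
    """Overlay test for one depth layer: invert the right frame, shift it
    horizontally by `offset` pixels, and blend it 50/50 with the left frame.
    Where the two views agree, the blend lands near mid-grey (~127);
    contiguous grey regions mark content at the depth that offset encodes."""
    shifted = np.roll(255 - right, offset, axis=1)       # invert, then shift
    blend = (left.astype(np.int32) + shifted.astype(np.int32)) // 2
    return np.abs(blend - 127) <= tolerance              # True where "grey"

def motion_in_zone(left, right, zone_offsets, min_pixels=50):
    """Test only the offsets for the depth zone we care about (the lawn),
    so motion on the street behind never fires the alarm."""
    return any(depth_layer_mask(left, right, o).sum() >= min_pixels
               for o in zone_offsets)
```

In practice you would run this only over pixels already flagged by ordinary per-camera motion detection, which is what keeps it cheap: a handful of offsets over a handful of motion blobs.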

But the broader issue I want to write about is the generalized "synaesthesia" of the brain - how integral combining data from numerous inputs is to its function. To wit: Cyclopes fail. Nature does produce them; the condition is called cyclopia (I read about it the other night). And they don't make it. The Great Scientist tried this experiment long before the first human scientists even existed, and it didn't work. Nature gives large animals at least two eyes, and more to smaller animals like spiders. The great achievement of the brain is combining these two inputs into a unified experience.

Second, there are more than two inputs. I have been stumbling because my ears are sick: there is some kind of accelerometer in there that helps me balance and notes my movements. It aids my eyes in constructing the scene I am in.

Third, there is more than stereoscopic processing going on in depth detection. If I close one eye and look at my finger in front of my face, I can detect its depth by focusing on the finger (making the background blur) and then focusing on the background (making my finger blur).

Fourth, there are zones my brain deals with, three to be precise: 1) far away, where stereo vision can't differentiate between distant objects (a cloud looks as far away as the moon, as the sun, as the stars); 2) medium, where stereo vision works; and 3) close, where stereo vision breaks down and I rely more on focus, or a "holy shit" response is triggered - a bird flying three inches in front of my face will make me close my eyes and pull away.
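The three zones fall out of simple geometry: the disparity angle between two eyes shrinks roughly as baseline over distance, and past some distance it drops below what the visual system can resolve. Here's a toy sketch of that split - the ~6.5 cm baseline, the disparity threshold, and the arm's-reach cutoff are all assumed illustrative numbers, not measured values.

```python
def depth_cue_zone(distance_m, baseline_m=0.065, min_disparity_rad=3e-4):
    """Rough three-zone split, assuming a human-like ~6.5 cm eye baseline
    and an assumed resolvable-disparity threshold. For distant objects the
    disparity angle is approximately baseline / distance (radians)."""
    disparity = baseline_m / distance_m
    if disparity < min_disparity_rad:
        return "far"     # clouds, moon, stars: stereo can't tell them apart
    if distance_m < 0.25:
        return "close"   # inside arm's reach: focus cues / flinch response
    return "medium"      # the range where stereo vision does the work
```

With these assumed numbers the "far" zone starts around 200 meters out - which is why the cloud, the moon, and the stars all read as equally distant.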

The data structure describing the space we are observing is probabilistic. We go with the best model until new data overthrows that model, makes it improbable. Observe it in your own brain here:
You think one thing until new data updates your spatial model.

The fact that we get a good 3D sense of a scene from a 2D movie, like the video above, shows that the majority of our spatial perception comes from correlating data not between two separate still stereo frames, but between what we see now, what we saw a split second ago, and what we saw a split second before that. But I think the two eyes are how we train that; I don't think a cyclops would develop it as well. It's like "If I see in my right eye what I saw in my left eye a second ago, I must have moved a few inches to the left." The brain gets its core training from stereo vision.

I was able to construct 3D scenes that stereo vision could not model - it's easy, and their name is legion. One of the most fun is to exploit the correlation problem's weaknesses, causing the brain to make false correlations. Magic Eye pictures do that: your brain assumes a repeating pattern when there actually isn't one, and falsely correlates one tiny jellybean with another just like it seen by the other eye. They aren't actually the same, so they create the illusion of depth where there is none.

Here is a 3D Magic Eye image with the image-offset method applied to reveal its 3D secret. Original (larger here if you want to actually do it):
Offset:

Notice the contiguous grey area in the middle, corresponding to the field of depth represented by the offset you can see on the left. (I really love my Brother MFC-240, by the way. Great printer.)

So what you have here is a situation where, if you put the image and its inversion together with no offset, it's totally grey. But you also get grey at other offsets, so there are many possible spatial constructions. Your brain PICKS ONE AND ONLY ONE for the current spatial model, but you can have it "pop," like a Necker cube, into a different view. So what you're talking about is an Edward Scissorhands situation: you have a probability cloud, and each new observation trims that bush down into the shapes all around. So you really want to throw everything and the kitchen sink at that cloud - as many eyes as you can, past frames, accelerometer readings, echolocation, audio triangulation, anything you can think of. The wonderful thing about nature is how dual-use everything is: we think the bird is making a mating call, but that call also serves as a
1) self health check
2) echolocation environment update
3) alert to other males
4) test for predators while in an optimal state to get away,
etc.
If my robot doesn't tweet like R2-D2, I'll know the designers gave up a lot of good data on space and material hardness. (Hard things echo better.)

But anyway, it's all about the Magic Eye / Necker cube pop. The brain updates instantly; it doesn't take a bunch of time to tear down its existing probability model and build a new one. It's as if the secondary model was already built in the background, ready to go. Look at the picture above. The brain is a consistency-finding machine, good at getting rid of noise: it pays attention to the consistently grey areas and discards the noisy junk. So it must be with probability models. Consistent alternatives to the perceived model are computed in the background, but those with too many contradictions - those that can't be true - are discarded with the noise. It has Occam's razor features: it looks for the simplest models and discards the rest. Once the dominant model becomes more complex than an alternative, the alternative becomes dominant. It works off what it knows, but what it knows can be discarded at any time for a predefined alternative.
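That maintain-alternatives-in-the-background idea can be sketched as a tiny model-selection loop. This is my own hypothetical framing, not an established algorithm: keep every candidate model alive, drop the ones each new observation contradicts, and always rank the survivors simplest-first so the "pop" to a new dominant model costs nothing. The `consistent` and `complexity` callables are assumptions the caller supplies.

```python
def update_models(models, observation, consistent, complexity):
    """Keep a ranked set of candidate spatial models: alternatives are
    maintained in the background, any model the new observation contradicts
    is discarded with the noise, and the survivors are ordered simplest-first
    (Occam's razor), so survivors[0] is always the dominant model."""
    survivors = [m for m in models if consistent(m, observation)]
    survivors.sort(key=complexity)
    return survivors

# Toy usage: candidate depth offsets as "models"; an observation rules
# some out, and the simplest surviving offset becomes dominant.
candidates = [0, 3, 5, 9]
surviving = update_models(candidates, 5,
                          consistent=lambda m, obs: abs(m - obs) <= 2,
                          complexity=lambda m: m)
```

The point of the structure is exactly the instant pop: because the alternatives were never torn down, switching dominant models is just reading a different element of an already-sorted list.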

In other words, taking it back to the title, it's a case of "the devil you know versus the devil you don't." If you know they're both devils, then you already know something about both, though it's the one you know the most about that you claim to "know." These levels of knowledge, between the subconsciously known and the consciously known, seem to be a really important part of all thought.

Figuring out what a framework like this looks like in computational terms is one of the great and fun challenges for robot-vision folks, who I believe will be laying the foundations for real AI, simply because they are following the incremental development footsteps of the Great Scientist, who gave Her creations sight and sound a billion years before giving them language and abstract thought. Much smarter than trying to start at the final product, the human mind.

