The Robots Are Coming: Do You See What I See?

[Title image: C-3PO's head]

Getting machines to see and react to the things that we see and react to is quite rightly considered an important step along the path to true artificial intelligence.

However, as new research at the University of Toronto by Amir Rosenfeld, Richard Zemel, and John K. Tsotsos showed last month, visual recognition systems can be fooled, and indeed broken, simply by introducing strange variables into a familiar scene.

In their paper, The Elephant in the Room, Rosenfeld and his team highlight some of the failings of the object detection systems being developed today, by literally (and I do mean literally) introducing an elephant into a room to see how those systems deal with the new variable.

Answer: Not very well at all . . .
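To make that concrete, here is a minimal sketch of this style of experiment using an off-the-shelf detector from torchvision. To be clear, this is not the authors' exact setup: the model choice, the file names, and the paste coordinates are all assumptions for illustration.

```python
# A minimal sketch of the kind of experiment Rosenfeld et al. ran, using an
# off-the-shelf detector (NOT the authors' exact models or images).
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Load a pretrained Faster R-CNN object detector.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

def detect(image):
    """Run the detector and return (label, score) pairs above 0.5 confidence."""
    with torch.no_grad():
        output = model([to_tensor(image)])[0]
    return [(int(label), float(score))
            for label, score in zip(output["labels"], output["scores"])
            if score > 0.5]

room = Image.open("living_room.jpg")       # placeholder: a familiar indoor scene
elephant = Image.open("toy_elephant.jpg")  # placeholder: the out-of-context object

print("Before:", detect(room))

# Paste the elephant into the scene and run detection again. In the paper,
# this kind of transplant not only goes undetected at some locations, it can
# also disturb detections of objects far away from the pasted patch.
room.paste(elephant, (200, 150))           # paste position is arbitrary here
print("After: ", detect(room))
```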

Vision Illusion

You may have heard that human vision is not perfect, even if, unlike me, you don't need glasses and have 20/20 vision or better (yes, there is better!).

In fact, it turns out that a huge part of our vision is filled in by our imaginations.

Do me a favour: stop reading this for a few seconds and look around the room you're in now. I'll wait for you . . .

OK, notice anything strange?

No, of course you didn't; you saw exactly what you expected to see. But what if I told you that a good percentage, perhaps ten percent or more, of what you just looked at is not really there?

Look at the image below.

Cover your right eye and use your left eye to look at the dot on the right-hand side. If you're viewing this on a phone it may not be big enough; however, you can recreate the trick with a bit of paper and a pen.

OK, so now you are looking at the right dot with your left eye, making sure your right eye is covered. Slowly bring your head closer to the screen.

[Image: blind-spot test, a black dot on each side of a white page]

Pow!

Did you see that?

The dot on the left disappeared and was replaced by white paper! How can this be?

The dot obviously does not really disappear, and seeing white instead of black makes no sense either, because where is that input coming from?

The answer is quite simply that we have a blind spot, the patch of retina where the optic nerve exits the eye, and rather than leaving a grey band or splodge in its place, our rather helpful brains create the scene they think should be there. This helps give the illusion of an unbroken field of vision.

Note: for extra freakiness, put a finger where the dot should be and watch the top of it disappear!
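If you like a little maths with your freakiness, you can even predict where the dot should vanish. The blind spot sits roughly 15 degrees to the temporal side of your fixation point; the 10 cm dot separation below is just an assumed figure for a printed page.

```python
# Back-of-the-envelope check on the demo above: a dot drawn a known distance
# from the fixation dot should vanish at roughly d = separation / tan(15 deg).
import math

BLIND_SPOT_ANGLE_DEG = 15.0   # approximate angle of the optic disc from fixation
separation_cm = 10.0          # assumed distance between the two dots on the page

viewing_distance_cm = separation_cm / math.tan(math.radians(BLIND_SPOT_ANGLE_DEG))
print(f"The left dot should disappear at roughly {viewing_distance_cm:.0f} cm")
# -> roughly 37 cm, about where most people find the sweet spot
```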

Opticians For Robots

The first, and perhaps most obvious, application for machine sight is in self-driving cars. Google and Tesla, amongst others, are creating vehicles that will be able to drive us back from the bar in complete safety while we sit in the back seat, drunk as skunks.

However, before that can happen, their vision has to be at least as good as ours. In fact, we will hold robot cars to much higher standards than human drivers.

According to the UK road safety charity Brake, road crashes involving a driver with poor vision are estimated to cause 2,900 casualties and cost £33 million ($42 million) in the UK per year.

Let's face it: if just one self-driving car has a single accident because of poor vision, the entire programme will probably be scrapped. Therefore it is important that we get these systems right.

The problem is: how are we meant to give robots perfect vision when we don't have it ourselves?

Growing And Seeing

If you have ever hung out with a newborn baby for any length of time, you may have noticed that they have zero depth perception.

Whilst the baby is programmed to recognise its mother, her face must look very strange to the child: all of her features will appear to sit on a single plane. We know this from watching babies trying to grab mobiles placed above their cots; their tiny minds have not yet worked out the difference between things that are small and things that are far away.

Distinguishing one object from another, working out where one begins and the other ends, must also be an early challenge for a child that has no historical visual reference points to work with.

Of course, a child does eventually learn how to see, and how to use what it's seeing in a meaningful way. However, this is just a working model that allows the child to navigate and interact with the world. It is by no means perfect, as demonstrated by optical illusions and momentary (non-drug-induced) hallucinations.

Viewing Through A Child's Eye

If we are to teach robots to see, then I believe that we have to do it in the same way we teach children to see.

In some respects, the scientists have got this right. They take a system, give it clever code and a camera, then show it objects and tell it what those objects are.

The very first systems that did this were crude and could only recognise objects from a given angle. Show such a system a chair, then rotate that chair slightly, and recognition would fail completely.

Today's visual recognition systems are much better. Show one a chair and not only can you move the chair around and have it still be recognised; you can also show it chairs with different designs and it will still recognise them as chairs.
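You can try this yourself with a pretrained classifier. Below is a rough sketch of the "rotate the chair" test using torchvision's ResNet-18; "chair.jpg" is a placeholder image. The point is simply that modern models usually survive moderate rotations that would have broken the early systems.

```python
# Sketch: classify the same image at several rotations and compare the labels.
import torch
import torchvision
from torchvision import transforms
from PIL import Image

model = torchvision.models.resnet18(pretrained=True)
model.eval()

# Standard ImageNet preprocessing for torchvision classifiers.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def top_class(image):
    """Return the index of the most likely ImageNet class."""
    with torch.no_grad():
        logits = model(preprocess(image).unsqueeze(0))
    return int(logits.argmax())

chair = Image.open("chair.jpg")  # placeholder image
for angle in (0, 15, 30, 45):
    print(angle, "degrees ->", top_class(chair.rotate(angle)))
```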

However, as Rosenfeld and his team showed in their study, introducing a toy elephant into a room scene that the computer had already viewed completely threw the system: it started to un-recognise objects that it had previously identified.

My theory is that this is happening because, at the moment, we are teaching computers to see in a very one-dimensional way. We show them an object and then tell them what that object is.

They, in turn, take various grid reference points as they look at the object and store them in their vast memories. They are then shown other versions of the object, so that they can build an overall template of what the thing is.
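The crudest version of this "grid reference" idea is literal template matching: slide a stored picture of the object over the scene and score the overlap. Here is a minimal sketch with OpenCV (the file names are placeholders). It fails the moment the object is rotated or redesigned, which is exactly why modern systems learn more flexible templates from many examples.

```python
# Literal template matching: the simplest possible "stored template" recogniser.
import cv2

scene = cv2.imread("room.jpg", cv2.IMREAD_GRAYSCALE)           # placeholder
template = cv2.imread("chair_template.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder

# Normalised cross-correlation: a score of 1.0 means a pixel-perfect match.
scores = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
_, best_score, _, best_location = cv2.minMaxLoc(scores)

print(f"Best match {best_score:.2f} at {best_location}")
```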

However, this is not how a child learns to see. A child learns using touch, taste and the reactions of adults.

For instance, I bet you have a good idea what the screen you're looking at now tastes like. Look around you, take in the objects you're surrounded by and imagine what they taste like.

I'll bet you a penny to a pound that you'd be right at least 95% of the time, maybe more. Yet you don't remember tasting any of these objects, right? (I just licked the screen on my phone and the protective case, and they pretty much tasted as I expected.)

This is because, as a child, you went through an oral phase whereby almost any new object you came across went straight into your mouth.

Then, using touch and radiated heat, you started to understand the difference between hard and soft, sharp and blunt, hot and cold.

The reactions of your parents also helped you. Most children know the word "hot" by the time they can walk, because their parents have told them, and demonstrated, what will happen if they touch the oven while it's on.

Lastly, you have a specialised area (two, actually) in the brain called the fusiform gyrus, which has evolved specifically to recognise faces. This helped you to relate to certain objects in subtle ways: for instance, a child will be drawn more to an object which looks like a face than to one that doesn't.
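Machine vision has its own crude answer to the fusiform gyrus: detectors trained specifically on faces. Here is a minimal sketch using OpenCV's bundled Haar cascade; "photo.jpg" is a placeholder.

```python
# Detect faces with the frontal-face Haar cascade that ships with OpenCV.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

image = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder image
faces = cascade.detectMultiScale(image, scaleFactor=1.1, minNeighbors=5)
print(f"Found {len(faces)} face(s):", list(faces))
```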

Gazing Into An A.I. Future

So what does all this mean for our robot cousins?

Quite simply, we have to find a way to mimic these early stages of childhood, specifically in the areas of touch and taste, to go along with the language we are teaching them.
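What might that look like in code? Purely as an illustration, and with every name here invented for the sketch, a "multimodal" observation might carry one feature set per sense, with the simplest possible fusion being to learn from all of them at once:

```python
# An illustrative toy, not a real training pipeline: one observation per
# object, carrying a feature set for each sense plus the adult's label.
from dataclasses import dataclass

@dataclass
class Observation:
    label: str     # what the adult says the object is ("cup", "oven"...)
    vision: list   # image features
    touch: list    # e.g. hardness, temperature, sharpness
    taste: list    # e.g. sweetness, bitterness (the oral phase!)

def fuse(obs: Observation) -> list:
    """Concatenate the senses into one feature vector, the simplest fusion."""
    return obs.vision + obs.touch + obs.taste

cup = Observation("cup", vision=[0.9, 0.1], touch=[0.8, 0.3, 0.0], taste=[0.0, 0.1])
print(cup.label, fuse(cup))
```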

Clearly, we don't want to programme our own visual failings into the final model; however, perhaps understanding the flawed way in which we see would be helpful to an artificial intelligence trying to make sense of our world.

For now, though, we will keep teaching our systems to see by using a methodical layering and listing technique, which will work fine for the most part. Just make sure you don't show any elephants to our robot cousins; they're just not quite ready to handle it yet.

Sources & further reading

The Elephant in the Room - PDF of the study

Machine Learning Confronts the Elephant in the Room - Quanta Magazine

Discussion of the Elephant in the Room research - Reddit science thread

Road traffic statistics in relation to eyesight - Brake charity

Blind spot demonstration - Google search

Where do you think visual recognition systems will go in the future? Do you have any ideas on how they could be made better? Perhaps you work in that industry and understand the complexities of the problem? As ever, let me know below!

Title image: Jens Johnsson on Unsplash

Cryptogee


Meet me at SteemFest 2018 in Kraków

Comments





Thank you for this article! In the eyes of technology, the entire natural human body will always be considered a flaw...

Quite creepy when you imagine how it will look some day, when prostheses are no longer "emergency parts" for when you lose an arm or a leg, but natural body enhancements that people deliberately choose to implement through hyper-technological surgery. It is only a question of time until this becomes a ubiquitous trend, especially in a world where science is continuously working on extending the human life span...

I'm living in the third world, but having this kind of expectation makes the future exciting for us all... smile

If there is any way to make our lives better, it has to involve introducing robots to handle various aspects of our daily activities; if they are well programmed, they can do much better than humans.

My contribution to the post is as follows: just one means of recognition should not be used to program the robot; it must be a combination of various means of recognition, such as sound, touch, taste, smell and the like (just like the five senses in man). I think this would make it work more accurately.

It's been a long time.....

Interesting topic. Does it mean that AI has come to a point at which a robot can be "raised" as a child? It is both a scary and an exciting prospect.


Pretty good optical illusion!