Tuesday, July 29, 2014

On internal models, consequence engines and Popperian creatures

So. We've been busy in the lab the last few months. Really exciting. Let me explain.

For a couple of years I've been thinking about robots with internal models. Not internal models in the classical control-theory sense, but simulation-based models; robots with a simulation of themselves and their environment inside themselves, where that environment could contain other robots or, more generally, dynamic actors. The robot would have, inside itself, a simulation of itself and the other things, including robots, in its environment. It takes a bit of getting your head round. But I'm convinced that this kind of internal model opens up all kinds of possibilities. Robots that can be safe, for instance, in unknown or unpredictable environments. Robots that can be ethical. Robots that are self-aware. And robots with artificial theory of mind.

I'd written and talked about these ideas but, until now, not had a chance to test them with real robots. But, between January and June the swarm robotics group was joined by Christian Blum, a PhD student from the cognitive robotics research group of the Humboldt University of Berlin. I suggested Christian work on an implementation on our e-puck robots and happily he was up for the challenge. And he succeeded. Christian, supported by my post-doc Research Fellow Wenguo, implemented what we call a Consequence Engine, running in real-time, on the e-puck robot.

Here is a block diagram. The idea is that for each possible next action of the robot, it simulates what would happen if the robot were to execute that action for real. This is the loop shown on the left. Then, the consequences of each of those next possible actions are evaluated. Those actions that have 'bad' consequences, for either the robot or other actors in its environment, are then inhibited.

This short summary hides a lot of detail. But let me elaborate on two aspects. First, what do I mean by 'bad'? Well, it depends on what capability we are trying to give the robot. If we're making a safer robot, 'bad' means 'unsafe'; if we're trying to build an ethical robot, 'bad' would mean something different - think of Asimov's laws of robotics. Or 'bad' might simply mean 'not allowed' if we're building a robot whose behaviours are constrained by standards, like ISO 13482:2014.

Second, notice that the consequence engine is not controlling the robot. Instead it runs in parallel. Acting as a 'governor', it links with the robot controller's action selection mechanism, inhibiting those actions evaluated as somehow bad. Importantly the consequence engine doesn't tell the robot what to do, it tells it what not to do.
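
To make this concrete, here is a minimal sketch, in Python, of the generate-simulate-evaluate-inhibit loop described above. It is purely illustrative: the function names (simulate, evaluate_consequences, select_action) are placeholders for this post, not the actual e-puck implementation.

```python
# Illustrative sketch of a consequence-engine 'governor' loop.
# All names here (simulate, evaluate_consequences, select_action) are
# placeholders, not the real e-puck code.

def consequence_engine(robot_state, world_state, candidate_actions,
                       simulate, evaluate_consequences):
    """Return the subset of candidate actions that are not 'bad'.

    simulate(robot_state, world_state, action) -> predicted outcome
    evaluate_consequences(outcome) -> True if the outcome is 'bad'
    (unsafe, unethical or not allowed, depending on the application).
    """
    permitted = []
    for action in candidate_actions:
        outcome = simulate(robot_state, world_state, action)
        if not evaluate_consequences(outcome):
            permitted.append(action)
    return permitted


def control_cycle(robot, world, candidate_actions, simulate, evaluate_consequences):
    # The consequence engine only inhibits actions; it never chooses one.
    # Action selection stays with the robot's ordinary controller.
    permitted = consequence_engine(robot.state, world.state, candidate_actions,
                                   simulate, evaluate_consequences)
    return robot.controller.select_action(permitted)
```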

Running the open source 2D robot simulator Stage as its internal simulator, our consequence engine runs at 2Hz, so every half a second it is able to simulate about 30 next possible actions and their consequences. The simulation budget allows us to simulate ahead around 70cm of motion for each of those next possible actions. In fact Stage is running on a laptop, linked to the robot over the fast WiFi LAN, but logically it is inside the robot. What's important here is the proof of principle.

Dan Dennett, in his remarkable book Darwin's Dangerous Idea, describes the Tower of Generate-and-Test; a conceptual model for the evolution of intelligence that has become known as Dennett's Tower.

In a nutshell Dennett's tower is a set of conceptual creatures, each one of which is successively more capable of reacting to (and hence surviving in) the world through having more sophisticated strategies for 'generating and testing' hypotheses about how to behave. Read chapter 13 of Darwin's Dangerous Idea for the full account, but there are some good précis to be found on the web; here's one. The first three storeys of Dennett's tower, starting on the ground floor, are:
  • Darwinian creatures have only natural selection as the generate and test mechanism, so mutation and selection is the only way that Darwinian creatures can adapt - individuals cannot.
  • Skinnerian creatures can learn but only by literally generating and testing all different possible actions then reinforcing the successful behaviour (which is ok providing you don't get eaten while testing a bad course of action).
  • Popperian creatures have the additional ability to internalise the possible actions so that some (the bad ones) are discarded before they are tried out for real.
Like the Tower of Hanoi, each successive storey is smaller - a sub-set of the storey below - thus all Skinnerian creatures are Darwinian, but only a sub-set of Darwinian creatures are Skinnerian, and so on.

Our e-puck robot, with its consequence engine capable of generating and testing next possible actions, is an artificial Popperian Creature: a working model for studying this important kind of intelligence.

In my next blog post, I'll outline some of our experimental results.

Acknowledgements:
I am hugely grateful to Christian Blum who brilliantly implemented the architecture outlined here, and conducted experimental work. Christian was supported by Dr Wenguo Liu, with his deep knowledge of the e-puck, and our experimental infrastructure.


Saturday, July 19, 2014

Estimating the energy cost of evolution

Want to create human-equivalent AI? Well, broadly speaking, there are 3 approaches open to you: design it, reverse-engineer it or evolve it. The third of these - artificial evolution - is attractive because it sidesteps the troublesome problem of having to understand how human intelligence works. It's a black box approach: create the initial conditions then let the blind watchmaker of artificial evolution do the heavy lifting. This approach has some traction. For instance David Chalmers, in his philosophical analysis of the technological singularity, writes "if we produce an AI by artificial evolution, it is likely that soon after we will be able to improve the evolutionary algorithm and extend the evolutionary process, leading to AI+". And since we can already produce simple AI by artificial evolution, then all that's needed is to 'improve the evolutionary algorithm'. Hmm. If only it were that straightforward.

About six months ago I asked myself (and anyone else who would listen): ok, but even if we had the right algorithm, what would be the energy cost of artificially evolving human-equivalent AI? My hunch was that the energy cost would be colossal; so great perhaps as to rule out the evolutionary approach altogether. That thinking, and some research, resulted in me submitting a paper to ALIFE 14. Here is the abstract:
This short discussion paper sets out to explore the question: what is the energy cost of evolving complex artificial life? The paper takes an unconventional approach by first estimating the energy cost of natural evolution and, in particular, the species Homo Sapiens Sapiens. The paper argues that such an estimate has value because it forces us to think about the energy costs of co-evolution, and hence the energy costs of evolving complexity. Furthermore, an analysis of the real energy costs of evolving virtual creatures in a virtual environment, leads the paper to suggest an artificial life equivalent of Kleiber's law - relating neural and synaptic complexity (instead of mass) to computational energy cost (instead of real energy consumption). An underlying motivation for this paper is to counter the view that artificial evolution will facilitate the technological singularity, by arguing that the energy costs are likely to be prohibitively high. The paper concludes by arguing that the huge energy cost is not the only problem. In addition we will require a new approach to artificial evolution in which we construct complex scaffolds of co-evolving artificial creatures and ecosystems.
The full proceedings of ALIFE 14 have now been published online, and my paper Estimating the Energy Cost of (Artificial) Evolution can be downloaded here.

And here's a very short (30 second) video introduction on YouTube:


My conclusion? Well I reckon that the computational energy cost of simulating and fitness testing something with an artificial neural and synaptic complexity equivalent to humans could be around 10^14 kJ, or 0.1 EJ. But evolution requires many generations and many individuals per generation, and - as I argue in the paper - many co-evolving artificial species. Also taking account of the fact that many evolutionary runs will fail (to produce smart AI), the whole process would almost certainly need to be re-run from scratch many times over. If multiplying those population sizes, generations, species and re-runs gives us (very optimistically) a factor of 1,000,000 - then the total energy cost would be 100,000 EJ. In 2010 total human energy use was about 539 EJ. So, artificially evolving human-equivalent AI would need the whole human energy generation output for about 200 years.
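
For anyone who wants to check the arithmetic, here it is as a few lines of Python. The numbers are just the rough estimates quoted above, nothing more precise:

```python
# Back-of-envelope check of the numbers quoted above.
cost_per_run_kj = 1e14                           # ~10^14 kJ to simulate and fitness-test one human-equivalent AI
cost_per_run_ej = cost_per_run_kj * 1e3 / 1e18   # kJ -> J -> EJ
print(cost_per_run_ej)                           # 0.1 EJ

multiplier = 1_000_000                           # populations x generations x species x re-runs (very optimistic)
total_ej = cost_per_run_ej * multiplier
print(total_ej)                                  # 100,000 EJ

human_energy_use_2010_ej = 539                   # total human energy use in 2010
print(total_ej / human_energy_use_2010_ej)       # ~185, i.e. roughly 200 years of total human energy output
```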


The full paper reference:

Winfield AFT, Estimating the Energy Cost of (Artificial) Evolution, pp 872-875 in Proceedings of the Fourteenth International Conference on the Synthesis and Simulation of Living Systems, Eds. H Sayama, J Rieffel, S Risi, R Doursat and H Lipson, MIT Press, 2014.


Saturday, June 28, 2014

Your robot doggie could really be pleased to see you

There have been several stories in the last few weeks about emotional robots; robots that feel. Some are suggesting that this is the next big thing in robotics. It's something I wrote about in this blog post seven years ago: could a robot have feelings?

My position on this question has always been pretty straightforward. It's easy to make robots that behave as if they have feelings, but quite a different matter to make robots that really have feelings. 

But now I'm not so sure. There are I think two major problems with this apparently clear distinction between as if and really have.

The first is what do we mean by really have feelings. I'm reminded that I once said to a radio interviewer who asked me if a robot can have feelings: if you can tell me what feelings are, I'll tell you whether a robot can have them or not. Our instinct (feeling even) is that feelings are something to do with hormones, the messy and complicated chemistry that too often seems to get in the way of our lives. Thinking, on the other hand, we feel to be quite different; the cool clean process of neurons firing, brains working smoothly. Like computers. Of course this instinct, this dualism, is quite wrong. We now know, for instance, that damage to the emotional centre of the brain can lead to an inability to make decisions. This false dualism has led I think to the trope of the cold, calculating unfeeling robot.

I think there is also some unhelpful biological essentialism at work here. We prefer it to be true that only biological things can have feelings. But which biological things? Single celled organisms? No, they don't have feelings. Why not? Because they are too simple. Ah, so only complex biological things have feelings. Ok, what about sharks or crocodiles; they're complex biological things; do they have feelings? Well, basic feelings like hunger, but not sophisticated feelings, like love or regret. Ah, mammals then. But which ones? Well elephants seem to mourn their dead. And dogs of course. They have a rich spectrum of emotions. Ok, but how do we know? Well because of the way they behave; your dog behaves as if he's pleased to see you because he really is pleased to see you. And of course they have the same body chemistry as us, and since our feelings are real* so must theirs be.

And this brings me to the second problem. The question of as if. I've written before that when we (roboticists) talk about a robot being intelligent, what we mean is a robot that behaves as if it is intelligent. In other words an intelligent robot is not really intelligent, it is an imitation of intelligence. But for a moment let's not think about artificial intelligence, but artificial flight. Aircraft are, in some sense, an imitation of bird flight. And some recent flapping wing flying robots are clearly a better imitation - a higher fidelity simulation - than fixed-wing aircraft. But it would be absurd to argue that an aircraft, or a flapping wing robot, is not really flying. So how do we escape this logical fix? It's simple. We just have to accept that an artefact, in this case an aircraft or flying robot, is both an emulation of bird flight and really flying. In other words an artificial thing can be both behaving as if it has some property of natural systems and really demonstrating that property. A robot can be behaving as if it is intelligent and - at the same time - really be intelligent. Are there categories of properties for which this would not be true? Like feelings..? I used to think so, but I've changed my mind.

I'm now convinced that we could, eventually, build a robot that has feelings. But not by simply programming behaviours so that the robot behaves as if it has feelings. Or by having to invent some exotic chemistry that emulates bio-chemical hormonal systems. I think the key is robots with self-models. Robots that have simulations of themselves inside themselves. If a robot is capable of reasoning about the consequences of its, or others', actions on itself, then it seems to me it could demonstrate something like regret (about being switched off, for instance). A robot with a self-model has the computational machinery to also model the consequences of actions on conspecifics - other robots. It would have an artificial Theory of Mind and that, I think, is a prerequisite for empathy. Importantly we would also program the robot to model heterospecifics, in particular humans, because we absolutely require empathic robots to be empathic towards humans (and, I would argue, animals in general).

So, how would this robot have feelings? It would, I believe, have feelings by virtue of being able to reason about the consequences of actions, both its own and others' actions, on itself and others. This reasoning would lead to it making decisions about how to act, and behave, which would demonstrate feelings, like regret, guilt, pleasure or even love, with an authenticity which would make it impossible to argue that it doesn't really have feelings.

So your robot doggie could really be pleased to see you.

*except when they're not.

Thursday, June 26, 2014

The Next Big Things in Robotics

Last week I attended the launch event for a new NESTA publication called Our work here is done: Visions of a Robot Economy. It was an interesting event, and not at all what I was expecting. In fact I didn't know what to expect. Even though I contributed a chapter to the book I had no idea, until last week, who else had written for it - or the scope of those contributions and the book as a whole. I was very pleasantly surprised. Firstly because it was great to find myself in such good company: economists, philosophers, historians, (ex-) financiers and all round deep thinkers. And second because the volume faces up to some of the difficult societal questions raised by second wave robotics.

The panel discussion was excellent, and the response by economist Carlota Perez was engaging and thought provoking - check here for the Storified tweets and pictures. Perhaps the thing that surprised me the most, given the serious economists on the panel (FT, The Economist) was that the panel ended up agreeing that the Robot Economy will necessitate something like a Living Wage. Music to this socialist's ears.

In my contribution: The Next Big Things in Robotics (pages 38-44) I do a bit of near-future gazing and suggest four aspects of robotics that will, I think, be huge. They are:
  • Wearable Robotics
  • Immersive Teleoperated Robots
  • Driverless Cars
  • Soft Robotics
To see why I chose these - and to read the other great articles - please download the book. Let me know if you disagree with my choices, or want to suggest other Next Big Things in robotics. I end my chapter with a section called What's not coming soon: super intelligent robots:
"My predicted things that will be really big in robotics don't need to be super intelligent. Wearable robots will need advanced adaptive (and very safe and reliable) control systems, as well as advanced neural–electronics interfaces, and these are coming. But ultimately it’s the human wearing the robot who is in charge. The same is true for teleoperated robots: again, greater low–level intelligence is needed, so that the robot can operate autonomously some of the time but ask for help when it can’t figure out what to do next. But the high–level intelligence remains with the human operator and – with advanced immersive interfaces as I have suggested – human and robot work together seamlessly. The most autonomous of the next big things in robotics is the driverless car, but again the car doesn't need to be very smart. You don't need to debate philosophy with your car – just trust it to take you safely from A to B."


Related blog posts:
New Robotics and New Opportunities
Soft Robotics in Space
Google robot car: Great but proving the AI is safe is the real challenge
Why robots will not be smarter than humans by 2029

Friday, February 28, 2014

Why robots will not be smarter than humans by 2029

In the last few days we've seen a spate of headlines like 2029: the year when robots will have the power to outsmart their makers, all occasioned by an Observer interview with Google's newest director of engineering Ray Kurzweil.

Much as I respect Kurzweil's achievements as an inventor, I think he is profoundly wrong. Of course I can understand why he would like it to be so - he would like to live long enough to see this particular prediction come to pass. But optimism doesn't make for sound predictions. Here are several reasons that robots will not be smarter than humans by 2029.

  • What exactly does as-smart-as-humans mean? Intelligence is very hard to pin down. One thing we do know about intelligence is that it is not one thing that humans or animals have more or less of. Humans have several different kinds of intelligence - all of which combine to make us human. Analytical or logical intelligence of course - the sort that makes you good at IQ tests. But emotional intelligence is just as important, especially (and oddly) for decision making. So is social intelligence - the ability to intuit others' beliefs, and to empathise. 
  • Human intelligence is embodied. As Rolf Pfeifer and Josh Bongard explain in their outstanding book you can't have one without the other. The old Cartesian dualism - the dogma that robot bodies (the hardware) and mind (the software) are distinct and separable - is wrong and deeply unhelpful. We now understand that the hardware and software have to be co-designed. But we really don't understand how to do this - none of our engineering paradigms fit. A whole new approach needs to be invented.
  • As-smart-as-humans probably doesn't mean as-smart-as newborn babies, or even two year old infants. They probably mean somehow-comparable-in-intelligence-to adult humans. But an awful lot happens between birth and adulthood. And the Kurzweilians probably also mean as-smart-as-well-educated-humans. But of course this requires both development - a lot of which somehow happens automatically - and a great deal of nurture. Again we are only just beginning to understand the problem, and developmental robotics - if you'll forgive the pun - is still in its infancy.
  • Moore's Law will not help. Building human-equivalent robot intelligence needs far more than just lots of computing power. It will certainly need computing power, but that's not all. It's like saying that all you need to build a cathedral is loads of marble. You certainly do need large quantities of marble - the raw material - but without (at least) two other things: the design for a cathedral, and/or the knowhow of how to realise that design - there will be no cathedral. The same is true for human-equivalent robot intelligence. 
  • The hard problem of learning and the even harder problem of consciousness. (I'll concede that a robot as smart as a human doesn't have to be conscious - a philosophers-zombie-bot would do just fine.) But the human ability to learn, then generalise that learning and apply it to completely different problems is fundamental, and remains an elusive goal for robotics and AI. In general this is called Artificial General Intelligence, which remains as controversial as it is unsolved.

These are the reasons I can be confident in asserting that robots will not be smarter than humans within 15 years. It's not just that building robots as smart as humans is a very hard problem. We have only recently started to understand how hard it is well enough to know that whole new theories (of intelligence, emergence, embodied cognition and development, for instance) will be needed, as well as new engineering paradigms. Even if we had solved these problems and a present day Noonien Soong had already built a robot with the potential for human equivalent intelligence - it still might not have enough time to develop adult-equivalent intelligence by 2029.

That thought leads me to another reason that it's unlikely to happen so soon. There is - to the best of my knowledge - no very-large-scale multidisciplinary research project addressing, in a coordinated way, all of the difficult problems I have outlined here. The irony is that there might have been. The project was called Robot Companions, it made it to the EU FET 10-year Flagship project shortlist but was not funded.



Saturday, February 22, 2014

What does it mean to have giants like Google, Apple and Amazon investing in robotics?

This was the latest question posed to the Robotics by Invitation panel on Robohub. Here, reposted, is my answer.

Judging by the levels of media coverage and frenzied speculation that have followed each acquisition, the short answer to what does it mean is: endless press exposure. I almost wrote ‘priceless exposure’ but then these are companies with very deep pockets; nevertheless the advertising value equivalent must be very high indeed. The coverage really illustrates the fact that these companies have achieved celebrity status. They are the Justin Biebers of the corporate world. Whatever they do, whether it is truly significant or not, is met with punditry and analysis about what it means. A good example is Google’s recent acquisition of British company DeepMind. In other words: large AI Company buys small AI Company. Large companies buy small companies all the time but mostly they don’t make prime time news. It’s the Bieberisation of the corporate world.
But the question is about robotics, and to address it in more detail I think we need to think about the giants separately.
Take Amazon. We think of Amazon as an Internet company, but the web is just its shop window. Behind that shop window is a huge logistics operation with giant warehouses – Amazon’s distribution centres – so no one should be at all surprised by their acquisition of brilliant warehouse automation company Kiva Systems. Amazon’s recent stunt with the ‘delivery drone’ was I think just that – a stunt. Great press. But I wouldn’t be at all surprised to see more acquisitions toward further automation of Amazon’s distribution and delivery chain.
Apple is equally unsurprising. They are a manufacturing company with a justifiable reputation for super high quality products. As an electronics engineer who started his career by taking wirelesses and gramophones apart as a boy, I’m fascinated by the tear-downs that invariably follow each new product release. It’s obvious that each new generation of Apple devices is harder to manufacture than the last. Precision products need precision manufacture and it is no surprise that Apple is investing heavily in the machines needed to make its products.
Google is perhaps the least obvious candidate to invest in robotics. You could of course take the view that a company with more money than God can make whatever acquisitions it likes without needing a reason – that these are vanity acquisitions. But I don’t think that’s the case. I think Google has its eyes on the long term. It is an Internet company and the undisputed ruler of the Internet of Information. But computers are no longer the only things connected to the Internet. Real world devices are increasingly networked – the so-called Internet of Things. I think Google doesn’t want to be usurped by a new super company that emerges as the Google of real-world stuff. It’s not quite sure how the transition to the future Internet of Everything will pan out, but figures that mobile robots – as well as smart environments – will feature heavily in that future. I think Google is right. I think it’s buying into robotics because it wants to be a leader and shape the future of the Internet of Everything.


Do please read the other panelists' answers - all interesting, and different! 

Saturday, December 07, 2013

Soft Robotics in Space

Space robotics is understandably conservative. When the cost of putting a robot on a planet, moon or asteroid runs into billions we need to be sure the technology will work. And with very long project lifetimes - spanning decades from engineering design to on-planet robot exploration - it's a long hard road from the research lab to the real off-world use for new advances in robotics.

This context was very much in mind when I gave a talk on Advanced Robotics for Space at the Appleton Space Conference last week. I used this great opportunity to outline a few examples of new research directions in robotics for the European space community, and suggest how these could benefit future planetary robots. I had just 20 minutes, so I couldn't do much more than show a few video clips. The four new directions I highlighted are:
  1. Soft Robotics: soft actuation and soft sensing
  2. Robots with Internal Models, for self-repair
  3. Self-assembling swarm robots, for adaptive/evolvable morphology
  4. Autonomous 3D collective robot construction
In this post I want to talk about just the first of these: soft robotics, and why I think we should seriously think about soft robotics in space. Soft robotics - as the name implies - is concerned with making robots soft and compliant. It's a new discipline which already has its own journal, but not yet a Wikipedia page. Soft robots would be soft on the inside as well as the outside - so even the fur-covered Paro robot is not a soft robot. Soft robotics research is about developing new soft, smart materials for both actuation and sensing (ideally within the same material). Soft robots would have the huge advantage over conventional stiff metal and plastic robots of being light and, well, soft. For robots designed to interact with humans that's obviously a huge advantage because it makes the robot intrinsically much safer.

Soft robotics research is still at the exploratory stage, so there are not yet preferred materials and approaches. In our lab we are exploring several avenues: one is electroactive polymers (EAPs) for artificial muscles; another is the bio-mimetic 3D printed flexible artificial whisker. Another approach makes use of shape memory alloys to actuate octopus-like limbs: here is a very nice YouTube movie from the EU OCTOPUS project. And perhaps one of the most unlikely but very promising approaches: exploiting fluid-solid phase changes in ground coffee to make a soft gripper: the Jaeger-Lipson coffee balloon gripper.

Let me elaborate a little more on the coffee balloon gripper. It's based on the simple observation that when you buy vacuum-packed ground coffee the pack is completely solid, yet as soon as you cut open the pack and release the vacuum the ground coffee returns to its flowing fluid state. Heinrich Jaeger, Hod Lipson and co-workers put ground coffee into a latex balloon then, by controlling the vacuum via a pump, they demonstrate a gripper able to safely pick up and hold more or less any object. Here is a YouTube video showing this remarkable ability.

Almost any planetary exploration robot is likely to need a gripper to pick up or collect rock samples for analysis or collection (for return to Earth). Conventional robot grippers are complex mechanical devices that need very precise control in order to reliably pick up irregularly shaped and sized objects. That control is mechanically and computationally expensive, and problematical because of time delays if it has to be performed remotely from Earth. Something like the Jaeger-Lipson coffee balloon gripper would - I think - provide a much better solution. This soft gripper avoids the hard control and computation because the soft material adapts itself to the thing it is gripping; it's a great example of what we call morphological computation.

The second example I suggested is inspired by work in our lab on bio-inspired touch sensing. Colleagues have developed a device called TACTIP - a soft flexible touch sensor which provides robots (or robot fingers) with very sensitive touch sensing, capable of sensing both shape and texture. Importantly the sensing is done inside TACTIP, so the outside surface of the sensor can sustain damage without loss of sensing. Here is a very nice YouTube report on the TACTIP project.

It's easy to see that giving planetary robots touch sensing could be useful, but there's another possibility I outlined: the potential to allow Earth scientists to feel what the robot's sensor is feeling. PhD student Callum Roke and his co-workers developed a system based on TACTIP for what we call remote tele-haptics. Here is a video clip demonstrating the idea:



Imagine being able to run your fingers across the surface of Mars, or directly feel the texture of a piece of asteroid rock without actually being there.

Tuesday, November 26, 2013

Noisy imitation speeds up group learning

Broadly speaking there are two kinds of learning: individual learning and social learning. Individual learning means learning something entirely on your own, without reference to anyone else who might have learned the same thing before. The flip side of individual learning is social learning, which means learning from someone else. We humans are pretty good at both individual and social learning although we very rarely have to truly work something out from first principles. Most of what we learn, we learn from teachers, parents, grandparents and countless others. We learn everything from how to make chicken soup to flying an aeroplane from watching others who already know the recipe (or wrote it down), or have mastered the skill. For modern humans I reckon it’s pretty hard to think of anything we have truly learned, on our own; maybe learning to control our own bodies as babies, leading to crawling and walking are candidates for individual learning (although as babies we are surrounded by others who already know how to walk – would we walk at all if everyone else got around on all fours?). Learning to ride a bicycle is perhaps also one of those things no-one can really teach you – although it would be interesting to compare someone who has never seen a bicycle, or anyone riding one, in their lives with those (most of us) who see others riding bicycles long before climbing on one ourselves.

In robotics we are very interested in both kinds of learning, and methods for programming robots that can learn are well known. A method for individual learning is called reinforcement learning (RL). It’s a laborious process in which the robot tries out lots and lots of actions and gets feedback on whether each action helps or hinders the robot in getting closer to its goal – actions that help/hinder are re/de-inforced so the robot is more/less likely to try them again; it’s a bit like shouting “warm, hot, cold, colder…” in a hide-and-seek game. It’s fair to say that RL in robotics is pretty slow; robots are not good individual learners, but that's because, in general, they have no prior knowledge. As a fair comparison think of how long it would take you to learn how to make fire from first principles if you had no idea that getting something hot may, if you have the right materials and are persistent, create fire, or that rubbing things together can make them hot. Roboticists are also very interested in developing robots that can learn socially, especially by imitation. Robots that you can program by showing them what to do (called programming by demonstration) clearly have a big advantage over robots that have to be explicitly programmed for each new skill.
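
As a rough illustration only, here is what a minimal tabular reinforcement learner (Q-learning) looks like in Python. This is a generic textbook sketch under assumed interfaces (env.reset, env.step, env.actions); it is not the learning algorithm used in the experiments described below.

```python
import random

def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Minimal tabular Q-learning sketch.

    Assumed interfaces (placeholders, not a real library):
      env.reset() -> state
      env.step(state, action) -> (next_state, reward, done)
      env.actions -> list of possible actions
    """
    Q = {}  # (state, action) -> estimated value
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Occasionally explore; otherwise pick the best-known action.
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: Q.get((state, a), 0.0))
            next_state, reward, done = env.step(state, action)
            # Reinforce (or de-inforce) the tried action according to its outcome.
            best_next = max(Q.get((next_state, a), 0.0) for a in env.actions)
            old = Q.get((state, action), 0.0)
            Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
    return Q
```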

Within the artificial culture project PhD student (now Dr) Mehmet Erbas developed a new way of combining social learning by imitation and individual reinforcement learning, and the paper setting out the method together with results from simulation and real robots has been published in the journal Adaptive Behavior. Let me explain the experiments with real robots, and what we have learned from them.

Here's our experiment. We have two robots - called e-pucks. The inset shows a closeup. Each robot has its own compartment and must - using individual (reinforcement) learning - learn how to navigate from the top right hand corner, to the bottom left hand corner of its compartment. Learning this way is slow, taking hours. But in this experiment the robots also have the ability to learn socially, by watching each other. Every so often one of the robots will stop its individual learning and drive itself out of its own compartment, to the small opening at the bottom left of the other compartment. There it will stop and simply watch the other robot while it is learning, for a few minutes. Using a movement imitation algorithm the watching robot will (socially) learn a fragment of what the other robot is doing, then combine this knowledge into what it is individually learning. The robot then runs back to its own compartment and resumes its individual learning. We call the combination of social and individual learning 'imitation enhanced learning'.
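
In outline - and only as a hedged sketch, with placeholder method names rather than the code from the paper - the imitation enhanced learning loop looks something like this:

```python
# Hedged sketch of 'imitation enhanced learning': individual reinforcement
# learning interleaved with occasional social learning by imitation.
# The method names (learn_individually, observe_and_imitate, merge_fragment)
# are placeholders, not the actual implementation.

def imitation_enhanced_learning(robot, other_robot, cycles=100, watch_every=10):
    for cycle in range(cycles):
        # Most of the time: ordinary individual (reinforcement) learning
        # inside the robot's own compartment.
        robot.learn_individually()

        # Every so often: drive to the opening, watch the other robot for a
        # few minutes, and socially learn a fragment of its behaviour.
        if cycle % watch_every == 0:
            fragment = robot.observe_and_imitate(other_robot)
            # The imitated fragment is imperfect (noisy imitation), but it is
            # folded into the robot's own partially learned behaviour.
            robot.merge_fragment(fragment)
```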

In order to test the effectiveness of our new imitation enhanced learning algorithm we first run the experiment with the imitation turned off, so the robots learn only individually. This gives us a baseline for comparison. We then run two experiments with imitation enhanced learning. In the first we wait until one robot has completed its individual learning, so it is an 'expert'; the other robot then learns - using its combination of individual learning and social learning from the expert. Not surprisingly learning this way is faster.

This graph shows individual learning only as the solid black line, and imitation-enhanced learning from an expert as the dashed line. In both cases learning is more or less complete when the graphs transition from vertical to horizontal. We see that individual learning takes around 360 minutes (6 hours). With the benefit of an expert to watch, learning time drops to around 60 minutes.




The second experiment is even more interesting. Here we start the two robots at the same time, so that both are equally inexpert. Now you might think it wouldn't help at all, but remarkably each robot learns faster when it can observe, from time to time, the other inexpert robot, than when learning entirely on its own. As the graph below shows, the speedup isn't as dramatic - but imitation enhanced learning is still faster.

Think of it this way. It's like two novice cooks, neither of whom knows how to make chicken soup. Each is trying to figure it out by trial and error but, from time to time, they can watch each other. Even though it's pretty likely that each will copy some things that lead to worse chicken soup, on average and over time, each hapless cook will learn how to make chicken soup a bit faster than if they were learning entirely alone.



In the paper we analyse what's going on when one robot imitates part of the semi-learned sequence of moves by the other. And here we see something completely unexpected. Because the robots imitate each other imperfectly - when one robot watches another and then tries to copy what it saw, the copy will not be perfect - from time to time, one inexpert robot will miscopy the other inexpert robot and the miscopy, by chance, helps it to learn. To use the chicken soup analogy: it's as if you are spying on the other cook - you try to copy what they're doing but get it wrong and, by accident, end up with better chicken soup.

This is deeply interesting because it suggests that when we learn in groups making mistakes - noisy social learning - can actually speed up learning for each individual and for the group as a whole.

Full reference:
Mehmet D Erbas, Alan FT Winfield, and Larry Bull (2013), Embodied imitation-enhanced reinforcement learning in multi-agent systems, Adaptive Behavior. Published online 29 August 2013. Download pdf (final draft)

Wednesday, October 30, 2013

Ethical Robots: some technical and ethical challenges

Here are the slides of my keynote at last week's excellent EUCog meeting: Social and Ethical Aspects of Cognitive Systems. And the talk itself is here, on YouTube.

I've been talking about robot ethics for several years now, but that's mostly been about how we roboticists must be responsible and mindful of the societal impact of our creations. Two years ago I wrote - in my Very Short Introduction to Robotics - that robots cannot be ethical. Since then I've completely changed my mind*. I now think there is a way of making a robot that is at least minimally ethical. It's a huge technical challenge which, in turn, raises new ethical questions. For instance: if we can build ethical robots, should we? Must we..? Would we have an ethical duty to do so? After all, the alternative would be to build amoral robots. Or, would building ethical robots create a new set of ethical problems? An ethical Pandora's box.




The talk was in three parts.

Part 1: here I outline why and how roboticists must be ethical. This is essentially a recap of previous talks. I start with the societal context: the frustrating reality that even when we meet to discuss robot ethics this can be misinterpreted as scientists fearing a revolt of killer robots. This kind of media reaction is just one part of three linked expectation gaps, in what I characterise as a crisis of expectations. I then outline a few ethical problems in robotics - just as examples. Here I argue it's important to link safe and ethical behaviour - something that I return to later. Then I recap the five draft principles of robotics.

Part 2: here I ask the question: what if we could make ethical robots? I outline new thinking which brings together the idea of robots with internal models, with Dennett's Tower of Generate and Test, as a way of making robots that can predict the consequences of their own actions. I then outline a generic control architecture for robot safety, even in unpredictable environments. The important thing about this approach is that the robot can generate next possible actions, test them in its internal model, and evaluate the safety consequences of each possible action. The unsafe actions are then inhibited - and the robot controller determines which of the remaining safe actions is chosen, using its usual action-selection mechanism. Then I argue that it is surprisingly easy to extend this architecture for ethical behaviour, to allow the robot to predict the robot actions that would minimise harm for a human in its environment. This appears to represent an implementation of Asimov's 1st and 3rd laws. I outline the significant technical challenges that would need to be overcome to make this work.
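
To illustrate the step from safety to minimal ethics, here is a hedged sketch: a governor that simulates each candidate action and inhibits those whose predicted consequences are bad for a human in the robot's environment as well as for the robot itself. The outcome fields and the weighting are illustrative assumptions, not the architecture's actual evaluation function.

```python
# Illustrative sketch only: the outcome fields (predicted_human_harm,
# predicted_robot_harm), the weighting and the threshold are assumptions
# for this post, not the real architecture.

def badness(outcome):
    """Lower is better. Harm to the human dominates harm to the robot,
    in the spirit of Asimov's 1st and 3rd laws."""
    return 10.0 * outcome.predicted_human_harm + outcome.predicted_robot_harm

def ethical_governor(candidate_actions, simulate, threshold=1.0):
    """Return only those actions whose simulated consequences fall below
    the badness threshold; the robot's controller chooses among them."""
    permitted = []
    for action in candidate_actions:
        outcome = simulate(action)
        if badness(outcome) < threshold:
            permitted.append(action)
    return permitted
```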

But, assuming such a robot could be built, how ethical would it be? I suggest that with a subset of Asimovian ethics it probably wouldn't satisfy an ethicist or moral philosopher. But, nevertheless - I argue there's a good chance that such a minimally ethical robot could help to increase trust, in the robot, from its users.

Part 3: in the final part of the talk I conclude with some ethical questions. The first is: if we could build an ethical robot, are we ethically compelled to do so? Some argue that we have an ethical duty to try and build moral machines. I agree. But the counter argument, my second ethical question, is are there ethical hazards? Are we opening a kind of ethical Pandora's box, by building robots that might have an implicit claim to rights, or responsibilities? I don't mean that such a robot would ask for rights, but instead that, because it has some moral agency, then we might think it should be accorded rights. I conclude that we should try and build ethical robots. The benefits I think far outweigh any ethical hazards, which in any event can, I think, be minimised.


*It was not so much an epiphany, as a slow conversion from sceptic to believer. I have long term collaborator Michael Fisher to thank for doggedly arguing with me that it was worth thinking deeply about how to build ethical robots.

Sunday, October 20, 2013

A Close(ish) Encounter with Voyager 2

It is summer 1985. I'm visiting Caltech with colleague and PhD supervisor Rod Goodman. Rod has just been appointed in the Electrical Engineering Department at Caltech, and I'm still on a high from finishing my PhD in Information Theory. Exciting times.

Rod and I are invited to visit the Jet Propulsion Labs (JPL). It's my second visit to JPL. But it turned into probably the most inspirational afternoon of my life. Let me explain.

After the tour the good folks who were showing us round asked if I would like to meet some of the post-docs in the lab. As one of them put it: the fancy control room with the big wall screens is really for the senators and congressmen - this is where the real work gets done. So, while Rod went off to discuss stuff with his new Faculty colleagues I spent a couple of hours in a back room lab, with a Caltech post-doc working on - as he put it - a summer project. I'm ashamed to say I don't recall his name so I'll call him Josh. Very nice guy, a real southern Californian dude.

Now, at this point, I should explain that there was a real buzz at JPL. Voyager 2, which had already more than met its mission objectives, was now on course to Uranus and due to arrive in January 1986. It was clear that there was a significant amount of work in planning for that event: the first ever opportunity to take a close look at the seventh planet.

So, Josh is sitting at a bench and in front of him is a well-used Apple II computer. And behind the Apple II is a small display screen so old that the phosphor is burned. This used to happen with CRT computer screens - it's the reason screen savers were invented. Beside the computer are notebooks and manuals, including prominently a piece of graph paper with a half-completed plot. Josh then starts to explain: one of the cameras on Voyager 2 has (they think) a tiny piece of grit* in the camera turntable - the mechanism that allows the camera to be panned. This space grit means that the turntable is not moving as freely as it should. It's obviously extremely important that when Voyager gets to Uranus they are able to point the cameras accurately, so Josh's project is to figure out how much torque is (now) needed to move the camera turntable to any desired position. In other words, re-calibrate the camera's controller.

At this point I stop Josh. Let me get this straight: there's a spacecraft further from Earth, and flying faster, than any man-made object ever, and your summer project is to do experiments with one of its cameras, using your Apple II computer. Josh: yea, that's right.

Josh then explains the process. He constructs a data packet on his Apple II, containing the control commands to address the camera's turntable motor and to instruct the motor to drive the turntable. As soon as he's happy that the data packet is correct, he then sends it - via the RS232 connection at the back of his Apple II - to a JPL computer (which, I guess would be a mainframe). That computer then, in turn, puts Josh's data packet together with others, from other engineers and scientists also working on Voyager 2, after - I assume - carefully validating the correctness of these commands. Then the composite data packet is sent to the Deep Space Network (DSN) to be transmitted, via one of the DSNs big radio telescopes, to Voyager 2.

Then, some time later, the same data packet is received by Voyager 2, decoded and de-constructed and said camera turntable moves a little bit. The camera then sends back to Earth, again via a composite data packet, some feedback from the camera - the number of degrees the turntable moved. So a day or two later, via a mind-bogglingly complex process involving several radio telescopes and some very heavy duty error-correcting codes, the camera-turntable feedback arrives back at Josh's desktop Apple II with the burned-phosphor screen. This is where the graph paper comes in. Josh picks up his pencil and plots another point on his camera-turntable calibration graph. He then repeats the process until the graph is complete. It clearly worked because six months later Voyager 2 produced remarkable images of Uranus and its moons.

This was, without doubt, the most fantastic lab experiment I'd ever seen. From his humble Apple II in Pasadena Josh was doing tests on a camera rig, on a spacecraft, about 1.7 billion miles away. For a Thunderbirds kid, I really was living in the future. And being a space-nerd I already had some idea of the engineering involved in NASA's deep space missions, but that afternoon in 1985 really brought home to me the extraordinary systems engineering that made these missions possible. Given the very long project lifetimes - Voyager 2 was designed in the early 1970s, launched in 1977, and is still returning valuable science today - its engineers had to design for the long haul; missions that would extend over several generations. Systems design like this requires genius, farsightedness and technical risk taking. Engineering that still inspires me today.

*it later transpired that the problem was depleted lubricant, not space grit.