New Face

Cambridge computer scientists are building computers that read minds – and robots and avatars that express emotion.

Our innovation has been to go beyond building machines that simply recognise basic emotions to ones that recognise complex mental states.

Professor Peter Robinson

When humans talk to each other, they communicate far more than the words they speak. How something is said matters – so do facial expressions, tone of voice and body language. Strip these cues away and communication becomes much harder, as anyone who has experienced a misunderstanding by email knows full well.

Imagine, then, the challenge of communicating properly with a computer or a machine. Such human–computer interaction (HCI) is widely regarded as fundamental to the 21st century, and is predicted to change the face of technology in our homes and vehicles, in education, in manufacturing, and in settings as diverse as care homes and nuclear reactor control rooms.

‘For HCI to live up to expectations,’ explains Professor Peter Robinson, ‘intelligent machines need to understand humans and the context in which they are communicating and then respond to them in a meaningful way.’ His team at the Computer Laboratory is building systems that can infer human feelings by looking at facial expressions, analysing pitch and tone of voice, and assessing body language and posture. And the team is also building computer avatars and physical robots that can recognise and express emotions.

Mind-reading machines

Most computers are ‘mind-blind’. They are unaware of what the user is thinking and unable to respond to a change in the user’s emotional state – witness the insistent demands of a vehicle navigation system to perform a U-turn, oblivious to the rising exasperation and confusion of the driver.

Humans notice the mental states of others and use these cues to modify their own actions – an ability known as ‘theory of mind’ – but it is not shared by everyone. In fact, one characteristic of autism spectrum conditions is a profound difficulty in interpreting the feelings and emotions of others from non-verbal cues such as facial expressions.

What Professor Robinson’s group has accomplished is to engineer computers that can read minds, giving them the ability to extract, analyse and make sense of facial information. The team has drawn on recent work led by Professor Simon Baron-Cohen, Director of Cambridge’s Autism Research Centre, who has devised a detailed classification of 412 finely distinguished mental states and produced a library of 2,500 video clips of them being performed by actors. This library was part of a computer-based guide to help individuals with autism, and the Computer Laboratory team has used it to train their computer systems.

Armed with the library, and using a digital video camera, the computer tracks 24 feature points on the face, analysing in real time facial expression, head movement, shape and colour. To infer what this means, the system uses Bayesian algorithms and machine learning to work out the probability that, for example, a combination such as a head nod, a smile and raised eyebrows might mean interest. Amazingly, the overall accuracy of the computer is over 75% when analysing actors and over 60% for non-actors, which places it among the top 5% of human observers.
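The article does not describe the team’s model in detail, but the flavour of that probabilistic reasoning can be shown with a minimal sketch in Python. Everything below – the cue names, the mental states, the probabilities and the naive-Bayes simplification – is an invented illustration, not the group’s actual software, which learns its probabilities from the video library and works on 24 tracked feature points in real time.

# Hypothetical illustration of Bayesian inference over facial cues.
# All cues, states and numbers are invented for the example.

# P(cue observed | mental state) for a handful of invented cues.
LIKELIHOODS = {
    "interest":  {"head_nod": 0.7, "smile": 0.6, "raised_eyebrows": 0.8},
    "confusion": {"head_nod": 0.1, "smile": 0.1, "raised_eyebrows": 0.5},
    "boredom":   {"head_nod": 0.2, "smile": 0.1, "raised_eyebrows": 0.1},
}
PRIORS = {"interest": 1 / 3, "confusion": 1 / 3, "boredom": 1 / 3}

def infer_mental_state(observed_cues):
    """Return P(state | cues), treating cues as conditionally
    independent given the state (a naive-Bayes simplification)."""
    scores = {}
    for state, prior in PRIORS.items():
        score = prior
        for cue, likelihood in LIKELIHOODS[state].items():
            # Multiply in P(cue | state) if seen, else P(not cue | state).
            score *= likelihood if cue in observed_cues else (1 - likelihood)
        scores[state] = score
    total = sum(scores.values())
    return {state: score / total for state, score in scores.items()}

# A head nod, a smile and raised eyebrows together point strongly to interest.
print(infer_mental_state({"head_nod", "smile", "raised_eyebrows"}))

Run as written, the example assigns almost all of the probability to ‘interest’ – the kind of judgement the real system makes many times a second from live video.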

Complex emotions

‘HCI is a growing research area,’ explains Professor Robinson. ‘Our innovation has been to go beyond building machines that simply recognise basic emotions to ones that recognise complex mental states.’

The face expresses basic emotions like fear, anger, disgust and surprise so clearly that they can be recognised in a static photograph. Other mental states, such as confusion or the lack (or dawning) of understanding, are too complex to capture in a photograph because they unfold over several seconds or appear as a shifting combination of movements.

It is precisely these complex emotions that Ian Davies, one of six research students in the team, is capturing through physiological measurements and eye-tracking. His focus is on command and control systems, such as those used by the emergency services or in power stations, where being able to identify when an operator is overloaded or confused could aid both safety and efficiency. There are even benefits for more everyday tasks like driving, as he explains: ‘If the car’s system could recognise that the driver is confused, it could avoid overloading them with additional information – perhaps turning down the radio or simplifying the navigation instructions.’
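As a purely illustrative sketch of that kind of adaptation – the threshold, function name and actions below are assumptions for the example, not the project’s design – the decision logic in a car might look like this:

# Hypothetical sketch: reduce information load when the driver seems confused.
# The threshold and the chosen actions are invented for illustration.
CONFUSION_THRESHOLD = 0.7  # assumed cut-off on the estimated probability

def adapt_to_driver(confusion_probability, radio_volume, route_steps):
    """Turn the radio down and simplify directions if confusion is high."""
    if confusion_probability < CONFUSION_THRESHOLD:
        return radio_volume, route_steps  # driver is coping; change nothing
    quieter_volume = max(0, radio_volume - 10)
    simplified_route = route_steps[:1]  # give only the next instruction
    return quieter_volume, simplified_route

print(adapt_to_driver(0.85, radio_volume=30,
                      route_steps=["turn left", "take exit 12", "join the A14"]))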

A common problem for facial analysis systems is the tendency of people to pass their hands across their faces. Often treated as unwanted ‘noise’ by such systems, hand-to-face gestures are in fact an important source of information – people might hold their chin, for instance, when concentrating, or cover their mouth when shocked. Marwa Mahmoud is looking at ways of mapping the meaning of these gestures, adding this information to a multimodal analysis of facial expression. Likewise, Ntombi Banda is building multimodal systems that combine facial analysis, tone of voice and body movements to improve recognition accuracy.
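One simple way to picture such a multimodal combination – not necessarily the method the team uses – is weighted ‘late fusion’, in which each modality produces its own estimate of the person’s mental state and the estimates are then merged. The scores, weights and function below are invented for illustration.

# Hypothetical late-fusion sketch: each modality (face, voice, body) scores the
# candidate mental states independently; the scores are combined with weights.
def fuse_modalities(per_modality_scores, weights):
    """Weighted average of per-modality probability estimates."""
    states = next(iter(per_modality_scores.values())).keys()
    fused = {}
    for state in states:
        fused[state] = sum(weights[modality] * scores[state]
                           for modality, scores in per_modality_scores.items())
    total = sum(fused.values())
    return {state: value / total for state, value in fused.items()}

scores = {
    "face":  {"interest": 0.6, "confusion": 0.4},
    "voice": {"interest": 0.3, "confusion": 0.7},
    "body":  {"interest": 0.5, "confusion": 0.5},
}
weights = {"face": 0.5, "voice": 0.3, "body": 0.2}  # assumed reliabilities
print(fuse_modalities(scores, weights))

Combining modalities this way lets a confident estimate from one channel outweigh an ambiguous one from another, which is one reason multimodal systems tend to be more accurate than face-only analysis.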

Robots and avatars

Speaking at a conference recently, Microsoft’s Chairman Bill Gates predicted that the next big thing in technology would be robotics. Imagine, for instance, how useful it would be to have a robot strong enough to accomplish heavy tasks in the home – like lifting patients who need assisted care – yet capable of understanding what a human is feeling. Or imagine an avatar-based teaching aid that is sensitive enough to pick up that the lesson is going too fast and adapt accordingly.

‘It’s important that robots or avatars express the right thing at the right time,’ says Professor Robinson. ‘They not only need to recognise nonverbal behaviour by sensing accurately what humans are expressing but also need to generate such expressions themselves.’

Alyx, a computer-generated avatar from Valve Software’s game Half-Life 2, recognises and responds to happiness, surprise, confusion, interest and boredom. She has been ‘trained’ by Tadas Baltrušaitis, using examples of human facial expressions to make the avatar’s emotions instantly recognisable. Alyx’s emotions, he explains, have been especially chosen: ‘This range fits well with applications in remote communications, such as call centres, Internet shopping or online teaching, where the service needs to adapt to the feelings of the user.’

Charles, on the other hand, is a robotic head made specifically for the team by Hanson Robotics. With cameras in his eyes to monitor facial expressions, and 24 motors in his skull that pull his pliable silicone-based ‘skin’ into shape in response, he is capable of showing a remarkable range of expressions. Laurel Riek has been testing how Charles might be used to train young doctors: ‘Using data collected from real patients, Charles can realistically simulate movement disorders that manifest themselves in the face, such as cerebral palsy and dystonia. We are hoping such a realistic simulator will allow student clinicians to practise their communication and diagnostic skills.’ She predicts that one day robots like Charles might also be used for patient rehabilitation, for instance helping to teach and motivate stroke patients to re-master their facial muscles.
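The control idea behind a head like Charles – a recognised expression in, motor positions out – can be sketched as follows. The motor names, target positions and set_motor interface are entirely hypothetical; the real controller built on Hanson Robotics hardware is far richer.

# Hypothetical sketch of driving an expressive robot head: map a chosen
# expression onto target positions for named facial motors. Motor names,
# positions and the hardware interface are invented for illustration.
EXPRESSION_POSES = {
    "smile":    {"mouth_corner_left": 0.8, "mouth_corner_right": 0.8,
                 "brow_left": 0.4, "brow_right": 0.4},
    "surprise": {"mouth_corner_left": 0.3, "mouth_corner_right": 0.3,
                 "brow_left": 0.9, "brow_right": 0.9, "jaw": 0.7},
}

def set_motor(name, position):
    # Stand-in for a real servo command; here we simply log the request.
    print(f"motor {name} -> {position:.2f}")

def show_expression(expression):
    """Send every motor in the chosen pose to its target position (0.0-1.0)."""
    for motor, position in EXPRESSION_POSES[expression].items():
        set_motor(motor, position)

show_expression("surprise")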

Charles, who was trained using expression data provided by the Autism Research Centre, is now being prepared by Andra Adams as an instructional tool to help individuals with autism spectrum conditions. ‘Children with this condition have difficulty with the nuances of social interaction. Charles can help them practise turn-taking in conversation, holding eye gaze and recognising emotions from facial expressions.’

Working at the very frontiers of HCI research, Peter Robinson’s group combines expertise in psychology, computer vision, signal processing and machine learning, as well as building and evaluating complex computer systems. As he explains: ‘Many of the most interesting challenges in HCI lie at the boundaries between disciplines.’

See a Cambridge Ideas video about this research at http://www.youtube.com/watch?v=whCJ4NLUSB8&feature=related

For more information, please contact Professor Peter Robinson (pr10@cam.ac.uk) at the Computer Laboratory (www.cl.cam.ac.uk/emotions/).

