Who do YOU think wins? IBM's 'unsettling' AI supercomputer argues convincingly against humans in landmark debates between man and machine

  • IBM Debater presented better arguments than a professional human debater
  • The AI-powered system is designed to address specific points from its opponent
  • Debater could be used to help make decisions in healthcare and criminal justice

IBM's new AI supercomputer has been shown to take on humans in a debate and, according to its creators, 'win'.

Named Project Debater, the 'unsettling' AI was pitted against two humans in the first public demonstration of its ability at an event in San Francisco.

It competed in what is known as computational argumentation, where debaters know a subject, present a position and defend it against opposition.

The computer delivered its opening argument by pulling in evidence from its huge database, mainly made up from news articles and journals.

It then listened to a professional human debater’s counter-argument and spent four minutes rebutting it.  

Although the crowd agreed the human beings were unmatched in delivery, it decided the machine made better arguments with more substance.

However, many of its responses were plagiarised, with full sentences lifted from well-known sources, raising the question of how intelligent the AI really is.

IBM Project Debater gathers information from millions of newspaper and magazine articles saved in its database to engage in long-form discussion about a given topic

IBM's AI system is not connected to the internet, but instead uses a stash of hundreds of millions of newspaper and magazine articles on ‘about 100 areas of knowledge’ which the AI is able to draw upon to build its argument. 

During the live on-stage debate, IBM Project Debater also articulated the word 'voiceover' in the middle of a sentence, suggesting that video transcripts are also being categorised and used in its database of information.

According to IBM Principal Investigator for Project Debater Dr Noam Slonim, the computer is not designed to regurgitate its entire argument from a single source.

Speaking on BBC Radio 4’s Today programme, Dr Slonim said: ‘It is not copying a whole article, it is not even copying a whole paragraph either, it is picking single sentences – and in some cases just clauses – taken from many different articles, then it glues that together into a coherent, persuasive narrative.’ 

However, Project Debater manager Ranit Aharonov admitted that if the phrasing of some of the arguments sounds familiar, it is because the AI has lifted the argument wholesale from the source material.

‘A lot of content that you see is actually phrases that are taken from the sources, like newspapers,’ Aharonov told VentureBeat.

‘They do undergo rephrasing of various sorts to make them more coherent, to make them align with each other, to sometimes add information about the person mentioned, or so on, so there is phrasing — but a lot of it is taken as-is.’ 

One example of this appeared during the debate when the AI-powered machine argued that ‘having a space exploration program is a critical part of being a great power’.

This exact phrasing has only appeared once before, when it was used in an article by Wall Street Journal contributor Mark Whittington when writing about Japan’s plan to land people on the moon.

To make sure the resulting debate is compelling to watch, IBM also programmed the AI to add jokes into its narrative to keep the crowd entertained.

During its rebuttal, the machine is also designed to suggest its opponent was lying during their argument.

This is a technique used by humans to distract when they do not have a strong argument to rely upon, something the IBM team wanted to replicate with the machine.

HOW DOES IBM PROJECT DEBATER WORK AND DOES IT PLAGIARISE ITS RESPONSES?

IBM Project Debater was designed to make an engaging and persuasive argument on tough topics where there is often no easy answer.

The team behind the project hopes the Artificial Intelligence (AI) will be able to formulate unbiased, fact-based arguments to help human beings make difficult decisions in a variety of sectors, including healthcare, criminal justice, and hiring and firing in the workplace.

IBM designed its AI to be able to quickly construct an argument — even when it has no prior knowledge of the subject matter.

To do this, Project Debater draws on an immense database.

The machine is not connected to the internet, so everything is saved locally. 

Database of existing arguments

IBM says it has information on ‘about 100 areas of knowledge’, based on ‘hundreds of millions of articles from numerous well-known newspapers and magazines’.

During a recent debate, the AI articulated the word ‘voiceover’ in the middle of a sentence in its argument, suggesting the system is also categorising and using video transcripts.

The machine is not designed to lift entire paragraphs from existing articles which support its side of the debate, but instead to take small snippets of sentences and construct its own narrative from them.

However, Project Debater manager Ranit Aharonov has admitted that some of the phrases spoken by the computer could sound very familiar.

Aharonov told VentureBeat: ‘A lot of content that you see is actually phrases that are taken from the sources, like newspapers.

‘They do undergo rephrasing of various sorts to make them more coherent, to make them align with each other, to sometimes add information about the person mentioned, or so on, so there is phrasing — but a lot of it is taken as-is.’

One example of this appeared during the debate when the AI-powered machine argued that ‘having a space exploration program is a critical part of being a great power’.

This exact phrasing has only appeared once before, when it was used in an article by Wall Street Journal contributor Mark Whittington when writing about Japan’s plan to land people on the moon.

Can Project Debater argue for and against a topic?

Yes, Project Debater is able to debate both sides of an argument.

To do this, the system has to determine whether an article in its database is either for or against the statement, and correctly pick those most relevant to the argument it has been asked to make.

The team behind the project believe this ability to argue for and against a topic means the AI machine is less biased than any human debater.

Does it simply regurgitate the facts?

IBM has included a number of features in IBM Project Debater to ensure the resulting debate is compelling to watch for human spectators.

Researchers programmed the AI to add jokes into its narrative to keep the crowd entertained.

When it has a chance at rebuttal, the system is designed to suggest its opponent was lying during their argument in order to create doubt amongst the audience.

This is a technique used by humans to distract when they do not have a strong argument to rely upon, something the IBM team wanted to replicate with the machine.

How does Project Debater categorise information?

IBM has published some of the early databases used to train the AI.

The Armonk-based company used information siphoned from Wikipedia to test the system.

Project Debater would assign a debate topic for the information, like ‘we should ban the sale of violent video games to minors’, which is then narrowed to the main concept in the topic — in this example, ‘the sale of violent video games to minors’.

The system then trawls for information which can be used in its argument.

When looking for evidence to support an argument about violent video games, the system highlighted a study on violent video games mentioned in a Wikipedia entry on aggression.

The extract reads: ‘One study suggested there is a smaller effect of violent video games on aggression than has been found with television violence on aggression.’

Project Debater inserts a "TOPIC_CONCEPT" marker into the most relevant part of the sentence, to help it construct the argument.

The resulting entry in its debate reads: ‘One study suggested there is a smaller effect of TOPIC_CONCEPT on aggression than has been found with television violence on aggression.’

The system also categorises the argument with a 1 if it contains evidence supporting or contesting the topic, and a 0 for non-evidence.

This is dubbed the 'Label'. 

Those marked with a 0 will be discounted by the AI when it tries to create an evidence-based argument on the topic.
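The marker and label steps described above can be illustrated with a short Python sketch. This is a simplified illustration and not IBM's actual code: the names Candidate, mask_concept and usable_evidence are hypothetical, and the sketch assumes the phrase to be swapped for the marker (here 'violent video games') has already been identified and that the 1 or 0 label has already been assigned.

```python
from dataclasses import dataclass

MARKER = "TOPIC_CONCEPT"


@dataclass
class Candidate:
    """One sentence pulled from a source article (hypothetical structure)."""
    topic: str      # full debate topic
    concept: str    # phrase in the sentence that refers to the topic
    sentence: str   # the sentence as found in the source
    label: int      # 1 = evidence for or against the topic, 0 = non-evidence


def mask_concept(sentence: str, concept: str) -> str:
    """Swap the first mention of the concept for the TOPIC_CONCEPT marker."""
    return sentence.replace(concept, MARKER, 1)


def usable_evidence(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only sentences labelled 1; those labelled 0 are discounted."""
    return [c for c in candidates if c.label == 1]


if __name__ == "__main__":
    study = Candidate(
        topic="we should ban the sale of violent video games to minors",
        concept="violent video games",
        sentence=(
            "One study suggested there is a smaller effect of violent video "
            "games on aggression than has been found with television "
            "violence on aggression."
        ),
        label=1,  # assumed label, for illustration only
    )
    for c in usable_evidence([study]):
        print(mask_concept(c.sentence, c.concept))
```

Run on the Wikipedia extract quoted above, the sketch prints the same masked sentence the article describes. Deciding which sentences count as evidence in the first place is the hard part, and nothing in this sketch attempts it.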

Below are a few more examples of the AI finding the core topic of the debate, sourcing arguments from Wikipedia entries, and labelling them as either 1 (to be used either for or against in its narrative) or 0 (non-evidence).

HOW PROJECT DEBATER WAS TRAINED TO CATEGORISE WIKIPEDIA ENTRIES

Topic | Core Topic | Argument | Label
We should ban naturopathy | naturopathy | Raised in Toronto, Faye now works as a massage therapist in the Winnipeg area, and promotes natural health concerns. |
We should end censorship | censorship | The PRC (People's Republic of China) has historically sought to use censorship to protect the country's culture. |
We should abolish homework | homework | According to some studies, parental involvement in homework is beneficial for students. |
We should limit genetic testing | genetic testing | In December 2003, Willard E. Brown confessed to the 1984 rape and stabbing death of Deborah Sykes after DNA testing linked him to the crime. |
We should further exploit nuclear power | nuclear power | In June 2011, both Ipsos Mori and the Japanese Asahi Shimbun newspaper found drops in support for nuclear power technology in most countries, with support continuing in a number including the US. |
We should limit the freedom of speech | freedom of speech | The First Amendment in the United States Constitution guarantees freedom of speech. |
The AI-powered machine was required to make a four-minute introductory speech, a four-minute rebuttal to the experienced human debater's arguments, and finish with a two-minute closing statement on the topic 

During the first showcase of the technology, two debates were presented on-stage: whether governments should subsidise space exploration, and whether the use of telemedicine should be increased.

IBM Project Debater was pitted against human debaters Noa Ovadia and Dan Zafrir, who each took the reins for one of the debates.

In both cases, the audience judged that the machine outperformed its human opponent in presenting a wider body of information in its argument.

However, the human beings were universally judged to be better at delivery in their on-stage speeches.

In the end, IBM Debater could not convince the audience to side with its argument in the debate on whether space exploration should be subsidised.

But the machine won the second debate, convincing more audience members with its arguments on telemedicine usage than its human opponent.

IBM has not taught its AI to judge the reliability of the information it draws upon, something that critics believe could lead to the machine using biased sources to successfully argue its point.

Brhmie Balaram, a senior researcher at the Royal Society for the encouragement of Arts, Manufactures and Commerce (RSA), told the Today programme that there were ‘ethical concerns’ with this approach to an AI debate.

‘It’s about considering what sort of information it is analysing.

‘There’s a lot of people who are concerned the data that is going into these machines might be biased, and therefore, there are ethical concerns. But also, where is it getting this data from? If it is about healthcare, it’s raising concerns about privacy.’

However, the team behind IBM Debater believe the ability for the machine to argue both sides of the debate means it can never be seen as biased.

‘The computer is certainly less biased than humans,’ argued Dr Slonim.

‘The computer can argue in favour of both positions, so this is one important point.

‘The second is that, [the computer] can help us to take more informed decisions. Let’s say that we are debating [whether] or not to legalise cannabis — a topic that I thought about recently.

‘If we’re debating this topic, obviously we need the technology that will be able to quickly suggest to us: what are the relevant claims that people are making with respect to this topic, what is the relevant evidence that people are suggesting in respect to these claims?’

IBM hopes the project will eventually be used to help present an evidence-based argument on a topic that removes any bias, emotion, or ambiguity.

These arguments can be used to help humans make difficult decisions when there is no black-or-white answer.

Like the human participants, the machine did not have any prior knowledge about what the debate was about. Pictured is Noam Slonim, an IBM researcher

During the live on-stage debate about space, the machine said space exploration was beneficial to the economy. Pictured is Hayah Eichler, a professional debater who has previously debated with IBM Project Debater

WHAT WAS SAID DURING THE DEBATE BETWEEN AI AND HUMANS? 

The AI 'Project Debater', created by researchers from IBM, debated against two renowned Israeli debaters, Noa Ovadia and Dan Zafrir.

Participants prepared a four-minute opening statement, then a four-minute rebuttal and then a two-minute summary.

The debate topics were 'we should subsidise space exploration' and 'we should increase the use of telemedicine'.

'Hello Dan, thank you for the opportunity to be here today', the machine - which spoke with a confident female voice - said in its opening statement.

During the debate about space, the machine said space exploration was beneficial to the economy.

Ms Ovadia, who was Israel's national debating champion in 2016, said there were more pressing things to spend money on.

Like the human participants, the machine did not have any prior knowledge about what the debate was about.

The system had a few minutes to analyse the human speech before responding.

The machine argued that 'subsidising space exploration usually returns the investment'.

It also said that 'having a space exploration program is a critical part of being a great power.'

'Another point that I believe my opponent made is that there are more important things than space exploration to spend money on,' the machine said.

'It is very easy to say that there are more important things to spend money on, and I do not dispute this', said Project Debater.

'No one is claiming that this is the only item on our expense list. But that is beside the point', it said.

'As subsidising space exploration would clearly benefit society, I maintain that this is something the government should pursue.'

The machine said that subsidising space exploration 'inspires our children to pursue education and careers in science and technology and mathematics.'

'It is more important than good roads or improved schools or better health care,' it said.

Project Debater was able to respond to an opponent's argument and undermine it, just like a human might.

It also attempted humour, saying at one point that its 'blood would boil' if it had blood. 

Brhmie Balaram of the RSA suggested IBM Project Debater could be deployed to help make critical decisions in sectors including healthcare, criminal justice, and hiring and firing in the workplace.

Should this happen, Balaram believes the public will be frightened by the lack of human input.

‘What humans want is another human in the loop,’ she told BBC Radio 4.

‘They don’t just want the machine to make a decision or a prediction on its own, they want a human to be there to be able to take that into consideration as part of a number of other factors that they use to make a decision.’

Project Debater is still a research project. However, the company claims that some of the technology which underpins the new system has already started to seep into other IBM projects.

IBM is no stranger to grand, public displays of what it claims is superior artificial intelligence.

The Armonk-based company debuted its Watson supercomputer by trouncing human contestants on the US television gameshow Jeopardy, back in 2011. Before that, the company’s Deep Blue system beat world chess champion Garry Kasparov.

HOW ARTIFICIAL INTELLIGENCES LEARN USING NEURAL NETWORKS

AI systems rely on artificial neural networks (ANNs), which try to simulate the way the brain works in order to learn.

ANNs can be trained to recognise patterns in information - including speech, text data, or visual images - and are the basis for a large number of the developments in AI over recent years.

Conventional AI uses input to 'teach' an algorithm about a particular subject by feeding it massive amounts of information.   

AI systems rely on artificial neural networks (ANNs), which try to simulate the way the brain works in order to learn. ANNs can be trained to recognise patterns in information - including speech, text data, or visual images

Practical applications include Google's language translation services, Facebook's facial recognition software and Snapchat's image altering live filters.

The process of inputting this data can be extremely time consuming, and is limited to one type of knowledge. 

A new breed of ANNs called Adversarial Neural Networks pits the wits of two AI bots against each other, which allows them to learn from each other. 

This approach is designed to speed up the process of learning, as well as refining the output created by AI systems. 
