IBM's artificial intelligence product Project Debater and human debater Dan Zafrir
Last week, IBM launched the
At the same time, many in the industry have questioned Debater's core technology and business value: How does Debater operate? Which machine learning techniques does it use? How does it achieve accurate speech recognition and natural language processing? How is the training data organized? Tencent Technology conducted an exclusive interview with the teams behind IBM Project Debater, who answered these questions one by one.
IBM Project Debater searches for arguments on a given topic
In debates against a human debater, the topics or motions come from commonly debated subjects in Wikipedia.
Given a debate topic, the system searches a large body of knowledge to find the most relevant claims and evidence that support or refute the topic. The machine then picks out the most convincing, most diverse, and best-supported arguments and organizes them into a complete and persuasive narrative.
Debater is designed to find arguments for both the affirmative and the negative side without leaning towards either. As long as the corpus contains sufficient evidence, Project Debater can argue any topic, on either the affirmative or the opposing side.
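The selection step described above (keep the arguments for one side, then prefer those that are both well supported and different from each other) can be illustrated with a small sketch. This is not IBM's implementation: the Candidate fields, the precomputed relevance scores, the Jaccard word-overlap similarity, and the greedy maximal-marginal-relevance rule are all assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    stance: str       # "pro" or "con" relative to the topic (hypothetical label)
    relevance: float  # hypothetical relevance/support score in [0, 1]


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets; a crude stand-in for semantic similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def select_arguments(candidates, stance, k=2, trade_off=0.7):
    """Greedily pick up to k arguments for one side.

    Each step takes the candidate with the best balance between its own
    relevance and its redundancy with the arguments already chosen.
    """
    pool = [c for c in candidates if c.stance == stance]
    chosen = []
    while pool and len(chosen) < k:
        def score(c):
            redundancy = max(
                (similarity(c.text, s.text) for s in chosen), default=0.0
            )
            return trade_off * c.relevance - (1 - trade_off) * redundancy

        best = max(pool, key=score)
        chosen.append(best)
        pool.remove(best)
    return chosen


# Toy candidates with made-up scores, for illustration only.
CANDIDATES = [
    Candidate("subsidies for space exploration drive new technology", "pro", 0.90),
    Candidate("subsidies for space exploration drive technology", "pro", 0.85),
    Candidate("space exploration inspires students to study science", "pro", 0.60),
    Candidate("public money is better spent on health care", "con", 0.80),
]

for argument in select_arguments(CANDIDATES, "pro", k=2):
    print(argument.text)
```

In this toy run the near-duplicate second candidate is skipped in favour of a lower-scored but different argument, which is the "most diverse" part of the selection. The same function called with "con" returns the opposing side's arguments, mirroring the claim that the system can argue either side given enough evidence.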
IBM Project Debater draws on a large corpus database
So how does the system obtain a corpus for a topic? The team responsible for IBM Project Debater said that Debater uses IBM Watson Text to Speech and Speech to Text, and that its corpus covers more than 300 million information sources, including articles from mainstream newspapers and magazines used by professionals in business, legal, academic, and government institutions worldwide since 2011 (including Wikipedia).
To train the machine in depth on the corpus data, IBM developed several benchmark datasets. Some of these datasets focus on computational argumentation tasks, while others are relevant to the broader Natural Language Processing (NLP) research community. They include 19,276 pairs of readable Wikipedia entries, 5,000 idioms with sentiment annotations, 3,000 annotated sentences, 2,394 classified arguments on 55 topics, and 60 professional debates on controversial topics (including drafts and unrevised versions).
In addition, human input was incorporated into the training process. The head of the Debater team commented on this:
IBM Project Debater has three major machine learning capabilities
In order to develop Project Debater, the IBM research team has given the system three machine learning capabilities:
The first is data-driven speech writing and delivery: the computer digests a large corpus so that it can write a well-structured speech on a given topic, deliver it clearly and to the point, and even add humor at the right moment.
The second is listening comprehension: the system can identify the key concepts and claims hidden in long, continuous spoken English.
The third is modeling human dilemmas: the team uses a unique knowledge representation to model human controversies and dilemmas, so that the system can put forward principled arguments as needed.
IBM Project Debater will first be applied to two commercial scenarios
The Debater team says that in the future, Project Debater's core technology may be applied in professions such as financial advice and law.
Financial analysts can use the arguments for and against that the machine produces to support or challenge their thinking about investment choices. Lawyers can use Project Debater's summarization technology to find relevant cases and opinions, understand how that material relates to the case at hand, and identify the most appropriate legal precedents to use in court.
Judging by its core technology and business model, Debater has great potential for future applications. One issue worth noting, however, is that media reporters and analysts in the audience felt that, while the human debaters delivered their speeches better overall, the AI conveyed more information. And in the course of last week's debate, the AI system did do something.
In short, whether IBM Project Debater can truly achieve recognized judgment and commercialization depends not only on validation in real-world applications by the industry, but also on breakthroughs in more AI-related technologies. IBM Project Debater is another attempt by IBM in the AI field, and one that has only just started.
Tencent Technology. Written by Li Haidan; reviewed by Sun Shi.