Machine-learning system aggregates knowledge by surfing web for information

Here in 2016, we have a data problem, but it is far from the one people faced in previous decades. Instead of a dearth of information, users today face simply too much of it, and distilling it into one manageable place has become a necessity.

That is the challenge researchers at the Massachusetts Institute of Technology set out to solve with a new piece of work, which won the “best paper” award at the Association for Computational Linguistics’ Conference on Empirical Methods in Natural Language Processing in November.

The work seeks to turn conventional machine-learning techniques upside down by offering a new approach to information extraction, the task of turning plain text into data for statistical analysis. With this approach, an AI system improves its performance by surfing the web for answers.

“This method is similar to the way that we as humans search for and find information,” Karthik Narasimhan, a graduate student at MIT’s Department of Electrical Engineering and Computer Science, told Digital Trends. “For example, if I find an article with a reference I can’t understand, I know that to understand it I need more training. Since I have access to other articles on the same topic, I’d perform a web search to get additional information from different sources to gain a more informed understanding. We want to do the same thing in an automated scenario.”

MIT’s machine-learning system works by assigning each piece of extracted information a measure of statistical likelihood. If it determines that it has low confidence about a piece of knowledge, it can automatically generate an internet search query to find other texts to fill in the blanks. If it concludes that a particular document is not relevant, it will move on to the next one. Ultimately, it will extract the best pieces of information and merge them together.
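The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' implementation: the `Extraction` class, the `merge_extractions` function, the relevance scores, and both thresholds are hypothetical, and the web search is replaced by a ready-made list of candidate documents.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    value: str
    confidence: float  # assumed to be a probability in [0, 1]


def merge_extractions(initial, candidates, threshold=0.8, min_relevance=0.5):
    """Return the most trusted value, consulting extra documents only if needed.

    `candidates` is a list of (relevance, Extraction) pairs standing in for
    documents returned by a web search. All names and thresholds here are
    hypothetical and chosen purely for illustration.
    """
    best = initial
    for relevance, extraction in candidates:
        if best.confidence >= threshold:
            break  # confident enough, so stop looking at more documents
        if relevance < min_relevance:
            continue  # document judged not relevant: move on to the next one
        if extraction.confidence > best.confidence:
            best = extraction  # keep the better-supported value
    return best


# Usage: a low-confidence first guess triggers a look at other documents.
first_guess = Extraction("3", 0.4)
found = [
    (0.2, Extraction("9", 0.99)),  # skipped: low relevance
    (0.9, Extraction("4", 0.85)),  # accepted: relevant and more confident
    (0.9, Extraction("5", 0.95)),  # never reached: threshold already met
]
result = merge_extractions(first_guess, found)
```

In this sketch the stopping rule and the merge rule are fixed by hand; in the system described in the article, those decisions are learned.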

The system was trained on two extraction tasks: compiling information on mass shootings in the U.S., as part of a potential study on the effects of gun control, and compiling information on food contamination. In each scenario, the system was trained on around 300 documents and instructed to extract information answering a number of queries, which it did successfully.

“We used a technique called reinforcement learning, whereby a system learns through the notion of reward,” Narasimhan said. “Because there is a lot of uncertainty in the data being merged — particularly where there is contrasting information — we give it rewards based on the accuracy of the data extraction. By performing this action on the training data we provided, the system learns to be able to merge different predictions in an optimal manner, so we can get the accurate answers we seek.”
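To make the idea of learning from reward concrete, here is a toy Python sketch. It is not the model used in the research: it is a simple tabular value update in which the actions (`keep`, `replace`), the single yes/no state, the training examples, and the learning rate are all assumptions made for illustration. The only part taken from the description above is that the reward reflects whether the merged value is accurate.

```python
ACTIONS = ("keep", "replace")


def reward(chosen, truth):
    # Reward is based on the accuracy of the extracted value.
    return 1.0 if chosen == truth else 0.0


def train(examples, epochs=20, lr=0.1):
    """Learn action values from labeled training examples.

    Each example is (current_value, new_value, new_is_more_confident, truth).
    The state is just the boolean `new_is_more_confident` (a deliberate
    simplification), and every action is tried on every example instead of
    being sampled.
    """
    q = {(state, action): 0.0 for state in (True, False) for action in ACTIONS}
    for _ in range(epochs):
        for current, new, state, truth in examples:
            for action in ACTIONS:
                chosen = new if action == "replace" else current
                r = reward(chosen, truth)
                # Move the estimate a small step toward the observed reward.
                q[(state, action)] += lr * (r - q[(state, action)])
    return q


def policy(q, state):
    # Pick the action with the highest learned value for this state.
    return max(ACTIONS, key=lambda action: q[(state, action)])


# Hypothetical training data with contrasting values from different sources.
examples = [
    ("3", "4", True, "4"),
    ("7", "2", False, "7"),
    ("5", "6", True, "6"),
    ("1", "9", True, "1"),  # the more confident source is sometimes wrong
]
q_values = train(examples)
```

After training on this toy data, the learned values favor replacing the current value when the new source is more confident and keeping it otherwise, even though one training example contradicts that rule.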

Going forward, Narasimhan said that the research could have myriad applications. For instance, it could be used to scan various news reports and compile a single fact-heavy document, combining data from multiple sources.

It could equally be used in the medical profession. “This could be a great tool for aggregating patient histories,” he said. “In cases where a lot of doctors write different things about treatments a patient has gone through — and each has a different way of writing about it — this technology could be used to distill that information into a more structured database. The result could mean that doctors are able to make better, more informed decisions about a patient.”

Just another exciting, groundbreaking day in the world of machine learning!
