Machine-learning system aggregates knowledge by surfing web for information

Here in 2016, we have a data problem, but it is far from the one people faced in previous decades. Instead of a dearth of information, users today face simply too much of it, and distilling it into one manageable place has become a necessity.

That is the challenge researchers at the Massachusetts Institute of Technology set out to solve with a new piece of work, which won the “best paper” award at the Association for Computational Linguistics’ Conference on Empirical Methods in Natural Language Processing in November.

The work seeks to turn conventional machine-learning techniques upside down by offering a new approach to information extraction — which allows an AI system to turn plain text into data for statistical analysis and improve its performance by surfing the web for answers.

“This method is similar to the way that we as humans search for and find information,” Karthik Narasimhan, a graduate student at MIT’s Department of Electrical Engineering and Computer Science, told Digital Trends. “For example, if I find an article with a reference I can’t understand, I know that to understand it I need more training. Since I have access to other articles on the same topic, I’d perform a web search to get additional information from different sources to gain a more informed understanding. We want to do the same thing in an automated scenario.”

MIT’s machine-learning system works by assigning each piece of extracted information a measure of statistical likelihood. If it has low confidence in a piece of knowledge, it can automatically generate an internet search query to find other texts that fill in the blanks. If it concludes that a particular document is not relevant, it moves on to the next one. Ultimately, it extracts the best pieces of information and merges them.
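
As a rough illustration of that loop, here is a minimal Python sketch. It is not the MIT researchers' code: the names `Extraction`, `merge`, and `extract_with_search`, the fixed confidence `threshold`, and the `search`, `extract`, and `is_relevant` callbacks are all assumptions made for the example. In the system described here, the decisions about when to search and how to merge are learned rather than governed by a fixed cutoff.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Callable, Dict, Iterable


@dataclass
class Extraction:
    value: str
    confidence: float  # estimated likelihood in [0, 1]


Record = Dict[str, Extraction]


def merge(current: Record, candidate: Record) -> Record:
    """For each field, keep whichever extraction has the higher confidence."""
    merged = dict(current)
    for field, ext in candidate.items():
        if field not in merged or ext.confidence > merged[field].confidence:
            merged[field] = ext
    return merged


def extract_with_search(
    initial: Record,
    search: Callable[[str], Iterable[str]],   # hypothetical: field name -> documents
    extract: Callable[[str], Record],         # hypothetical: document -> extractions
    is_relevant: Callable[[str], bool],       # hypothetical relevance check
    threshold: float = 0.8,                   # assumed cutoff, not from the article
    max_docs: int = 5,
) -> Record:
    record = dict(initial)
    # Only fields the system is unsure about trigger a search.
    weak_fields = [f for f, e in record.items() if e.confidence < threshold]
    for field in weak_fields:
        for doc in islice(search(field), max_docs):
            if not is_relevant(doc):
                continue  # skip this document and move on to the next one
            record = merge(record, extract(doc))
            if record[field].confidence >= threshold:
                break  # confident enough, stop searching for this field
    return record
```

Passing the search engine and the extractor in as plain functions keeps the sketch self-contained; any real search API or extraction model could be substituted.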

The system was trained on two extraction tasks: compiling information on mass shootings in the U.S., as part of a potential study on the effects of gun control, and compiling information on food contamination. In each scenario, the system was trained on around 300 documents and instructed to extract information answering a number of queries, which it did successfully.

“We used a technique called reinforcement learning, whereby a system learns through the notion of reward,” Narasimhan said. “Because there is a lot of uncertainty in the data being merged — particularly where there is contrasting information — we give it rewards based on the accuracy of the data extraction. By performing this action on the training data we provided, the system learns to be able to merge different predictions in an optimal manner, so we can get the accurate answers we seek.”
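
To make the idea of learning from reward concrete, here is a toy Python sketch. It is a heavily simplified, one-step value update, not the researchers' model: the two actions (`keep`, `replace`), the coarse `bucket` states, the simulated data, and all parameter values are assumptions invented for the illustration. The only element taken from the description above is that the reward reflects whether the merged value turns out to be accurate.

```python
import random
from collections import defaultdict

ACTIONS = ("keep", "replace")


def bucket(conf: float) -> str:
    """Coarse state feature: is a confidence score low or high?"""
    return "high" if conf >= 0.5 else "low"


def train_merge_policy(episodes: int = 20000, lr: float = 0.05,
                       epsilon: float = 0.1, seed: int = 0):
    rng = random.Random(seed)
    q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})
    for _ in range(episodes):
        cur_conf, new_conf = rng.random(), rng.random()
        # Toy assumption: a value is correct with probability equal to its confidence.
        cur_ok = rng.random() < cur_conf
        new_ok = rng.random() < new_conf
        state = (bucket(cur_conf), bucket(new_conf))
        # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[state][a])
        # Reward is 1 when the value left in the record is accurate.
        chosen_ok = new_ok if action == "replace" else cur_ok
        reward = 1.0 if chosen_ok else 0.0
        # Move the estimate for this state and action toward the observed reward.
        q[state][action] += lr * (reward - q[state][action])
    return q


def best_action(q, cur_conf: float, new_conf: float) -> str:
    state = (bucket(cur_conf), bucket(new_conf))
    return max(ACTIONS, key=lambda a: q[state][a])
```

After training, calling `best_action(q, 0.2, 0.9)` asks the learned table what to do when the current value is weak and a newly found one is strong. In this toy setup the policy is never told a merging rule; it only sees rewards.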

Going forward, Narasimhan said that the research could have myriad applications. For instance, it could be used to scan various news reports and compile a single fact-heavy document, combining data from multiple sources.

It could equally be used in the medical profession. “This could be a great tool for aggregating patient histories,” he said. “In cases where a lot of doctors write different things about treatments a patient has gone through — and each has a different way of writing about it — this technology could be used to distill that information into a more structured database. The result could mean that doctors are able to make better, more informed decisions about a patient.”

Just another exciting, groundbreaking day in the world of machine learning!

Luke Dormehl
Former Digital Trends Contributor