What is an artificial neural network? Here's everything you need to know

By Luke Dormehl January 5, 2019

If you’ve spent any time reading about artificial intelligence, you’ll almost certainly have heard about artificial neural networks. But what exactly is one? Rather than enrolling in a comprehensive computer science course or delving into some of the more in-depth resources that are available online, check out our handy layperson’s guide to get a quick and easy introduction to this amazing form of machine learning.

What is an artificial neural network?

Artificial neural networks are one of the main tools used in machine learning. As the “neural” part of their name suggests, they are brain-inspired systems which are intended to replicate the way that we humans learn. Neural networks consist of input and output layers, as well as (in most cases) one or more hidden layers consisting of units that transform the input into something that the output layer can use. They are excellent tools for finding patterns which are far too complex or numerous for a human programmer to extract and teach the machine to recognize.
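To make the layer structure concrete, here is a minimal sketch in Python of a network with two inputs, one hidden layer of two units, and a single output. The weights and the names `sigmoid`, `layer`, and `forward` are invented for illustration; a real network would learn its weights from data rather than have them typed in by hand.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One layer: each unit takes a weighted sum of its inputs, then applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]

# Hypothetical, hand-picked weights (illustrative only, not learned).
HIDDEN_W = [[0.5, -0.4], [0.3, 0.8]]  # two hidden units, two inputs each
HIDDEN_B = [0.0, -0.1]
OUTPUT_W = [[1.2, -0.7]]              # one output unit fed by two hidden units
OUTPUT_B = [0.05]

def forward(inputs):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = layer(inputs, HIDDEN_W, HIDDEN_B)
    output = layer(hidden, OUTPUT_W, OUTPUT_B)
    return hidden, output

hidden, output = forward([1.0, 0.0])
print("hidden layer:", hidden)
print("output layer:", output)
```

The hidden layer turns the two raw inputs into two new numbers, and the output layer only ever sees those, which is what “transforming the input into something the output layer can use” amounts to in practice.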

While neural networks have been around since the 1940s, and an early type known as the “perceptron” dates to the late 1950s, it is only in the last several decades that they have become a major part of artificial intelligence. This is due to the arrival of a technique called “backpropagation,” which allows networks to adjust the weights in their hidden layers of neurons in situations where the outcome doesn’t match what the creator is hoping for, such as a network designed to recognize dogs that misidentifies a cat.
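As a rough illustration of what backpropagation does, the sketch below uses the smallest network that still has a hidden unit: one input, one hidden neuron, and one output. It assumes a squared-error loss and plain gradient descent; the function names, starting weights, and learning rate are all made up for the example. The error at the output is passed backward, step by step, to work out how much each weight should change.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def loss(w1, w2, x, target):
    """Squared error of a one-hidden-unit network: x -> sigmoid(w1*x) -> w2*h."""
    h = sigmoid(w1 * x)
    y = w2 * h
    return (y - target) ** 2

def backprop(w1, w2, x, target):
    """Return how the loss changes with respect to w1 and w2 (chain rule)."""
    # Forward pass: compute and remember the intermediate values.
    h = sigmoid(w1 * x)
    y = w2 * h
    # Backward pass: start from the error at the output and work backward.
    d_y = 2.0 * (y - target)          # effect of the output on the loss
    d_w2 = d_y * h                    # effect of the output weight
    d_h = d_y * w2                    # effect of the hidden unit's activity
    d_w1 = d_h * h * (1.0 - h) * x    # effect of the hidden weight
    return d_w1, d_w2

def train(w1, w2, x, target, learning_rate=0.2, steps=500):
    """Repeatedly nudge both weights in the direction that reduces the loss."""
    for _ in range(steps):
        d_w1, d_w2 = backprop(w1, w2, x, target)
        w1 -= learning_rate * d_w1
        w2 -= learning_rate * d_w2
    return w1, w2

x, target = 1.0, 1.0          # hypothetical single training example
w1, w2 = 0.1, 0.2             # arbitrary starting weights
loss_before = loss(w1, w2, x, target)
w1, w2 = train(w1, w2, x, target)
loss_after = loss(w1, w2, x, target)
print("loss before training:", loss_before)
print("loss after training:", loss_after)
```

Real networks apply the same idea across thousands or millions of weights and many training examples, usually through a library that works out the backward pass automatically.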

Another important advance has been the arrival of deep learning neural networks, in which different layers of a multilayer network extract different features until it can recognize what it is looking for.

Sounds pretty complex. Can you explain it like I’m five?

For a basic idea of how a deep learning neural network learns, imagine a factory line. After the raw materials (the data set) are input, they are then passed down the conveyor belt, with each subsequent stop or layer extracting a different set of features, starting with simple ones and building up to increasingly high-level ones. If the network is intended to recognize an object, the first layer might analyze the brightness of its pixels.

The next layer could then identify any edges in the image, based on lines of similar pixels. After this, another layer may recognize textures and shapes, and so on. By the time the fourth or fifth layer is reached, the deep learning net will have created complex feature detectors. It can figure out that certain image elements (such as a pair of eyes, a nose, and a mouth) are commonly found together.
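To give a feel for what an early “edge-finding” layer does, here is a small Python sketch that looks for a vertical edge in a tiny made-up grayscale image by comparing each pixel with its right-hand neighbor. The image values, the threshold, and the function names are assumptions for illustration. In a real deep learning network, filters like this are learned from data rather than written by hand.

```python
# A tiny 4x6 grayscale "image": 0 is dark, 9 is bright (made-up values).
IMAGE = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

def horizontal_differences(image):
    """Brightness change between each pixel and its right-hand neighbor."""
    return [
        [row[col + 1] - row[col] for col in range(len(row) - 1)]
        for row in image
    ]

def edge_columns(image, threshold=5):
    """Columns where every row shows a big brightness jump: a vertical edge."""
    diffs = horizontal_differences(image)
    width = len(diffs[0])
    return [
        col for col in range(width)
        if all(abs(row[col]) >= threshold for row in diffs)
    ]

diffs = horizontal_differences(IMAGE)
edges = edge_columns(IMAGE)
print("brightness changes in first row:", diffs[0])
print("columns containing a vertical edge:", edges)
```

The first step works only with raw brightness, and the second step combines those results into something more meaningful, an edge. Later layers would combine edges into textures and shapes in the same spirit.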

Once this is done, the researchers who have trained the network can give labels to the output, and then use backpropagation to correct any mistakes which have been made. After a while, the network can carry out its own classification tasks without needing humans to help every time.

Beyond this, there are different types of learning: supervised learning, unsupervised learning, and reinforcement learning, in which the network learns for itself by trying to maximize its score, as memorably carried out by Google DeepMind’s Atari game-playing bot.

How many types of neural network are there?

There are multiple types of neural network, each of which comes with its own specific use cases and level of complexity. The most basic type of neural net is something called a feedforward neural network, in which information travels in only one direction from input to output.

A more widely used type of network is the recurrent neural network, in which data can flow in multiple directions because some connections loop back, letting earlier inputs influence how later ones are handled. These neural networks possess greater learning abilities and are widely employed for more complex tasks such as handwriting recognition or language recognition.
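The difference from a feedforward network can be sketched in a few lines of Python. In this deliberately tiny example, a single recurrent unit carries a “state” from one step to the next, so the order of the inputs matters. The two weights and the function names are hypothetical; a trained network would learn its weights.

```python
import math

# Hypothetical fixed weights (illustrative only, not learned).
W_INPUT = 0.8   # how strongly the current input counts
W_STATE = 0.5   # how strongly the remembered state counts

def rnn_step(x, state):
    """Mix the current input with the state remembered from earlier steps."""
    return math.tanh(W_INPUT * x + W_STATE * state)

def run_sequence(sequence):
    """Feed a sequence in one item at a time, carrying the state forward."""
    state = 0.0
    for x in sequence:
        state = rnn_step(x, state)
    return state

early_signal = run_sequence([1.0, 0.0, 0.0])
late_signal = run_sequence([0.0, 0.0, 1.0])
print("signal at the start:", early_signal)
print("signal at the end:  ", late_signal)
```

Both sequences contain the same values, but the final state differs because the unit remembers when the signal arrived. That sensitivity to order is what makes recurrent networks a natural fit for handwriting and language, where sequence matters.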

There are also convolutional neural networks, Boltzmann machine networks, Hopfield networks, and a variety of others. Picking the right network for your task depends on the data you have to train it with, and the specific application you have in mind. In some cases, it may be desirable to use multiple approaches, as with a challenging task like voice recognition.

What kind of tasks can a neural network do?

A quick scan of our archives suggests the proper question here should be “what tasks can’t a neural network do?” From making cars drive autonomously on the roads, to generating shockingly realistic CGI faces, to machine translation, to fraud detection, to reading our minds, to recognizing when a cat is in the garden and turning on the sprinklers, neural nets are behind many of the biggest advances in A.I.

Broadly speaking, however, they are designed for spotting patterns in data. Specific tasks could include classification (sorting data into predefined classes), clustering (grouping data into categories that have not been defined in advance), and prediction (using past events to guess future ones, like the stock market or movie box office).

How exactly do they “learn” stuff?

In the same way that we learn from experience in our lives, neural networks require data to learn. In most cases, the more data that can be thrown at a neural network, the more accurate it will become. Think of it like any task you do over and over. Over time, you gradually get more efficient and make fewer mistakes.

When researchers or computer scientists set out to train a neural network, they typically divide their data into three sets. First is a training set, which helps the network establish the various weights between its nodes. After this, they fine-tune it using a validation data set. Finally, they’ll use a test set to see if it can successfully turn the input into the desired output.
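The three-way split described above can be sketched in Python as follows. The 70/15/15 proportions, the fixed random seed, and the function name `split_dataset` are assumptions chosen for illustration; in practice the proportions vary from project to project.

```python
import random

def split_dataset(examples, train_fraction=0.7, validation_fraction=0.15, seed=0):
    """Shuffle a copy of the examples and cut it into three non-overlapping parts."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_train = round(len(shuffled) * train_fraction)
    n_validation = round(len(shuffled) * validation_fraction)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # whatever is left over
    return train, validation, test

examples = list(range(100))  # stand-in for 100 labeled examples
train, validation, test = split_dataset(examples)
print(len(train), len(validation), len(test))
```

Shuffling before cutting matters: if the data happened to be sorted (all the dog photos first, say), an unshuffled split could leave the test set full of examples unlike anything the network trained on. Keeping the test set separate until the very end is what makes it an honest check.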

Do neural networks have any limitations?

On a technical level, one of the bigger challenges is the amount of time it takes to train networks, which can require a considerable amount of compute power for more complex tasks. The biggest issue, however, is that neural networks are “black boxes,” in which the user feeds in data and receives answers. Users can fine-tune the answers, but they don’t have access to the exact decision-making process.

This is a problem a number of researchers are actively working on, but it will only become more pressing as artificial neural networks play a bigger and bigger role in our lives.