By Ishai Rosenberg
In our last post, “From Machine Learning to Deep Learning, And What It Means for Cybersecurity – Part 1,” we introduced deep learning, how it evolved, and its real-world application.
In today’s second installment of our four-part series, we will cover the neural networks that power deep learning.
Before diving into the neural networks of artificial intelligence, let’s first take a look at the original neural network that is the inspiration for all this deep learning – the human brain.
The human brain consists of different regions, each of which performs a specific task, and each consisting of dozens of billions of small processing units known as neurons, which are connected to each other via synapses.
The ‘deep learning’ of the human neural network happens when input arrives at the brain and neurons process it; the more input the brain receives, the better it learns. For example, when the optic nerve transfers signals (the input) from our eyes to a certain region of the brain (the processing area), the neurons in that area learn to process these signals, giving us the ability to see.
Everything human beings learn, everything we remember, everything we do, is the result of this kind of activity in the brain. Research suggests that human beings are considered more intelligent than other animals because we have many more neurons in the main processing area of the brain, i.e. the cerebral cortex.
The original neural network (the brain), which processes information better (more intelligently) as its number of neurons grows, is the very inspiration for the artificial neural networks at the heart of deep learning.
The concept of artificial neural networks is based on the idea that ‘signals’ can be received from other ‘neurons’ in the form of numerical values – e.g., the raw bytes of a memory file.
When researchers attempted to bolster the networks’ capacity to process these input numbers, they discovered that adding more layers of neurons, known as ‘hidden layers’, between the input and output layers gave the network much greater data processing capabilities.
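To make this concrete, here is a minimal sketch of how one layer of artificial neurons processes numerical inputs: each neuron computes a weighted sum of the signals it receives and passes the result through a nonlinearity. All sizes, weights, and names below are illustrative, not taken from any particular system.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # A common activation function: negative weighted sums become 0.
    return np.maximum(0, x)

def layer(inputs, weights, biases):
    # One fully connected layer: weighted sums of all inputs, then activation.
    return relu(inputs @ weights + biases)

inputs = rng.random(4)               # 4 numerical input "signals"
w1 = rng.standard_normal((4, 3))     # weights into a 3-neuron hidden layer
b1 = np.zeros(3)

hidden = layer(inputs, w1, b1)       # the hidden layer's output
print(hidden.shape)                  # (3,)
```

Stacking several such layers, so that each layer's output becomes the next layer's input, is what makes a network “deep.”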
Then, taking things even further, the development in the 1980s of a sophisticated algorithm called ‘backpropagation’ finally made state-of-the-art training of deep neural networks possible.
The complex principle of training neural networks with backpropagation can be explained with a relatively simple cat metaphor.
Let me explain.
Imagine that you’d like to train a “cat detector.” You would assemble a training dataset with 10,000 images containing cats, plus 10,000 images that do not contain cats.
All these images are of size 30 x 30 pixels, and they are in grayscale (that means a single value describes each pixel, rather than three values for red, green, and blue).
A neural network for this classification task would contain an input layer, a few hidden layers, and an output layer. In this example, the input layer could contain 900 neurons (30 times 30, as determined by the number of pixels). The output layer should contain two neurons: one representing the “no cat” class, and the other representing the “cat” class.
Here we would have the simplest type of network, a fully connected neural network, in which every neuron in each layer is connected to all the neurons in the subsequent layer.
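The fully connected cat-detector architecture described above can be sketched as follows. The hidden-layer sizes (128 and 64) are arbitrary choices for illustration; only the 900 inputs and 2 outputs come from the example in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
layer_sizes = [900, 128, 64, 2]   # input, two hidden layers, output

# One weight matrix and bias vector per pair of adjacent layers.
# "Fully connected" means every neuron feeds every neuron in the next layer.
weights = [rng.standard_normal((m, n)) * 0.01
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(pixels):
    """Run one flattened 30x30 image through the network; return 2 class scores."""
    a = pixels
    for w, b in zip(weights[:-1], biases[:-1]):
        a = np.maximum(0, a @ w + b)        # ReLU in the hidden layers
    return a @ weights[-1] + biases[-1]     # raw scores: "no cat" vs "cat"

image = rng.random(900)    # a fake 30x30 grayscale image, flattened
scores = forward(image)
print(scores.shape)        # (2,)
```

Note how quickly the parameter count grows: just the first weight matrix alone holds 900 × 128 values, which is one reason training such networks is computationally demanding.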
As the neural network is initialized, it doesn’t yet have any useful knowledge. Given an image, it won’t do any better than a coin toss when it comes to detecting whether the image contains a cat or not. It obtains the knowledge of how to recognize a cat through the training process. The training is done in two stages: a forward pass, in which the network produces a prediction for a training image, and a backward pass (backpropagation), in which the network’s weights are adjusted to reduce the prediction error. These two stages are repeated over the training set until the error stops improving.
When you’ve finished training a neural network, you must test its accuracy using a set of samples that weren’t used during the training. This set is the test set, and its purpose is to make sure the network has not simply “memorized” the training set without learning the principles behind it.
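Holding out a test set can be sketched as below: shuffle the data, reserve a fraction the model never sees during training, and measure accuracy only on that held-out portion. The data, labels, and the `predict` stand-in are all fabricated for illustration (the placeholder happens to use the same rule that generated the labels, so its accuracy is trivially perfect; a real model's would not be).

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.random((1000, 900))                  # fake flattened images
y = (X.mean(axis=1) > 0.5).astype(int)       # fabricated labels

# Shuffle, then split: 80% for training, 20% held out for testing.
idx = rng.permutation(len(X))
split = int(0.8 * len(X))
X_train, y_train = X[idx[:split]], y[idx[:split]]
X_test, y_test = X[idx[split:]], y[idx[split:]]

def predict(batch):
    # Placeholder "trained model": threshold on the mean pixel value.
    return (batch.mean(axis=1) > 0.5).astype(int)

# Accuracy on samples the model never saw estimates real-world performance.
test_accuracy = (predict(X_test) == y_test).mean()
print(len(X_test))    # 200 held-out samples
```

If accuracy on the training set is high but accuracy on the test set is poor, the network has memorized rather than generalized.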
Once the results are satisfactory, the neural network can be used for real-world predictions.
Deep learning is by far the hottest field within artificial intelligence today. That’s pretty remarkable when you consider that until a few years ago, just about everyone other than a few research groups had completely given up on neural nets.
However, during the past few years, several innovations have helped overcome the barriers to the effectiveness of deep learning, allowing researchers to train deeper neural networks.
An important contributor to the greater proliferation of neural networks is the use of graphics processing units (GPUs) instead of central processing units (CPUs). GPUs are designed as massively parallel processors, which excel at performing the same instruction over many different data values in parallel – and this is exactly what neural networks must do.
Accordingly, with the advancements in deep learning algorithms along with the use of GPUs, deep learning on neural networks has become a formidable force to be reckoned with in delivering on the great promise of artificial intelligence.
Meantime, to learn more about how deep learning works and why it’s revolutionizing how cybersecurity is done, we invite you to download the Deep Learning for Dummies book >>