In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers. It is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.
The perceptron is also known as Rosenblatt's perceptron. Perceptrons are among the simplest classifiers, akin to the base neurons in a deep-learning system.
What is a Perceptron?
A Single Neuron
The basic unit of computation in a neural network is the neuron, often called a node or unit. It receives input from other nodes or from an external source and computes an output. Each input has an associated weight (w), which is assigned on the basis of its relative importance to the other inputs. The node applies a function f (defined below) to the weighted sum of its inputs, as shown in Figure 1 below:
Basic elements of Perceptron
- Inputs: X1, X2
- Bias: b
- Synaptic Weights: w1, w2
- Activation Function: f
- Output: Y
The perceptron takes a set of inputs, say X. Each input may not contribute equally to the output, so a set of weights, say W, was introduced. These weights express the importance of each input with respect to the output. The function f is non-linear and is called the activation function. The purpose of the activation function is to introduce non-linearity into the output of a neuron. This is important because most real-world data is non-linear, and we want neurons to learn these non-linear representations. The output is represented by Y; it gives us the final result of the perceptron.
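As a minimal sketch of these elements (the weight and bias values below are illustrative, and a sigmoid activation is assumed):

```python
import math

def sigmoid(z):
    """Squash a real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def perceptron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus bias, passed through the activation f."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Two inputs X1, X2 with weights w1, w2 and bias b, as in Figure 1
y = perceptron_output([1.0, 0.5], [0.8, -0.4], bias=0.1)
print(round(y, 3))
```

Here the weighted sum is 1.0·0.8 + 0.5·(−0.4) + 0.1 = 0.7, and the sigmoid squashes it into (0, 1).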
Activation Function
Every activation function (or non-linearity) takes a single number and performs a certain fixed mathematical operation on it [2]. There are several activation functions you may encounter in practice:
- Sigmoid: takes a real-valued input and squashes it to range between 0 and 1
σ(x) = 1 / (1 + exp(−x))
- tanh: takes a real-valued input and squashes it to the range [-1, 1]
tanh(x) = 2σ(2x) − 1
- ReLU: ReLU stands for Rectified Linear Unit. It takes a real-valued input and thresholds it at zero (replaces negative values with zero)
f(x) = max(0, x)
The below figures [2] show each of the above activation functions.
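The three activation functions can be implemented directly from their formulas. As a quick check, the tanh identity given above can be compared against the standard library implementation:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Using the identity from the text: tanh(x) = 2*sigmoid(2x) - 1
    return 2.0 * sigmoid(2.0 * x) - 1.0

def relu(x):
    # Threshold at zero: negative values become zero
    return max(0.0, x)

for f in (sigmoid, tanh, relu):
    print(f.__name__, round(f(-1.0), 4), round(f(1.0), 4))
```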
Importance of Bias: The main function of bias is to provide every node with a trainable constant value (in addition to the normal inputs that the node receives).
Training of our Perceptron: Backpropagation algorithm
Backward Propagation of Errors, often abbreviated as BackProp, is one of several ways in which an artificial neural network (ANN) can be trained. It is a supervised training scheme, which means it learns from labeled training data (there is a supervisor to guide its learning).
To put it in simple terms, BackProp is like “learning from mistakes”. The supervisor corrects the ANN whenever it makes mistakes.
An ANN consists of nodes in different layers: the input layer, intermediate hidden layer(s), and the output layer. The connections between nodes of adjacent layers have “weights” associated with them. The goal of learning is to assign correct weights to these edges. Given an input vector, these weights determine the output vector.
In supervised learning, the training set is labeled. This means that for some given inputs, we know the desired/expected output (label).
BackProp Algorithm:
Initially, all the edge weights are randomly assigned. For every input in the training dataset, the ANN is activated and its output is observed. This output is compared with the desired output that we already know, and the error is “propagated” back to the previous layer. This error is noted and the weights are “adjusted” accordingly. This process is repeated until the output error is below a predetermined threshold.
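The loop above can be sketched for a single sigmoid neuron, a deliberate simplification: a full BackProp implementation would also propagate the error through hidden layers. The toy dataset (the AND function), learning rate, and epoch count are illustrative choices, and the weights start at zero rather than random values so the run is reproducible:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy labeled dataset: the AND function
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

w = [0.0, 0.0]   # edge weights (randomly assigned in practice)
b = 0.0          # bias
lr = 1.0         # learning rate

for epoch in range(5000):
    for x, target in data:
        # Activate the network and observe its output
        y = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
        # Compare with the desired output: the error
        error = target - y
        # Scale by the sigmoid derivative, then adjust the weights accordingly
        grad = error * y * (1 - y)
        w[0] += lr * grad * x[0]
        w[1] += lr * grad * x[1]
        b += lr * grad

predictions = [round(sigmoid(w[0] * x[0] + w[1] * x[1] + b)) for x, _ in data]
print(predictions)
```

After training, rounding the neuron's outputs reproduces the AND labels [0, 0, 0, 1].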
Once the above algorithm terminates, we have a “learned” ANN, which we consider ready to work with “new” inputs. This ANN is said to have learned from several examples (labeled data) and from its mistakes (error propagation).
Types of Perceptron
- Single Layer Perceptron – This is the simplest feedforward neural network and does not contain any hidden layer.
- Multi Layer Perceptron – A Multi Layer Perceptron has one or more hidden layers. Multi Layer Perceptrons are more useful than Single Layer Perceptrons for practical applications today.
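As an illustrative sketch (the weights and biases below are chosen arbitrarily), the structural difference between the two types is whether the input passes through a hidden layer before reaching the output:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias per node, then sigmoid."""
    return [sigmoid(sum(x * w for x, w in zip(inputs, ws)) + b)
            for ws, b in zip(weights, biases)]

x = [1.0, 0.0]

# Single Layer Perceptron: the inputs map straight to the output node
single = layer(x, [[0.5, -0.5]], [0.0])

# Multi Layer Perceptron: one hidden layer of two nodes, then the output node
hidden = layer(x, [[0.5, -0.5], [-0.3, 0.8]], [0.1, -0.1])
multi = layer(hidden, [[1.0, -1.0]], [0.0])

print(single, multi)
```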
Conclusion
An artificial neuron is a mathematical model of the behavior of a single neuron in a biological nervous system.
A single neuron can solve some very simple learning tasks, but the power
of neural networks comes when many of them are connected in a network
architecture.
The architecture of an artificial neural network refers to the number of
neurons and the connections between them.
The following figure shows a feed-forward network architecture of
neurons.
Although in this post we have seen the functioning of the perceptron,
there are other neuron models which have different characteristics and
are used for different purposes.
Some of them are the scaling neuron, the principal components neuron,
the unscaling neuron or the probabilistic neuron.
In the above picture, scaling neurons are depicted in yellow and
unscaling neurons in red.