Artificial Intelligence, Machine Learning, Deep Learning, and Neural Networks are all buzzwords right now. This article is the first in a planned series that focuses on understanding and implementing the concepts of deep learning and artificial intelligence.

“Artificial Intelligence is the new electricity. Similar to how electricity revolutionized a number of industries about a hundred years ago, Artificial Intelligence will transform and revolutionize different industries” - Andrew Ng

Artificial Intelligence and Deep Learning are becoming some of the most important components of modern-day businesses. Smart systems are now regularly built to solve use cases that were once thought too complex to solve. Examples include automatic speech recognition in smartphones, conversational chatbots, image classification and clustering in search engines, and natural language generation and understanding.

At a broad level, Artificial Intelligence and Deep Learning aim to create intelligent machines. At a deeper level, they comprise mathematical relationships, sophisticated optimization algorithms, and models that generate intelligence in the form of predictions, segmentation, clustering, forecasting, classifications, etc. **Neural Networks** are the **building blocks** of Deep Learning models and of many Artificial Intelligence systems. In this tutorial, we will cover the core ideas of neural networks and the science behind them.

Contents:

- Introduction to Neural Networks
- Single Processing Units - Neurons
- Activation Functions
- Forward Propagation
- Error Computation: Loss Function and Cost Function
- Backward Propagation

Introduction to Artificial Neural Network Terminology

A neural network is a mathematical model designed to function similarly to biological neurons and the nervous system. These models are used to recognize complex patterns and relationships that exist within labeled data. A labeled dataset contains two types of variables: predictors (features used as independent variables in the model) and targets (features treated as dependent variables of the model). Some examples of labeled datasets are:

- Employee data containing information such as age, gender, experience, and skills, labeled with the salary amount (an example of numerical data)
- Tweets labeled as positive, negative, or neutral (an example of text data)
- Images of animals labeled with the name of the animal (an example of image data)
- Audio clips of music labeled with the genre of the music (an example of audio data)
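
To make the split between predictors and target concrete, here is a minimal sketch based on the employee example. The records, the field names, and the `TARGET` constant are all made up for illustration; a real dataset would usually be loaded from a file.

```python
# Hypothetical employee records, made up purely for illustration.
employees = [
    {"age": 29, "experience_years": 5, "skills": 3, "salary": 72000},
    {"age": 41, "experience_years": 15, "skills": 6, "salary": 110000},
    {"age": 35, "experience_years": 9, "skills": 4, "salary": 88000},
]

TARGET = "salary"  # assumed name of the label column

# Predictors (independent variables): every field except the label.
X = [[row[name] for name in row if name != TARGET] for row in employees]
# Target (dependent variable): the label the model should learn to predict.
y = [row[TARGET] for row in employees]

print(X)
print(y)
```

Each row of `X` lines up with the entry of `y` at the same position, which is the pairing a model learns from.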

The core structure of a neural network model consists of a large number of simple processing nodes that are interconnected and organized in layers. An individual node in a layer is connected to several nodes in the previous and the next layer. Each layer receives the inputs from the previous layer and processes them to generate an output, which is passed on to the next layer.

The first layer of this architecture is called the input layer, which accepts the inputs; the last layer is called the output layer, which produces the output; and every layer between the input and output layers is called a hidden layer. Let us now look at how an artificial neural network works.

Single Processing Units - Neuron

A neuron is the smallest unit in a neural network. It accepts various inputs, applies a function and computes the output.

Each incoming connection corresponds to a different input to the node. To each of its connections, the node assigns a number known as a “weight”. The weight of an input variable signifies the importance or priority of that variable relative to the others. The node multiplies each input value by its associated weight and adds the resulting products together. It also adds another term called the “bias”, which shifts the weighted sum up or down and so moves the point at which the node activates. This sum is then passed through an activation function (described in the next section), which maps it to the node’s output value.

Let’s understand this through an example. Take a situation in which you want to purchase a new house and you will make the decision based on the following factors in order of priority.

- Cost of the Property
- Square Feet Area
- Construction Year
- Availability of Security Systems
- Nearby Amenities
- Climate Factors in the Locality
- Crime Rate in the Locality

The best way to formalize the decision making in this situation is to formulate a mathematical equation in which:

- Every factor is represented as an input: x1, x2, x3, …
- Every factor’s priority is represented as a weight: w1, w2, w3, …
- Node Input (Z) is the weighted sum of the factors plus the bias term
- Node Output (A) is the value of the mapping function g(Z)

Inputs: x1, x2, x3, …

Weights: w1, w2, w3, …

Node Bias term: “b”

Node Input (Z) = w1*x1 + w2*x2 + w3*x3 + w4*x4 + … + w7*x7 + b

Node Output (A) = g (Z)

Here the function “g” is known as the activation function. Let’s understand how this activation function works.
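
Before moving on, here is a minimal sketch of the node equations above in plain Python. The factor scores, the weights, and the bias are hypothetical numbers picked only to show the arithmetic, and `g` is left as a parameter so that any activation function can be plugged in.

```python
def node_input(inputs, weights, bias):
    """Weighted sum Z = w1*x1 + w2*x2 + ... + b."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def node_output(inputs, weights, bias, g):
    """Activation A = g(Z) for any activation function g."""
    return g(node_input(inputs, weights, bias))


# Hypothetical, already-scaled scores for the seven house factors (0 = bad, 1 = good).
x = [0.9, 0.7, 0.5, 1.0, 0.6, 0.8, 0.4]
# Hypothetical weights, largest first to mirror the order of priority.
w = [0.30, 0.25, 0.15, 0.10, 0.10, 0.05, 0.05]
b = -0.5

z = node_input(x, w, b)
# Identity g, used here only to show how Z flows into A.
a = node_output(x, w, b, g=lambda value: value)
print(f"Z = {z:.3f}, A = {a:.3f}")
```

Passing `g` in as an argument keeps the weighted sum separate from the choice of activation function.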

Activation Functions - Applying Non-Linear Transformations

The main goal of an activation function is to apply a non-linear transformation to the input to map it to the output. For example, the linear combination of the 7 variables related to the house is mapped to one of two target output classes: “Buy the Property” and “Do not buy the Property”.

The decision boundary of the output can be given by a threshold value. If the generated value is below the threshold, the node outputs 0, otherwise 1. The generated outputs (0, 1) correspond to the decision. In our example, Z = WX + b is the node input and A is the generated output (activation). Because the bias “b” is already added into Z, the threshold on Z is 0, which is the same as asking whether the weighted sum WX is greater than -b. If it is, you will buy the property, otherwise not, i.e.

If Z = WX + b > 0, then A = 1 (“buy house”)

If Z = WX + b < 0, then A = 0 (“do not buy”)

This is the basic equation of the activation function, which is applied to each neuron. In the above example, we have applied the step function as the activation function.

There are other choices of non-linear activation functions, such as ReLU, sigmoid, and tanh.
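
As a rough sketch, the four activation functions mentioned here can be written in a few lines of plain Python. The function names are our own, and returning 0 when the input is exactly 0 in `step` is an assumed convention, since the equations above only cover values above and below the threshold.

```python
import math


def step(z):
    """Step function: 1 when z is above the threshold 0, otherwise 0."""
    return 1 if z > 0 else 0


def sigmoid(z):
    """Squashes z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squashes z into the range (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Passes positive values through unchanged and clips negatives to 0."""
    return max(0.0, z)


# Compare the four functions on a negative, a zero, and a positive input.
for z in (-2.0, 0.0, 2.0):
    print(z, step(z), round(sigmoid(z), 3), round(tanh(z), 3), relu(z))
```

Unlike the step function, sigmoid, tanh, and ReLU change gradually with the input, which is what lets gradient-based training adjust the weights in small steps.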

Forward Propagation

A neural network model goes through a process called forward propagation, in which it propagates the inputs and their activations forward through the layers to get the final output. The computation steps involved are:

Z = W*X + b

A = g(Z)

g is the activation function

A is the predicted output computed from the input variables X

W is the weight matrix

b is the bias vector
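
The two computation steps can be sketched for a whole network as follows. This is a plain-Python illustration, not a tuned implementation: the layer sizes, the weights, and the choice of sigmoid for g are assumptions made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(W, x, b, g):
    """Compute A = g(W*x + b) for one layer.

    W has one row of weights per neuron, x is the input vector,
    and b holds one bias per neuron.
    """
    Z = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
         for row, b_i in zip(W, b)]
    return [g(z) for z in Z]


def forward_propagate(x, layers, g=sigmoid):
    """Pass x through every (W, b) pair in order and return the final activations."""
    a = x
    for W, b in layers:
        a = layer_forward(W, a, b, g)
    return a


# Hypothetical network: 3 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.2, -0.4, 0.1], [0.5, 0.3, -0.2]], [0.0, 0.1]),  # hidden layer
    ([[0.7, -0.6]], [0.05]),                             # output layer
]
prediction = forward_propagate([1.0, 0.5, -1.0], layers)
print(prediction)
```

The activations of one layer become the inputs of the next, so the same `layer_forward` call is simply repeated once per layer.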

To optimize the weight and bias matrices used in the neural network, the model computes the error in the output and makes small changes to their values so that the overall error is reduced. The error is measured by functions called the loss function and the cost function.

Error Computation: Loss Function and Cost Function

The loss function measures the error between the prediction and the actual value. One simplistic loss function is the difference between the actual value and the predicted value.

Loss = Y - A

Y = Actual Value, A = Predicted Value

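The cost function computes the sum of the loss function values over all training examples. Because positive and negative differences can cancel each other out in this sum, practical loss functions usually use the squared or absolute difference instead.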

Cost Function = Summation (Loss Function)
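
Here is a small sketch of the loss and cost functions defined above. The `squared_loss` variant is not part of the formulas in this article; it is added as an assumption to show one common way of keeping errors of opposite sign from cancelling. The sample values are made up.

```python
def loss(y, a):
    """The simple loss from the text: actual value minus predicted value."""
    return y - a


def squared_loss(y, a):
    """Assumed alternative: squaring keeps opposite-sign errors from cancelling."""
    return (y - a) ** 2


def cost(actual, predicted, loss_fn=loss):
    """Sum of the loss over every training example."""
    return sum(loss_fn(y, a) for y, a in zip(actual, predicted))


# Made-up actual values and predictions for three training examples.
actual = [1.0, 0.0, 1.0]
predicted = [0.5, 0.5, 1.0]

print(cost(actual, predicted))                # signed errors cancel out
print(cost(actual, predicted, squared_loss))  # squared errors do not
```

With the signed loss, the two wrong predictions offset each other and the cost looks perfect, which is why the squared version is the more useful signal for training.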

Minimizing the Cost Function - Backpropagation using Gradient Descent

The final step is to minimize the error term (cost function) and obtain the optimal values of weights and bias terms. A neural net model does this through a process called backpropagation using gradient descent.

In this process, the model computes the error of the final layer and passes it back to the previous layer, whose weights and biases are adjusted to reduce the error. The values of the weights and biases are updated using a process called gradient descent. In this algorithm, the derivative of the error in the final layer is computed with respect to each weight and bias, using the chain rule to carry the error derivative back through the layers. Each derivative, scaled by a small step size called the learning rate, is then subtracted from the original value to get the updated value.

Then the model forward propagates again to compute the new error with the new weights, and backward propagates again to update the weights. The process is repeated many times to approach the minimum error. This process is also termed training. During training, the weights and biases are continually adjusted until training data with the same labels consistently yield similar outputs.
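
The update loop can be sketched for the simplest possible case, a single neuron. Several things here are assumptions made for the example: the sigmoid activation, a squared-error loss (0.5 * (Y - A)^2, chosen so the derivative is easy to write), the toy OR dataset, the learning rate of 0.5, and the 2000 iterations.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    """Forward propagation for one neuron: A = sigmoid(w . x + b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def total_cost(data, w, b):
    """Sum of the squared-error loss over every training example."""
    return sum(0.5 * (y - predict(x, w, b)) ** 2 for x, y in data)


def gradient_step(data, w, b, learning_rate):
    """One gradient descent update of w and b over the whole dataset."""
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for x, y in data:
        a = predict(x, w, b)
        # Chain rule: dCost/dZ = (a - y) * sigmoid'(Z), where sigmoid'(Z) = a * (1 - a).
        delta = (a - y) * a * (1.0 - a)
        for j, xj in enumerate(x):
            grad_w[j] += delta * xj  # dZ/dw_j = x_j
        grad_b += delta              # dZ/db = 1
    new_w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]
    new_b = b - learning_rate * grad_b
    return new_w, new_b


# Toy labeled data (logical OR), chosen only for illustration.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w, b = [0.0, 0.0], 0.0
cost_before = total_cost(data, w, b)
for _ in range(2000):
    w, b = gradient_step(data, w, b, learning_rate=0.5)
cost_after = total_cost(data, w, b)
print(cost_before, cost_after)
```

Each pass computes the predictions, turns the errors into derivatives, and moves every weight a small step in the direction that lowers the cost.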

Recap

To train a neural network model, the following steps are implemented:

- Design the architecture of the neural network model: the number of layers, the number of neurons in each layer, the activation functions, etc.
- For every training example in the input data, compute the activations using the activation functions.
- Forward propagate the activations from the input layer through the hidden layers to the output layer.
- Compute the error of the final layer using the loss function.
- Compute the sum of errors over every training example in the input data using the cost function.
- Backpropagate the error to the previous layers and compute the derivative of the error with respect to the weight and bias parameters.
- Using the gradient descent algorithm, subtract the weight and bias derivative terms (scaled by a small step size, the learning rate) from the original values.
- Perform this operation for a large number of iterations (epochs) to obtain a stable neural network model.
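
Putting the recap together, here is a compact sketch of the whole training loop for a network with one hidden layer. The comments number the steps above from 1 to 8 in order. The 2-2-1 architecture, sigmoid activations, squared-error loss, toy AND dataset, random seed, learning rate, and epoch count are all assumptions made for this illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, params):
    """Steps 2-3: forward propagate x through the hidden layer to the output neuron."""
    W1, b1, W2, b2 = params
    hidden = [sigmoid(sum(w * xj for w, xj in zip(row, x)) + bi)
              for row, bi in zip(W1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)
    return hidden, output


def cost(data, params):
    """Steps 4-5: squared-error loss summed over every training example."""
    return sum(0.5 * (y - forward(x, params)[1]) ** 2 for x, y in data)


def train_epoch(data, params, learning_rate):
    """Steps 6-7: backpropagate the error, then apply one gradient descent update."""
    W1, b1, W2, b2 = params
    gW1 = [[0.0] * len(row) for row in W1]
    gb1 = [0.0] * len(b1)
    gW2 = [0.0] * len(W2)
    gb2 = 0.0
    for x, y in data:
        hidden, output = forward(x, params)
        # Error signal at the output layer.
        delta_out = (output - y) * output * (1.0 - output)
        for i, h in enumerate(hidden):
            gW2[i] += delta_out * h
            # Error passed back to hidden neuron i through its outgoing weight.
            delta_hidden = delta_out * W2[i] * h * (1.0 - h)
            for j, xj in enumerate(x):
                gW1[i][j] += delta_hidden * xj
            gb1[i] += delta_hidden
        gb2 += delta_out
    W1 = [[w - learning_rate * g for w, g in zip(row, grow)]
          for row, grow in zip(W1, gW1)]
    b1 = [b - learning_rate * g for b, g in zip(b1, gb1)]
    W2 = [w - learning_rate * g for w, g in zip(W2, gW2)]
    b2 = b2 - learning_rate * gb2
    return W1, b1, W2, b2


# Step 1: architecture (assumed): 2 inputs -> 2 hidden neurons -> 1 output neuron.
random.seed(0)
params = (
    [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)],  # W1
    [0.0, 0.0],                                                     # b1
    [random.uniform(-1, 1) for _ in range(2)],                      # W2
    0.0,                                                            # b2
)
# Toy labeled data (logical AND), chosen only for illustration.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]

initial_cost = cost(data, params)
for epoch in range(3000):  # Step 8: repeat for many epochs
    params = train_epoch(data, params, learning_rate=0.5)
final_cost = cost(data, params)
print(initial_cost, final_cost)
```

Plain Python lists keep every multiplication visible here; a real project would normally lean on a numerical library for speed.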

In this article, we discussed the overall working of an AI neural network model. In the next article, we will discuss how to implement a neural network in Python. Feel free to share your comments.
