Artificial Neural Networks (ANNs) are machine learning algorithms inspired by the biological neurons in the human brain. The neuron is the basic unit of an ANN: it takes input variables, applies a mathematical function to them, and produces an output.
An ANN consists of three types of layers: input, hidden, and output. Hidden layers are the layers of neurons between the input layer and the output layer. A hidden layer can have multiple neurons, and a neural network can have multiple hidden layers.
ANNs are suitable for supervised learning on datasets with many attributes, and they work regardless of the variable type of the target function. In other words, the target function may be discrete-valued, real-valued, or even a vector of discrete-valued and real-valued variables.
This picture describes a simple ANN with two inputs (x1 and x2), one hidden layer consisting of two neurons (A and B), and an output layer consisting of one neuron (C). Variables x1 and x2 are the inputs for neurons A and B, and the outputs of neurons A and B are the inputs for neuron C, hence creating a network.
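To make the data flow concrete, here is a minimal sketch of one forward pass through this two-input, two-hidden-neuron, one-output network. Several things are assumptions made for illustration: the sigmoid is used as the activation in every neuron, bias terms are left out because the text does not mention them, and the numeric weight values are arbitrary. The weight names follow the labels w1A, w1B, w2A, w2B, wA, and wB used in this article.

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x1, x2, w):
    """Forward pass: inputs -> hidden neurons A and B -> output neuron C."""
    # Each hidden neuron applies the activation to a weighted sum of the inputs.
    a = sigmoid(w["w1A"] * x1 + w["w2A"] * x2)
    b = sigmoid(w["w1B"] * x1 + w["w2B"] * x2)
    # The output neuron takes the outputs of A and B as its own inputs.
    c = sigmoid(w["wA"] * a + w["wB"] * b)
    return a, b, c


# Arbitrary example weights (assumed values, not taken from the article).
weights = {"w1A": 0.5, "w2A": -0.3, "w1B": 0.8, "w2B": 0.1, "wA": 1.2, "wB": -0.7}
a, b, c = forward(1.0, 2.0, weights)
print("A:", a, "B:", b, "C:", c)
```

Because every neuron here ends in a sigmoid, each of the three values printed lies strictly between 0 and 1.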
Each neuron contains a specific function called the activation function. The goal of the activation function is to transform the neuron's combined inputs into an output value, often within certain boundaries (for example, the sigmoid function maps any real number into the range 0 to 1). Examples of activation functions include the sigmoid function, the ReLU function, and the softmax function.
Activation functions are crucial in an ANN because they determine how values are transformed to produce the output, and thus indirectly affect the accuracy and efficiency of the model.
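As a rough illustration, the three activation functions named above can be written in a few lines of plain Python. This is a sketch using their standard textbook definitions, not the implementation of any particular library. The softmax version subtracts the largest input before exponentiating, a common trick to avoid overflow.

```python
import math


def sigmoid(z):
    """Map a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Pass positive values through unchanged and clip negatives to 0."""
    return max(0.0, z)


def softmax(zs):
    """Turn a list of real numbers into positive values that sum to 1."""
    largest = max(zs)
    # Shifting by the largest value leaves the result unchanged
    # but keeps math.exp from overflowing on large inputs.
    exps = [math.exp(z - largest) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


print(sigmoid(0.0))
print(relu(-2.0), relu(3.0))
print(softmax([1.0, 2.0, 3.0]))
```

Note that sigmoid and softmax outputs are bounded, while ReLU is bounded only from below.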
Moreover, each input to a neuron has its own weight (w1A, w1B, w2A, w2B, wA, wB). These weights are the learned parameters of the ANN: their values are fitted to the training data so that the network generalizes the patterns in the data. The backpropagation algorithm iteratively updates the weights within the ANN based on a loss function, which measures the model's performance.
The backpropagation algorithm uses the chain rule of differentiation and gradient descent to revise each weight. However, this procedure is prone to overfitting the training data, hence cross-validation methods are needed to provide stopping criteria.
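The sketch below shows how these pieces might fit together for the small network with weights w1A, w1B, w2A, w2B, wA, and wB. It rests on several assumptions that are not from the article: sigmoid activations, no bias terms, a squared-error loss, a made-up toy dataset, and hypothetical hyperparameters (lr, max_epochs, patience). For brevity it uses a single held-out validation set as the stopping criterion, which is a simpler stand-in for full cross-validation.

```python
import math
import random

KEYS = ("w1A", "w2A", "w1B", "w2B", "wA", "wB")


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w):
    """Forward pass: inputs -> hidden neurons A and B -> output neuron C."""
    a = sigmoid(w["w1A"] * x[0] + w["w2A"] * x[1])
    b = sigmoid(w["w1B"] * x[0] + w["w2B"] * x[1])
    c = sigmoid(w["wA"] * a + w["wB"] * b)
    return a, b, c


def gradients(x, y, w):
    """Chain rule applied to the loss 0.5 * (c - y)**2 for one example."""
    a, b, c = forward(x, w)
    delta_c = (c - y) * c * (1.0 - c)            # d loss / d (input sum of C)
    delta_a = delta_c * w["wA"] * a * (1.0 - a)  # propagated back to A
    delta_b = delta_c * w["wB"] * b * (1.0 - b)  # propagated back to B
    return {
        "wA": delta_c * a, "wB": delta_c * b,
        "w1A": delta_a * x[0], "w2A": delta_a * x[1],
        "w1B": delta_b * x[0], "w2B": delta_b * x[1],
    }


def mean_loss(data, w):
    return sum(0.5 * (forward(x, w)[2] - y) ** 2 for x, y in data) / len(data)


def train(train_data, val_data, lr=0.5, max_epochs=2000, patience=20, seed=0):
    """Gradient descent that stops once validation loss stops improving."""
    rng = random.Random(seed)
    w = {k: rng.uniform(-1.0, 1.0) for k in KEYS}
    best_w, best_val, waited = dict(w), mean_loss(val_data, w), 0
    for _ in range(max_epochs):
        for x, y in train_data:
            g = gradients(x, y, w)
            for k in KEYS:
                w[k] -= lr * g[k]  # step against the gradient
        val = mean_loss(val_data, w)
        if val < best_val:
            best_w, best_val, waited = dict(w), val, 0
        else:
            waited += 1
            if waited >= patience:  # stopping criterion
                break
    return best_w, best_val


# Made-up toy task: the target is 1 when x1 > x2, otherwise 0.
rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(60)]
data = [(p, 1.0 if p[0] > p[1] else 0.0) for p in points]
train_data, val_data = data[:40], data[40:]
best_w, best_val = train(train_data, val_data)
print("best validation loss:", best_val)
```

Returning the weights that scored best on the validation set, rather than the final weights, is one common way to limit overfitting. With k-fold cross-validation the same loop would be repeated over several train and validation splits.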