About different structures of neural networks

https://www.mathworks.com/help/deeplearning/ref/fitnet.html is the tutorial that I am following to understand how to fit data to a function. I have a few doubts regarding structure and terminology, which are the following:

1. Number of hidden layers in the model

By hidden layer we mean a layer that lies in between the input and the output. If the number of hidden layers = 1 with 10 hidden neurons (as shown in the second figure), is it essentially a neural network that is termed an MLP? Is my understanding correct? In general,

  • If the number of hidden layers = 0, we call the NN a perceptron.
  • If the number of hidden layers >= 1 but less than 3, the NN becomes an MLP. Is the picture in the link that of an MLP, since it contains 1 hidden layer of 10 neurons?
  • If the number of hidden layers > 3, the NN is called a deep NN, aka deep learning.

Is that correct?

2. Linear vs nonlinear mapping function

The resulting model eventually learns to map the input data to the output data.

  • Do we call the machine learning model linear or nonlinear? Or is this term associated with the mapping function?
  • Which layer's mapping function determines this? Based on which layer's activation function do we say that the mapping function or the model is linear or nonlinear? For example, in this picture, the last layer is the output layer and its activation function looks like an identity/linear function, but the hidden layer has a sigmoid activation function, which is nonlinear. Therefore, is this model a nonlinear function?

Tags: mlp, perceptron, terminology, neural-network

Category: Data Science


I will answer your questions one by one:


By hidden layer we mean a layer that lies in between the input and the output. If the number of hidden layers = 1 with 10 hidden neurons (as shown in the second figure), is it essentially a neural network that is termed an MLP? Is my understanding correct?

The fundamental building block of a Neural Network is the perceptron. It's modeled on biological neurons: a unit that receives multiple inputs, weights them, and outputs a transformed signal. In its simplest form (i.e. a perceptron with a sigmoid activation) it's practically identical to logistic regression.
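To make this concrete, here is a minimal sketch in Python (my own illustration; the weights, bias, and input are arbitrary example values) of a single perceptron with a sigmoid activation. It is exactly the functional form of logistic regression:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A single perceptron: a weighted sum of the inputs plus a bias,
# passed through an activation function. With a sigmoid activation
# this computes the same function as logistic regression.
w = np.array([0.5, -1.2, 0.8])   # one weight per input (arbitrary example values)
b = 0.1                          # bias term
x = np.array([1.0, 2.0, -0.5])   # one input sample

output = sigmoid(np.dot(w, x) + b)
print(output)  # a value in (0, 1), interpretable as a probability
```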

When you compose many perceptrons across one or more layers, you form an MLP (Multi-Layer Perceptron). The term "MLP" is used as a synonym of Neural Networks in their basic feed-forward architectures.

Networks with no hidden layer are just single perceptrons, and are usually not even considered Neural Networks.

The ones with just one hidden layer are called shallow Neural Networks. They are rarely used in practice nowadays, since they are generally not as powerful as Deep Neural Networks.

When you have more than one hidden layer, you talk about Deep Neural Networks. They are the big thing right now, and among the most powerful ML models available.


Do we call the machine learning model linear or nonlinear? Or is this term associated with the mapping function?

ML models can be either linear or non-linear; the choice is yours. Different models will map input to output in different ways (linearly, or in various non-linear ways).

I think it's important to stress that Machine Learning is not a set of models, but an approach to data. You could use a linear regression (i.e. a linear model) with an ML approach, or an SVM or a Deep Neural Network (i.e. non-linear models), to solve the exact same problem.
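As an illustration of that point, here is a small sketch (using scikit-learn; the data and model settings are my own example, not from the original answer) that fits the same non-linear problem, y = x^2, with a linear model and with a small neural network. Only the non-linear model captures the curvature:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

# The same data, y = x^2, fitted with a linear and a non-linear model.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X.ravel() ** 2

linear = LinearRegression().fit(X, y)
nonlinear = MLPRegressor(hidden_layer_sizes=(10,), activation="tanh",
                         solver="lbfgs", max_iter=5000,
                         random_state=0).fit(X, y)

print("linear R^2:    ", linear.score(X, y))     # near 0: a line cannot fit a parabola
print("non-linear R^2:", nonlinear.score(X, y))  # close to 1
```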


Which layer's mapping function determines this?

If you refer to Deep Neural Networks, I believe all layers together are responsible for the non-linear mapping between inputs and outputs. Non-linearities can be learned only thanks to a combination of depth and non-linear activation functions. The deeper a network is, the more complex the non-linear patterns it will be able to learn. It's as if you need "all of them at once" to do that, in my opinion.
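One way to see this is the following sketch (my own illustration with random weights). Any affine map f satisfies f(x1 + x2) = f(x1) + f(x2) - f(0), so we can use that identity to test a two-layer network for linearity. Without an activation function the test passes; with tanh between the layers it fails:

```python
import numpy as np

# Any affine map f satisfies f(x1 + x2) == f(x1) + f(x2) - f(0).
# We use that identity to test whether a two-layer network is linear.
rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

def net(x, activation):
    return W2 @ activation(W1 @ x + b1) + b2

x1, x2, zero = rng.normal(size=3), rng.normal(size=3), np.zeros(3)

identity = lambda z: z  # "no activation function"
lin = np.allclose(net(x1 + x2, identity),
                  net(x1, identity) + net(x2, identity) - net(zero, identity))
print(lin)     # True: without activations, depth does not buy non-linearity

nonlin = np.allclose(net(x1 + x2, np.tanh),
                     net(x1, np.tanh) + net(x2, np.tanh) - net(zero, np.tanh))
print(nonlin)  # False: the tanh between layers makes the map non-linear
```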


  1. What you are referring to is a very subjective term. A 1-layer NN could be called a perceptron, but seen from another perspective, simple logistic regression has a similar formulation. When people refer to an MLP, they usually mean a simple stack of perceptron layers, without any fancy functions. Deep learning is a much broader subject: it is not only about depth but also about complex design.

  2. For an ANN, non-linearity only appears where you apply it, usually in the form of an activation function. So if you stack hidden layers without applying activation functions, the network only becomes a linear operator. You can try verifying this yourself for practice; see the sketch below.
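Here is a quick sketch of that exercise (my own illustration with random weights): two stacked layers without an activation function collapse algebraically into a single linear layer with weights W2·W1 and bias W2·b1 + b2:

```python
import numpy as np

# Two stacked layers with no activation function collapse into one linear layer.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)
W2, b2 = rng.normal(size=(2, 5)), rng.normal(size=2)
x = rng.normal(size=3)

two_layers = W2 @ (W1 @ x + b1) + b2         # computed layer by layer
one_layer = (W2 @ W1) @ x + (W2 @ b1 + b2)   # single equivalent layer
print(np.allclose(two_layers, one_layer))    # True
```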


  1. Feed-forward deep learning models are a subset of multi-layer perceptrons. What you consider to be "deep" is subjective; so long as you have multiple hidden layers, you can call it deep and get away with it.

  2. You can associate the terms "linear" and "non-linear" with either the mapping function or the model. A perceptron will always learn a linear boundary between classes (this answer has a good explanation for that). Once you add a hidden layer with a non-linear activation function, turning the perceptron into an MLP, the model/resulting mapping function becomes able to learn non-linear decision boundaries. Note that this does depend on the activation functions: with identity (linear) activations, even a multi-layer network collapses back to a linear model, as shown in the sketch below.
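This sketch (scikit-learn based, my own illustration; the model settings are arbitrary) uses the classic XOR problem, which no single line can separate: a plain perceptron fails on it, while a small MLP with a non-linear hidden activation solves it:

```python
import numpy as np
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier

# XOR: no single straight line separates the two classes.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

linear = Perceptron(max_iter=1000).fit(X, y)
mlp = MLPClassifier(hidden_layer_sizes=(8,), activation="tanh",
                    solver="lbfgs", max_iter=5000,
                    random_state=0).fit(X, y)

print("perceptron accuracy:", linear.score(X, y))  # never above 0.75
print("MLP accuracy:       ", mlp.score(X, y))     # typically 1.0
```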
