Optimisation of neural networks
Do neural networks get optimized by trial and error, by data scientists, or is there some way of optimizing values through accurate mathematical equations?
Neural networks are trained with a mix of mathematical optimization and trial-and-error exploration:
Neural networks are composed of trainable parameters. These parameters are trained with some variant of stochastic gradient descent (SGD), as sketched below. Trainable parameters include the weights in dense layers, convolutional layers, attention layers, LSTMs, etc.
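A minimal sketch of what "training parameters with SGD" means, using a toy dense layer y = xW + b instead of a full network (the data, shapes, and learning rate here are illustrative assumptions, not from the original answer):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 3))                      # inputs
true_W = np.array([[1.5], [-2.0], [0.5]])
y = X @ true_W + 0.1 * rng.normal(size=(256, 1))   # targets

W = np.zeros((3, 1))                               # trainable parameters
b = np.zeros((1,))
lr = 0.1                                           # learning rate (a hyperparameter)

for epoch in range(20):
    # Stochastic gradient descent: shuffle, then update on small mini-batches.
    idx = rng.permutation(len(X))
    for start in range(0, len(X), 32):
        batch = idx[start:start + 32]
        xb, yb = X[batch], y[batch]
        pred = xb @ W + b
        err = pred - yb                            # prediction error
        grad_W = 2 * xb.T @ err / len(xb)          # gradient of mean squared error w.r.t. W
        grad_b = 2 * err.mean(axis=0)              # gradient w.r.t. b
        W -= lr * grad_W                           # parameter update step
        b -= lr * grad_b

print("learned W:", W.ravel())                     # approaches [1.5, -2.0, 0.5]
```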
There are other aspects of neural networks that cannot be trained but are just as decisive for the performance of the network. They are known as hyperparameters. Examples include the number and size of filters in convolutional layers, the number of layers, the dimensionality of embeddings, etc. To decide which hyperparameter values are optimal, you either choose them "by intuition", explore different combinations of values and check which one performs best (e.g. random search, grid search; see the sketch below), or apply some form of black-box optimization (Bayesian optimization, genetic algorithms, etc.).
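A hedged sketch of grid search over two hyperparameters. The function `train_and_evaluate` is a hypothetical stand-in for your actual training routine (here it just simulates a noisy validation loss); the hyperparameter names and value grids are assumptions for illustration:

```python
import itertools
import numpy as np

def train_and_evaluate(hidden_size, lr, seed=0):
    # Placeholder: a real implementation would train a network with these
    # hyperparameters and return its validation loss.
    rng = np.random.default_rng(seed + hidden_size)
    return abs(np.log2(hidden_size) - 5) + 10 * abs(lr - 0.01) + 0.1 * rng.random()

grid = {
    "hidden_size": [16, 32, 64, 128],
    "lr": [0.001, 0.01, 0.1],
}

best = None
# Try every combination of hyperparameter values and keep the best one.
for hidden_size, lr in itertools.product(grid["hidden_size"], grid["lr"]):
    val_loss = train_and_evaluate(hidden_size, lr)
    if best is None or val_loss < best[0]:
        best = (val_loss, {"hidden_size": hidden_size, "lr": lr})

print("best hyperparameters:", best[1], "validation loss:", round(best[0], 3))
```

Random search works the same way, except the combinations are sampled at random from the grid (or from continuous ranges) instead of enumerated exhaustively.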
The most commonly used method of optimising neural networks is a process called (stochastic) gradient descent. You provide the network with inputs and the expected outputs; during training, the model's outputs are compared to the expected outputs. The difference between the two is called the error, or loss. Based on how wrong or right the network is, you can calculate how you should adjust the parameters/internal weights of the model to lower its error. A more in-depth explanation of gradient descent (and of neural networks in general) can be found on this site.
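A minimal sketch of that loop on the simplest possible "model", y = w * x (the input, target, and learning rate are illustrative assumptions): compute the output, measure the loss against the expected output, and nudge the weight in the direction that reduces the loss.

```python
x, target = 3.0, 6.0          # input and expected output (so the ideal w is 2.0)
w = 0.0                       # initial weight
lr = 0.05                     # learning rate

for step in range(10):
    output = w * x
    loss = (output - target) ** 2          # squared error between output and target
    grad = 2 * (output - target) * x       # d(loss)/d(w)
    w -= lr * grad                         # gradient descent update
    print(f"step {step}: w={w:.3f} loss={loss:.3f}")
```

Running this, the printed loss shrinks toward zero as w approaches 2.0; backpropagation is the machinery that computes the same kind of gradient for every weight in a deep network.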