Represent a neural network as a matrix calculation (Transformer feed-forward NN)
For better understanding, I would like to represent the calculations in a neural network with one hidden layer and one output layer as matrix multiplications. The hidden layer has 3072 neurons and the output layer has 768. Specifically, this is the feed-forward neural network (FFNN) used in BERT's Transformer architecture. There, a 1x768 vector is fed into the FFNN, and the output is also a 1x768 vector.
The calculation would be as follows:
Input       Hidden Layer     Output Layer     Output Vector
[1 x 768] x [768 x 3072]  x  [3072 x 768]  =  [1 x 768]
Is this correct? Thank you very much!
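To check the shapes myself, I put together a minimal NumPy sketch. The names (x, W1, b1, W2, b2, d_model, d_ff) and the random weights are my own placeholders, not taken from any BERT implementation. As far as I understand, the actual FFNN also adds a bias after each multiplication and applies a GELU activation to the hidden layer, so I included both; neither changes the shapes.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_ff = 768, 3072  # sizes from the question

# Placeholder input and weights (random values, for shape checking only)
x = rng.standard_normal((1, d_model))             # input vector, [1 x 768]
W1 = rng.standard_normal((d_model, d_ff)) * 0.02  # hidden layer weights, [768 x 3072]
b1 = np.zeros((1, d_ff))                          # hidden layer bias, [1 x 3072]
W2 = rng.standard_normal((d_ff, d_model)) * 0.02  # output layer weights, [3072 x 768]
b2 = np.zeros((1, d_model))                       # output layer bias, [1 x 768]


def gelu(z):
    # tanh approximation of GELU, applied element-wise
    return 0.5 * z * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (z + 0.044715 * z**3)))


hidden = gelu(x @ W1 + b1)  # [1 x 768] x [768 x 3072] -> [1 x 3072]
output = hidden @ W2 + b2   # [1 x 3072] x [3072 x 768] -> [1 x 768]

print(hidden.shape)  # (1, 3072)
print(output.shape)  # (1, 768)
```

So the dimensions chain as I wrote above, with the intermediate result being 1 x 3072 before the second multiplication.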
Topic bert transformer matrix neural-network
Category Data Science