Inference speed of ReLU networks
I'm fairly new to the topic, and I was wondering whether some of you could point me to existing work in which the inference speed of deep neural networks with ReLU activation functions is measured on GPUs as a function of network size (e.g., depth and width, i.e., the architectural hyperparameters). I'd just like to get a rough idea of how quickly such networks can return an answer for, e.g., approximation/regression purposes.
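In case it helps frame the question: below is a minimal sketch of the kind of measurement I have in mind, timing the forward pass of a fully connected ReLU network as its width varies. It uses NumPy on CPU purely for illustration (all function names here are my own); an actual GPU benchmark would use a framework like PyTorch, with `torch.cuda.synchronize()` around the timers so asynchronous kernel launches don't skew the numbers.

```python
import time
import numpy as np

def mlp_forward(x, weights):
    # Forward pass of a fully connected ReLU network (biases omitted for brevity).
    for W in weights[:-1]:
        x = np.maximum(x @ W, 0.0)   # hidden layers with ReLU
    return x @ weights[-1]           # linear output layer (regression)

def time_inference(width, depth, batch=256, n_in=16, n_out=1, reps=20):
    # Build random weights for an MLP with `depth` hidden layers of size `width`,
    # then report the mean per-call latency over `reps` forward passes.
    rng = np.random.default_rng(0)
    dims = [n_in] + [width] * depth + [n_out]
    weights = [rng.standard_normal((a, b)) / np.sqrt(a)
               for a, b in zip(dims[:-1], dims[1:])]
    x = rng.standard_normal((batch, n_in))
    mlp_forward(x, weights)          # warm-up call before timing
    t0 = time.perf_counter()
    for _ in range(reps):
        y = mlp_forward(x, weights)
    return (time.perf_counter() - t0) / reps, y

for width in (64, 256, 1024):
    dt, y = time_inference(width, depth=4)
    print(f"width={width:4d}  mean latency={dt * 1e3:.3f} ms  output shape={y.shape}")
```

On a GPU the scaling with width/depth can look quite different from this CPU version, since small networks are often launch-overhead-bound rather than compute-bound, which is exactly the kind of data I'm hoping published benchmarks would show.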