How can int8 (byte) operations be useful for deep learning?

Nvidia is planning to add hardware support for int8 operations to their Titan card, targeting deep learning. I am trying to understand how this is useful and what types of networks will benefit from it.

I know that FP16 instead of FP32 should be useful for DL, but I am not sure what int8 can do. There is some research showing that a network can be trained at full FP32 precision and the weights then rounded down to one byte, but this neither speeds up training nor reduces its memory footprint.

Tags: tensorflow, theano, deep-learning

Category: Data Science


8-bit integers can be enough to train a neural network. This link reports that Intel successfully trained ResNet-50 using only 8-bit integers, with the help of some additional techniques.

This other link, by contrast, is about post-training quantization: the network is trained in fp32 (or fp16) and inference is run in uint8. This is already used in TensorFlow Lite and does not require any special training techniques.
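The idea behind uint8 post-training quantization can be sketched with a few lines of NumPy. This is a minimal illustration of affine (scale + zero-point) quantization, not TensorFlow Lite's actual implementation; the function names are invented for this example.

```python
import numpy as np

def quantize_uint8(x):
    """Affine (asymmetric) quantization of a float32 array to uint8.

    Returns the quantized tensor plus the (scale, zero_point) pair
    needed to recover approximate float values.
    """
    x_min, x_max = float(x.min()), float(x.max())
    # Make sure the range covers 0.0 so it maps to an exact integer.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / 255.0 if x_max > x_min else 1.0
    zero_point = int(round(-x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map uint8 values back to approximate float32."""
    return scale * (q.astype(np.float32) - zero_point)

weights = np.random.randn(4, 4).astype(np.float32)
q, scale, zp = quantize_uint8(weights)
# The round trip loses at most about one quantization step per value.
max_error = np.abs(dequantize(q, scale, zp) - weights).max()
```

Matrix multiplies can then be done mostly in integer arithmetic on `q`, which is where the int8 hardware speedup comes from; only a per-tensor rescale stays in floating point.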


Actually, people have recently been trying much lower precision in neural nets: the 1-2-5 scheme (1-bit weights, 2-bit activations, and 5-bit gradients) seems to work well on easy datasets (MNIST and CIFAR-10). On ImageNet, however, the results are significantly worse than with full precision (16 or 32 bit). To reach state-of-the-art accuracy, convnets don't need more than 16 bits for training, but current RNNs might. For inference on ImageNet, 4-5 bit weights (stochastically rounded from full precision) should be enough.
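The stochastic rounding mentioned above can be sketched as follows. This is a hedged NumPy illustration of the general technique (unbiased rounding to a symmetric uniform grid), with invented function names, not any specific paper's recipe.

```python
import numpy as np

def stochastic_round_weights(w, bits=4, rng=None):
    """Stochastically round float weights to a symmetric k-bit grid.

    Each value rounds up or down with probability proportional to its
    distance from the neighbouring grid points, so the rounding is
    unbiased in expectation: E[rounded] == w.
    """
    rng = np.random.default_rng() if rng is None else rng
    levels = 2 ** (bits - 1) - 1        # e.g. 7 positive levels for 4 bits
    scale = np.abs(w).max() / levels    # size of one grid step
    x = w / scale                       # position on the integer grid
    floor = np.floor(x)
    # Round up with probability equal to the fractional part of x.
    rounded = floor + (rng.random(w.shape) < (x - floor))
    return np.clip(rounded, -levels, levels) * scale

w = np.random.randn(1000).astype(np.float32)
# Averaging many independent roundings recovers the original weights,
# which is what makes the scheme unbiased.
avg = np.mean([stochastic_round_weights(w, bits=5) for _ in range(200)], axis=0)
```

The unbiasedness is what lets such aggressively rounded gradients still drive training: individual updates are noisy, but the noise cancels out in expectation.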
