How can Int8 (byte) operations be useful for deep learning?
Nvidia is planning to add hardware support for int8 operations to their Titan card, targeting deep learning. I am trying to understand how this is useful and what types of networks would benefit from it.
I know that using FP16 instead of FP32 should be useful for DL, but I am not sure how int8 could help. There is some research showing that you can train a network at full FP32 precision and then quantize the weights down to one byte each, but that does not speed up training or reduce its memory footprint.
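To make concrete what I mean by "round it to one byte", here is a minimal sketch of post-training symmetric linear quantization (using NumPy for illustration; the question does not assume any particular framework, and the scheme below is just one common way to do it):

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric linear quantization of FP32 weights to int8.

    Maps the range [-max|w|, +max|w|] onto [-127, 127].
    """
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an FP32 approximation of the original weights."""
    return q.astype(np.float32) * scale

# Example: quantize a random weight matrix and check the rounding error
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs error:", np.abs(w - w_hat).max())  # roughly scale / 2
```

This only shrinks the model after training is done, which is why I don't see how int8 hardware support would help during training itself.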
Topic: tensorflow, theano, deep-learning
Category: Data Science