How do data types influence hardware (CPU / GPU / TPU) performance?

I am currently dealing with a relatively big data set, for which I have some memory usage concerns. It contains most of the usual data types: floats, integers, Booleans, and character strings, some of which were converted to factors.

From a memory usage (RAM) standpoint it is pretty clear what happens when I switch a column from float64 to float32: the memory usage is divided by two. For string-to-factor conversion it is a bit less clear, but I get the general idea that, memory-wise, factors are stored as small integers and remapped when necessary. It doesn't really matter much anyway, as strings are generally dealt with in pre-processing by being encoded as numerical columns one way or another.
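For illustration, this is the kind of check I mean (dummy data, made-up column names):

```python
# Rough memory check: downcast a float64 column to float32 and a string
# column to a pandas category, then compare per-column memory usage.
import numpy as np
import pandas as pd

n = 1_000_000
df = pd.DataFrame({
    "x": np.random.rand(n),                                       # float64 by default
    "city": np.random.choice(["Paris", "Lyon", "Nice"], size=n),  # string column
})

print(df.memory_usage(deep=True))           # bytes per column before conversion

df["x"] = df["x"].astype("float32")         # numeric footprint is roughly halved
df["city"] = df["city"].astype("category")  # stored as small integer codes + a lookup table

print(df.memory_usage(deep=True))           # bytes per column after conversion
```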

Now what I am wondering is how these data types influence calculations further down the pipeline (pre-processing, calibration). I somehow got the idea that switching from float64 to float32 would make CPU calculations twice as fast, but the gains I see vary wildly. So, for the parts that matter (pre-processing, calibration), how do data types impact hardware performance?
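For instance, a crude timing of a single operation like the one below is what I mean by "vary wildly"; the ratio depends heavily on the operation and the underlying library:

```python
# Crude benchmark of one CPU-bound operation (a BLAS matrix multiply)
# in float64 vs float32; not representative of a whole pipeline.
import time
import numpy as np

a64 = np.random.rand(2000, 2000)          # float64
a32 = a64.astype("float32")

def bench(a, repeats=5):
    t0 = time.perf_counter()
    for _ in range(repeats):
        a @ a
    return (time.perf_counter() - t0) / repeats

print("float64:", bench(a64))
print("float32:", bench(a32))
```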

Some clarifications / simplifying assumptions:

  • I have the intuition that some weird things may happen in the CPU if you mix data types, or that convergence may suffer if you go too low in precision (e.g. float16). So let's assume we are just dealing with float64 to float32 conversion.

  • I am trying to get a language / library / model agnostic answer. I am mostly using Python + pandas if that matters (to be honest I've never really had to handle data types in R), and Python + cuDF for GPU. If it matters, I am using TensorFlow + Keras to build a NN. Notably, I've called tf.keras.backend.set_floatx('float32'). Similarly, I've carefully made sure that the pre-processing pipeline handles float32 data types correctly and that the code is vectorized (a rough sketch of this setup follows this list).

  • As I am getting involved in Kaggle competitions with different accelerators (GPU / TPU), I'd like an answer that covers the main kinds of hardware (CPU / GPU / TPU). I hope the answer doesn't really depend on brands.
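For concreteness, the setup described in the second bullet looks roughly like this (model and data are just placeholders):

```python
# Sketch of the setup above: pandas data downcast to float32 and Keras
# configured to use float32 as its default float type, so nothing is
# silently upcast on the way into TensorFlow.
import numpy as np
import pandas as pd
import tensorflow as tf

tf.keras.backend.set_floatx("float32")      # default float dtype for Keras layers

# Placeholder data: every float column downcast to float32 before training.
X = pd.DataFrame(np.random.rand(10_000, 8)).astype("float32")
y = np.random.rand(10_000).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

model.fit(X.to_numpy(), y, epochs=1, batch_size=256, verbose=0)
```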

Topic: hardware performance bigdata

Category: Data Science


The variation in performance gains you are seeing from reduced precision might be due to different frameworks using different types. Even after you downcast your data, some operations will silently upcast it again. You mention using pandas and TensorFlow / Keras; mixing frameworks can lead to unwanted recasting of data types. It is better to stay within a single framework to avoid recasting.
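For example, here are two common cases (in NumPy / pandas, which you mention using) where float32 data is silently promoted back to float64:

```python
# Two common ways float32 data gets silently upcast back to float64.
import numpy as np
import pandas as pd

a32 = np.random.rand(1000).astype("float32")
a64 = np.random.rand(1000)                      # float64 by default

print((a32 + a64).dtype)                        # float64: mixed arithmetic upcasts

df = pd.DataFrame({"x": a32, "y": a64})
print(df.dtypes)                                # one float32 column, one float64 column
print(df.to_numpy().dtype)                      # float64: a single array needs a common dtype
```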

There is no software-agnostic (i.e., language-, library-, or model-agnostic) answer. Hardware utilization is software specific. To get the gains from reduced precision, the same software stack should be used throughout the entire modeling process, and that software should be designed for the specific chip (i.e., CPU, GPU, or TPU).
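As one TensorFlow-specific illustration (since that is your stack), chip-aware reduced precision is typically obtained through the framework's mixed precision API rather than by casting data yourself; "mixed_float16" targets GPU Tensor Cores and "mixed_bfloat16" targets TPUs. The model below is a placeholder:

```python
# Framework-level mixed precision: low-precision compute dtype chosen for
# the chip, while variables stay in float32 for numerically stable training.
import tensorflow as tf

tf.keras.mixed_precision.set_global_policy("mixed_float16")  # or "mixed_bfloat16" on TPU

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(32, activation="relu"),
    # Keep the final layer in float32 so the loss is computed in full precision.
    tf.keras.layers.Dense(1, dtype="float32"),
])
model.compile(optimizer="adam", loss="mse")

print(model.layers[0].compute_dtype)   # float16: matmuls run in half precision
print(model.layers[0].variable_dtype)  # float32: weights stay in full precision
```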
