How to efficiently store very large sparse 3D matrices?
To train a CNN, I have stacked arrays of images over observations, giving a 3D array of shape [observations x width x length]. The dataset is very sparse ($95\%$). What would be an efficient way of storing these matrices in terms of:
- format (e.g. pickle, parquet)
- structure (e.g. `scipy.sparse.csr_matrix`, list of lists)
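
To make the question concrete, here is a minimal sketch of one candidate combination, not a recommendation: each image is flattened so the stack becomes a 2D `scipy.sparse.csr_matrix` of shape [observations x (width * length)], which is written with `scipy.sparse.save_npz`. The helper names (`save_sparse_stack`, `dense_batch`), the file name, and the toy array sizes are assumptions made up for illustration.

```python
import os
import tempfile

import numpy as np
from scipy import sparse


def save_sparse_stack(path, stack):
    """Flatten a dense [observations, width, length] array to 2D CSR and save it."""
    n_obs, width, length = stack.shape
    # SciPy sparse matrices are 2D, so each image becomes one flattened row.
    flat = sparse.csr_matrix(stack.reshape(n_obs, width * length))
    sparse.save_npz(path, flat, compressed=True)
    return width, length


def dense_batch(flat, rows, width, length):
    """Densify only the requested observations and restore the image shape."""
    return flat[rows].toarray().reshape(len(rows), width, length)


# Toy data (hypothetical sizes): roughly 95% of the entries are set to zero.
rng = np.random.default_rng(0)
stack = rng.random((100, 32, 32))
stack[rng.random(stack.shape) < 0.95] = 0.0

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "images.npz")  # hypothetical file name
    width, length = save_sparse_stack(path, stack)
    flat = sparse.load_npz(path)

# Only the rows needed for a mini-batch are converted back to dense.
rows = [0, 5, 7]
batch = dense_batch(flat, rows, width, length)
```

In this layout, row slicing of the CSR matrix is cheap, so mini-batches can be densified on demand during training while the full stack stays sparse in memory and on disk.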
Topic cnn data-formats bigdata
Category Data Science