How to version data science projects with large files

I am working on a project with large data files (~300MB). I want to version my work along with the data files so that everything is always available online. I tried git-lfs, but GitHub's free tier caps git-lfs bandwidth at 1GB/month, beyond which you're blocked for the rest of the month.

What versioning tools are used by data scientists for projects with 100MB+ data files (both static and generated)?



I have used dvc. It supports data versioning, though I do not use that feature often; I mostly use its pipeline features, which work like a Makefile for data processing steps.
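A minimal sketch of both workflows, assuming a hypothetical `data/raw.csv` file, a `prepare.py` script, and an S3 bucket you control (the remote can also be SSH, GCS, Azure, or a plain local directory):

```shell
# One-time setup inside an existing git repo
dvc init

# --- Data versioning ---
# Track the large file with dvc instead of git; this writes a small
# data/raw.csv.dvc pointer file that you commit to git.
dvc add data/raw.csv
git add data/raw.csv.dvc .gitignore
git commit -m "Track raw data with dvc"

# Configure remote storage and upload the actual file contents there,
# so the git repo stays small and the data stays available online.
dvc remote add -d myremote s3://my-bucket/dvc-store
dvc push

# --- Pipeline (Makefile-like) usage ---
# Define a stage with its dependencies and outputs; dvc records it
# in dvc.yaml and only re-runs it when inputs change.
dvc stage add -n prepare \
    -d prepare.py -d data/raw.csv \
    -o data/processed.csv \
    python prepare.py

# Reproduce the pipeline (like `make`): skips stages that are up to date.
dvc repro
```

On another machine, `git clone` followed by `dvc pull` restores the exact data version that the checked-out commit points to.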
