Where and how should I do large-scale supervised machine learning?
I'm a beginner in ML and I have a large dataset with 15 features and 6M rows, which makes it challenging to work with locally. I can train a single model locally, but when I try hyperparameter tuning and cross-validation on my MacBook Pro, it runs out of memory and is too slow. I tried Spark, but it gave poor results, so I would prefer to stay in the native Python ecosystem of pandas and sklearn.
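To make the workload concrete, this is roughly the kind of tuning run I mean. It is only a minimal sketch on synthetic data: the 15 features match my dataset, but the row count is cut down drastically, and the random forest model, the search space, and all variable names are assumptions chosen for illustration rather than my actual setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for the real data: 15 features, far fewer rows than 6M.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 15)).astype(np.float32)  # float32 takes half the memory of float64
y = (X[:, 0] + X[:, 1] > 0).astype(np.int8)

# Hypothetical search space, picked only for illustration.
param_distributions = {
    "n_estimators": [20, 50],
    "max_depth": [4, 8, None],
    "min_samples_leaf": [1, 5, 20],
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=4,        # sample 4 configurations instead of trying the full grid
    cv=3,            # 3-fold cross-validation
    n_jobs=1,        # more workers are faster but may increase memory use
    random_state=0,
)
search.fit(X, y)

# Total model fits: sampled configurations x folds, plus one final refit.
n_fits = 4 * 3 + 1
print(search.best_params_, round(search.best_score_, 3), n_fits)
```

Even this small search fits the model 13 times, so on the full 6M rows the cost is many times that of the single model I can train today.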
So I want to know what my options are. How do professionals do it? Should I provision a cloud VM with high memory and CPU, or are there other cloud-based or SaaS platforms that I can check out?
Topic cloud supervised-learning pyspark random-forest scalability
Category Data Science