What is the easiest way to duplicate my model and run it on 10 cloud machines?
I have built a model on a cloud machine (Google Cloud). A single run takes from a few hours to a day. I need to scan some hyperparameters, such as learning rate and batch size.
How do I duplicate my Compute Engine instance, run the model with different parameters on each copy, collect the results, and shut the copies down?
Edit: I have a neural network model that runs for 24 hours. Without the cloud I would usually do a grid search: learning rate in {0.001, 0.003, 0.1} and batch size in {32, 64, 128}. That is 9 combinations, so running them one after another would take 9 days.
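To make the arithmetic concrete, here is a small sketch of the grid I have in mind. The variable names are only for illustration, and the 24-hour figure is my rough estimate for one run.

```python
from itertools import product

learning_rates = [0.001, 0.003, 0.1]
batch_sizes = [32, 64, 128]
HOURS_PER_RUN = 24  # rough estimate for one training run

# Every (learning rate, batch size) combination in the grid.
grid = list(product(learning_rates, batch_sizes))

# One machine running the configurations back to back.
sequential_days = len(grid) * HOURS_PER_RUN / 24
# One machine per configuration, all running at the same time.
parallel_days = HOURS_PER_RUN / 24

print(len(grid), sequential_days, parallel_days)  # prints: 9 9.0 1.0
```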
With cloud computing I can finish this grid search in 24 hours, but at the moment I have to do the following by hand: save my original Compute Engine instance as a snapshot, create 8 more instances from the snapshot, start them all, run the model on each instance with a different parameter combination, copy the results back to the original instance, and shut each instance down once the copy is done.
The question is: how do I automate this? Answers for GCP or AWS are welcome.
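For reference, the manual steps correspond roughly to the command sequence below, written as a Python dry run that only prints the `gcloud` commands and does not execute anything. This is a sketch under assumptions: the zone, the disk, snapshot and worker names, the `train.py` entry point with its `--lr` and `--batch-size` flags, and the result path are all hypothetical placeholders, and I have not verified the exact `gcloud` flags against my installed version. For simplicity it creates one worker per configuration (9) instead of reusing the original machine for one of them.

```python
from itertools import product
import shlex

# All of these values are hypothetical placeholders, not real resources.
ZONE = "us-central1-a"
SOURCE_DISK = "model-disk"
SNAPSHOT = "model-snapshot"
TRAIN_CMD = "python train.py"   # assumed training entry point
RESULT_PATH = "~/results.json"  # assumed output file on each worker


def build_plan(learning_rates, batch_sizes):
    """Return the gcloud commands (as argument lists) for one grid search."""
    configs = list(product(learning_rates, batch_sizes))
    workers = [f"worker-{i}" for i in range(len(configs))]

    # Step 1: snapshot the disk of the original instance.
    plan = [["gcloud", "compute", "disks", "snapshot", SOURCE_DISK,
             f"--snapshot-names={SNAPSHOT}", f"--zone={ZONE}"]]
    # Step 2: create one instance per configuration from the snapshot.
    for name in workers:
        plan.append(["gcloud", "compute", "instances", "create", name,
                     f"--source-snapshot={SNAPSHOT}", f"--zone={ZONE}"])
    # Step 3: start training on each worker with its own parameters.
    # In practice these would have to be launched concurrently.
    for name, (lr, bs) in zip(workers, configs):
        run = f"{TRAIN_CMD} --lr {lr} --batch-size {bs}"
        plan.append(["gcloud", "compute", "ssh", name,
                     f"--zone={ZONE}", f"--command={run}"])
    # Step 4: copy each result back to the machine running this script.
    for name in workers:
        plan.append(["gcloud", "compute", "scp", f"{name}:{RESULT_PATH}",
                     f"results/{name}.json", f"--zone={ZONE}"])
    # Step 5: delete the workers so they stop costing money.
    for name in workers:
        plan.append(["gcloud", "compute", "instances", "delete", name,
                     f"--zone={ZONE}", "--quiet"])
    return plan


plan = build_plan([0.001, 0.003, 0.1], [32, 64, 128])
for cmd in plan:
    print(shlex.join(cmd))  # dry run: print only, nothing is executed
```

Printing the plan first at least lets me check the commands before anything is created or deleted. What I am missing is a robust way to launch the training steps in parallel and wait for all of them to finish.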
Topic cloud-computing
Category Data Science