Differential Evolution optimal tolerance parameter
I am trying to optimize the parameters of a global optimizer for my dataset. I will have a bunch of similar datasets to process, so I need to fine-tune the optimizer so that it can find the minimum of a function in a reasonable time.
The parameter optimization takes a very long time, but it only has to be done once; after that I will just run the best combination on the other, similar datasets, hopefully finding the best value there too.
The global optimizer that I use is called differential evolution, and I use the Python/NumPy/SciPy implementation of it. Here is the Wikipedia definition; the relevant papers are in its references.
The problem is that it's extremely slow to sample enough parameter combinations to find any trend that would suggest a pattern I should follow. My problem is especially with the tol parameter, which is defined via the stopping criterion:

np.std(pop) <= atol + tol * np.abs(np.mean(population_energies))

That is, the optimizer exits when the standard deviation of the population energies drops below atol plus tol times the absolute value of their mean.
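For reference, here is a minimal sketch of that stopping test as I read it from the SciPy docs (the function and variable names here are mine, not SciPy internals):

```python
import numpy as np

def de_has_converged(population_energies, tol=0.01, atol=0.0):
    # Stop when the spread of the population energies falls below a
    # tolerance scaled by the magnitude of their mean (per the docs).
    energies = np.asarray(population_energies)
    return np.std(energies) <= atol + tol * np.abs(np.mean(energies))
```

Note that with atol=0 (the default) the threshold is purely relative to the mean energy, so a small tol is a genuinely strict requirement.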
The tricky part is that with the default tol=0.01 the optimizer exits quickly and gives you a low-accuracy value, so I run it multiple times. If I set tol to a very small number, the criterion above is never met and the run instead stops at maxiter, the maximum number of iterations the function is allowed to do. So either I run it with a big tolerance many times or with a small tolerance a few times; the question is which exit clause is better.
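For what it's worth, the returned OptimizeResult tells you which exit clause fired, which is how I classify the runs. A minimal sketch, with a toy objective standing in for my real one (the exact message strings may differ between SciPy versions):

```python
import numpy as np
from scipy.optimize import differential_evolution

def sphere(x):
    # Toy stand-in for the real 15-dimensional objective.
    return np.sum(x**2)

res = differential_evolution(sphere, [(-5.0, 5.0)] * 15,
                             tol=1e-6, maxiter=150, seed=0)
# res.success is True when the tol-based criterion stopped the run;
# if maxiter was hit first, success is False and res.message says so.
print(res.success, res.message, res.nit, res.fun)
```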
What I am measuring is which tol value can give me the best result within a fixed time budget, currently 2 hours per set. I group the values into batches {1e-01, 1e-02, ..., 1e-10} and test which batch gives the best value in the given time.
I am testing it now as we speak; however, the low tolerance values are very hard to sample because one run takes ~30 min, and I would need a sample size of 200 from each batch, which would take far too long to compute. My data is 15-dimensional and I only allow 90000 evaluations in total per run (15 * popsize * maxiter); with a low tolerance like 1e-10 that cap gets hit very often, which makes drawing samples very slow.
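To make the batching concrete, this is roughly the sweep I am running, as a sketch: the objective and bounds are placeholders for my real 15-dimensional problem, and popsize/maxiter are chosen so that 15 * popsize * maxiter = 90000.

```python
import time
import numpy as np
from scipy.optimize import differential_evolution

def objective(x):
    # Placeholder for the real 15-dimensional objective.
    return np.sum(x**2)

bounds = [(-5.0, 5.0)] * 15                # 15 dimensions
tols = [10.0 ** -k for k in range(1, 11)]  # the 1e-01 ... 1e-10 batches

for tol in tols:
    start = time.time()
    # popsize=40, maxiter=150 keeps the budget at
    # 15 * popsize * maxiter = 90000 evaluations.
    res = differential_evolution(objective, bounds, tol=tol,
                                 popsize=40, maxiter=150,
                                 polish=False, seed=0)
    print(f"tol={tol:.0e}  best={res.fun:.3e}  nfev={res.nfev}  "
          f"time={time.time() - start:.1f}s")
```

In the real experiment each tol batch would be repeated with different seeds to build up the ~200 samples per batch.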
Is there any other way to find the best tol parameter? I would guess a higher one is better, because exiting on a convergence condition is more efficient than brute-forcing through an arbitrary number of cycles, but only if the shape of my data is favorable enough, and I have no way of knowing that.
Is there any literature on rule-of-thumb parameters for the differential evolution global optimizer that are, on average, optimal across functions?
I suspect the optimal range to be around [1e-4, 1e-0], but I have no way of proving it. If it's true, then optimizing the other parameters would be a piece of cake, since runs within this tolerance range are very quick.
Tags: metaheuristics, evolutionary-algorithms, scipy, optimization, python
Category: Data Science