Just wondering if you have the constant liar algorithm implemented internally for suggesting hyper-parameters in parallel. If I understand the wandb API correctly, it is geared more toward sequential suggestions. Since our company can run pods in parallel, it would be amazing if you could implement this on your end, rather than us hacking it on ours.
The basic idea is that for the first pod (in a parallel set) it will suggest hyper-parameters as usual, but for the 2nd and later pods started in parallel, the sampler is told that the still-running suggestions already returned the worst loss seen so far. The logic is that the next suggested hyper-parameters will then be far away from the ones suggested to the first pod. You could probably be smarter here, since wandb has access to loss metrics as a run trains, but that would be a side project.
Here is a link with more depth: nni/ParallelizingTpeSearch.md at 98f66f76d310b0e0679823d966fdaa6adafb66c2 · microsoft/nni · GitHub
Edit 1: follow-up question: do you use anything more advanced than sklearn GPs for bayes search? (I'm basing my question on this.)
Hi Sachin, we currently use the Bayes search implementation (https://github.com/wandb/sweeps/blob/master/src/sweeps/bayes_search.py) for running parallel sweeps. I can put in a ticket for this request, but can you tell me why your company prefers the Constant Liar algorithm?
Hi @lesliewandb, Thanks for getting back to me. I feel like the CL algorithm is independent of whether it is Bayes Search or TPE.
Consider this example. Suppose we have conducted 1000 sweeps already, so the sampler is more or less confident about the parameter space. The problem with the current implementation is that if I were to spin up the next 5 sweeps in parallel, wandb ignores the fact that they are happening in parallel and independently suggests 5 sets of hyper-parameters. There is a good chance that all 5 sets are extremely similar. However, if each pending sample “lies” and says that it was a bad location, it forces the sampler to look at a different location and makes sure we “explore” the hyper-parameter space better.
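To make the example concrete, here is a minimal Python sketch of the idea. It is not wandb's implementation: `toy_suggest` is a hypothetical stand-in for a model-based sampler (a simple kernel-weighted average, not a GP or TPE), and `naive_batch` and `constant_liar_batch` are names made up for illustration. It assumes a single hyper-parameter in [0, 1] and that lower loss is better.

```python
import math


def toy_suggest(observations, length_scale=0.1, n_candidates=101):
    """Hypothetical stand-in for a model-based sampler (NOT wandb's GP).

    Predicts the loss at each candidate in [0, 1] as a Gaussian-kernel-weighted
    average of the observed losses and returns the candidate with the lowest
    predicted loss.
    """
    prior_mean = sum(loss for _, loss in observations) / len(observations)
    prior_weight = 1e-6  # keeps the average defined far away from all data

    def predicted_loss(x):
        num, den = prior_weight * prior_mean, prior_weight
        for xi, yi in observations:
            w = math.exp(-((x - xi) ** 2) / (2 * length_scale ** 2))
            num += w * yi
            den += w
        return num / den

    candidates = [i / (n_candidates - 1) for i in range(n_candidates)]
    return min(candidates, key=predicted_loss)


def naive_batch(observations, suggest, batch_size):
    """Ask for every parallel suggestion independently, ignoring pending runs."""
    return [suggest(list(observations)) for _ in range(batch_size)]


def constant_liar_batch(observations, suggest, batch_size):
    """Constant liar: pretend each pending suggestion returned the worst loss seen."""
    lie = max(loss for _, loss in observations)  # assumes lower loss is better
    augmented = list(observations)  # copy, so the real history is untouched
    batch = []
    for _ in range(batch_size):
        x = suggest(augmented)
        batch.append(x)
        augmented.append((x, lie))  # fake result for the still-running pod
    return batch


# Toy history: loss = (x - 0.3)^2 observed on a coarse grid.
history = [(i / 10, (i / 10 - 0.3) ** 2) for i in range(11)]

naive = naive_batch(history, toy_suggest, batch_size=5)
liar = constant_liar_batch(history, toy_suggest, batch_size=5)
print("independent suggestions:  ", naive)
print("constant liar suggestions:", liar)
```

With this deterministic toy sampler, the independent batch repeats the same point 5 times, while the second constant liar suggestion already moves away from the first. The fake results only live in the temporary copy, so they would be replaced by the real losses once the pods finish.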
So basically what I’m asking for is the constant liar algorithm on top of bayes search. I do, however, think that the TPE algorithm is better than sklearn’s GPs, but that can be another discussion.
I see, thank you for the clarification! I’ll create a ticket for this and I’ll let you know if there are any updates.