Sometimes (for example in RL) agents are very unstable, and you only know how a config behaves once you have tested it on 5-10 seeds. So I was wondering: is there a feature in wandb sweeps that aggregates a metric over multiple seeds (with the same config values)?
I know one solution is to write a for loop in my own training script that repeats the same config over several seeds, but I would like these runs to execute in parallel, possibly even on different machines.
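For concreteness, here is a minimal sketch of that in-script workaround, assuming a hypothetical `train(config, seed)` function; everything here is a placeholder, not wandb API:

```python
# Sketch of the sequential workaround: run one config over several seeds
# inside the training script, then aggregate. `train` is a hypothetical
# stand-in for a real training run.
def train(config, seed):
    # Placeholder: a real implementation would train an agent with this
    # seed and return its final score.
    return config["lr"] * seed


def run_config_over_seeds(config, seeds=(0, 1, 2, 3, 4)):
    # Runs every seed one after the other (the part I'd like to parallelize)
    # and returns the mean score over seeds.
    scores = [train(config, s) for s in seeds]
    return sum(scores) / len(scores)


mean_score = run_config_over_seeds({"lr": 0.01})
```

The drawback is exactly as described above: the seeds run sequentially on one machine instead of as independent parallel runs.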
Hi @tomjur, this is an interesting question!
This is possible if you use the seed as a parameter in your sweep config and then group based on the parameters you care about.
There are a few ways to group your runs and aggregate your metrics: you can click the Group button above your runs table and choose the config keys you care about, or you can group within a single plot by editing the plot, opening its Group tab, and choosing the parameter to group on.
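A minimal sketch of a sweep config with the seed included as a parameter. The dict follows the wandb sweep config schema (`method`, `metric`, `parameters`); the metric name `episode_return` and the `lr` range are made-up placeholders:

```python
# Sweep config with the seed as an ordinary sweep parameter.
# You would register it with: sweep_id = wandb.sweep(sweep_config, project=...)
sweep_config = {
    "method": "random",
    "metric": {"name": "episode_return", "goal": "maximize"},
    "parameters": {
        # Continuous hyper-parameter sampled by the sweep (placeholder range).
        "lr": {"distribution": "log_uniform_values", "min": 1e-5, "max": 1e-2},
        # Seed as a discrete parameter, so it shows up as a config key.
        "seed": {"values": [0, 1, 2]},
    },
}
```

In the runs table you can then group on `lr` (or any non-seed config key), and the grouped plots show the metric aggregated over the runs that share that value.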
Sorry, but I still don’t understand the solution. Maybe I can be more explicit about the problem (in what follows I assume bayes / random search, since I don’t have the budget for a grid search):
Let’s say I defined a distribution over all parameters and, specifically, three discrete values for the seed. Now if an agent samples a configuration with seed 1, what forces the hyper-parameter optimization process (in the wandb controller) to select the same configuration again, but with seed 2?
If the optimization is random, a repeat is possible but unlikely (especially with a continuous random variable); if the optimization is bayes, it is also unlikely, and it might even have the bad side effect of preferring easy seeds.
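The "possible but not likely" point can be illustrated with a quick simulation: with a continuous hyper-parameter, two independently sampled configs essentially never share the exact same value, so random search never revisits a config to try another seed (the range below is an arbitrary placeholder):

```python
import random

# Draw 10,000 values of a continuous hyper-parameter, as a random-search
# sweep would, and count exact repeats.
random.seed(0)
samples = [random.uniform(1e-5, 1e-2) for _ in range(10_000)]
repeats = len(samples) - len(set(samples))
print(repeats)  # number of duplicate configs among 10,000 draws
```

With 64-bit floats the collision probability over a continuous range is astronomically small, so the same non-seed config is effectively never sampled twice.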
I think this might be a common pain point in RL (and possibly GAN) sweeps.
Thanks for sharing more detail about your question; I understand now. In essence, you want a grid search over random seeds combined with a bayes / random search over your other hyper-parameters, with the guarantee that the non-seed config values are identical across seeds.
It isn’t currently possible to do this, but it’s definitely a feature we will try to support in the future, because it’s a common workflow for people who want to run Sweeps deterministically across seeds. Thank you for adding another +1 to this feature request.
Unfortunately, my only suggestion for now is to run a grid search over the specific configurations you want to test, including the random seed in that grid.
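A sketch of what that workaround could look like; the hyper-parameter values and metric name are placeholders you would replace with the configurations you actually want to compare:

```python
# Grid search over hand-picked configs, with the seed in the grid, so
# every (config, seed) pair is run exactly once.
# You would register it with: sweep_id = wandb.sweep(grid_sweep_config, project=...)
grid_sweep_config = {
    "method": "grid",
    "metric": {"name": "episode_return", "goal": "maximize"},
    "parameters": {
        "lr": {"values": [1e-4, 3e-4, 1e-3]},  # configs chosen by hand
        "seed": {"values": [0, 1, 2, 3, 4]},   # 5 seeds per config
    },
}
# The grid method enumerates the full cross-product: 3 lrs x 5 seeds = 15 runs,
# which agents on different machines can pick up in parallel.
```

You can then group the resulting runs by the non-seed keys (here `lr`) to see each configuration's metric aggregated over its 5 seeds.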