Average performance over seeds for Bayesian hyperparameter optimizer

Dear W&B Support Team,

In ML, performance from a single seed is not a reliable metric to feed to a Bayesian optimizer. This is especially pronounced in RL, where instability is so high that a SOTA method can perform worse than the baseline on an unlucky run, and vice versa [1]. This makes the W&B Bayesian optimizer currently unusable (or rather, unscientific to use) for tuning RL hyperparameters.

I believe the community has voiced support for this feature in numerous threads (listed below), but no updates or even workarounds have been provided. I hope this post serves as a +1 and a summary of previous requests, and finally moves the needle on this feature.

Personally, not being able to use the W&B Bayes optimizer means I have to either settle for the random search option or spend considerable time implementing a seed-aggregation wrapper (sketched below).
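
For reference, here is a minimal sketch of what such a wrapper could look like, assuming the standard W&B Python API (`wandb.sweep` / `wandb.agent`); `train_one_seed` and the `mean_return` metric name are placeholders for illustration, not part of any official feature. Each sweep trial runs the same hyperparameters over several seeds and logs only the aggregate metric that the Bayesian optimizer targets:

```python
# Sketch of a seed-aggregation wrapper (not an official W&B feature).
# Replace train_one_seed with your actual training loop.
import numpy as np
import wandb

SEEDS = [0, 1, 2]  # seeds averaged within a single sweep trial


def train_one_seed(config, seed):
    """Placeholder: run one full training job and return its final score."""
    rng = np.random.default_rng(seed)
    # ... real RL training would go here ...
    return float(config.learning_rate * 100 + rng.normal(scale=5.0))


def sweep_trial():
    run = wandb.init()  # receives the hyperparameters chosen by the sweep
    scores = [train_one_seed(run.config, seed) for seed in SEEDS]
    # Log the aggregate; the sweep's metric.name must point at this key so
    # the Bayesian optimizer ranks configs by mean performance across seeds.
    run.log({"mean_return": float(np.mean(scores)),
             "std_return": float(np.std(scores))})
    run.finish()


if __name__ == "__main__":
    sweep_config = {
        "method": "bayes",
        "metric": {"name": "mean_return", "goal": "maximize"},
        "parameters": {"learning_rate": {"min": 1e-5, "max": 1e-2}},
    }
    sweep_id = wandb.sweep(sweep_config, project="seed-aggregate-demo")
    wandb.agent(sweep_id, function=sweep_trial, count=20)
```

This treats the mean across seeds as a single observation per trial, at the cost of running the seeds sequentially inside one run; native support would avoid both the boilerplate and that serialization.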

Previous requests:

Reference:

[1] Eimer, Theresa, Marius Lindauer, and Roberta Raileanu. ‘Hyperparameters in Reinforcement Learning and How To Tune Them’. arXiv preprint arXiv:2306.01324, 2 June 2023. https://arxiv.org/abs/2306.01324

I’m sending this off to the support team to see what can be done.

We have a solution in the Kempner Institute handbook; feel free to submit an issue to our GitHub repository if you encounter any problems: https://github.com/KempnerInstitute/optimizing-ml-workflow/tree/main/workshop_exercises/wandb_aggregate