I’m trying to run a Bayesian hyperparameter sweep over some inference parameters. I ran the sweep initialization command and then started 6 parallel workers on 6 different GPUs. Each worker completes one run, but something goes wrong on the second run: even though the runs complete and their results are saved to the wandb table, the runs start repeating themselves. The third run has the same hyperparameters as the second one and even appears to be the same run in the wandb table, so the total count of runs always stays at 12.
This is the sweep, which I’m happy to share privately if useful.
Thank you very much for the help!
Hello @gcorso !
Could you send the debug logs of two runs that have the same hyperparameters?
They should be located in the wandb folder in the same directory as where the script was run. The wandb folder contains folders named run-DATETIME-ID, each associated with a single run. Could you retrieve the debug-internal.log files from one of these folders, specifically from a run that is duplicating hyperparameters? If you don’t feel comfortable posting them here, feel free to email the debug logs to email@example.com with the title “For Raphael - Debug Logs” and I can handle it from there.
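In case it helps with locating the files, here is a minimal sketch that lists every debug-internal.log under the run folders. It assumes the wandb folder sits in the current working directory and searches each run-DATETIME-ID folder recursively, since the log may be in a subfolder. The function name find_debug_logs is hypothetical and not part of wandb.

```python
from pathlib import Path


def find_debug_logs(wandb_dir="wandb"):
    """Map each run folder name to the debug-internal.log files inside it.

    Hypothetical helper, not part of the wandb library.
    """
    logs = {}
    root = Path(wandb_dir)
    # Run folders are named run-DATETIME-ID; search each one recursively
    # because the log file may be nested in a subfolder.
    for run_dir in sorted(root.glob("run-*")):
        if not run_dir.is_dir():
            continue
        found = sorted(run_dir.rglob("debug-internal.log"))
        if found:
            logs[run_dir.name] = found
    return logs


if __name__ == "__main__":
    # Assumes the script is launched from the directory that contains "wandb".
    for run_name, paths in find_debug_logs().items():
        for path in paths:
            print(run_name, path)
```

The ID at the end of each folder name should make it possible to match a folder to the duplicated run shown in the wandb table.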