Same hyperparameters produce different results in a Bayesian sweep

I was using Sweeps for hyperparameter tuning. I intended to run a grid sweep, but by accident I kept the Bayesian sweep method (left over from the last time I tuned with Bayes). Then something weird happened.


I can understand that Bayesian search may choose the same combination of hyperparameters more than once, but why do the same hyperparameters produce different results? I checked my code, and I have definitely set the seed. Is there anything I missed?
This is the YAML config I use:

method: bayes
project: classify
name: roberta-large
metric:
  goal: maximize
  name: best_valid_metric
parameters:
  task:
    values: ["emotion"]
  batch_size:
    values: [8, 16, 32]
  plm_learning_rate:
    values: [1e-5, 2e-5, 3e-5, 4e-5, 5e-5]
  other_learning_rate:
    values: [1e-4, 2e-4, 3e-4, 4e-4, 5e-4]
  dropout:
    values: [0, 0.3, 0.5]
  model_name:
    value: 1
  num_labels:
    value: 8
command:
  - ${env}
  - ${interpreter}
  - ${program}
  - "--use_wandb"
  - ${args}
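
For completeness, roughly the same sweep expressed as a Python dictionary and launched from code would look something like the sketch below (here train is just a placeholder for my own training function, and the project name matches the one above):

import wandb

sweep_config = {
    "method": "bayes",
    "name": "roberta-large",
    "metric": {"goal": "maximize", "name": "best_valid_metric"},
    "parameters": {
        "task": {"values": ["emotion"]},
        "batch_size": {"values": [8, 16, 32]},
        "plm_learning_rate": {"values": [1e-5, 2e-5, 3e-5, 4e-5, 5e-5]},
        "other_learning_rate": {"values": [1e-4, 2e-4, 3e-4, 4e-4, 5e-4]},
        "dropout": {"values": [0, 0.3, 0.5]},
        "model_name": {"value": 1},
        "num_labels": {"value": 8},
    },
}

# Register the sweep and run an agent against it.
sweep_id = wandb.sweep(sweep_config, project="classify")
wandb.agent(sweep_id, function=train)  # train = placeholder for the training entry point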

Dear Zhuojun,

would you be able to confirm if you are using wandb server locally or our public cloud offering?

This is a known bug that has now been fixed in our latest version of wandb server, 0.31.0, which was released yesterday.

Upgrading to this version should fix the issue you are experiencing, where the sweep repeats combinations that should be distinct permutations of parameters.

Warm regards,

Frida

OK, thank you. I used the public cloud offering for hyperparameter tuning yesterday; maybe it had not been updated yet at that point. But I’m still curious why the same parameter choices produce different results in a Bayes sweep :joy:
What matters more is that I can’t reproduce the best result in the picture I have shown :smiling_face_with_tear: although repeated training runs stay consistent on my machine when I try to find out whether there are faults in my code.

Well, I finally found the problem. I use an LSTM in my code, and there are known “non-determinism issues for RNN functions on some versions of cuDNN and CUDA.”
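
In case anyone else runs into this, here is a minimal sketch of the settings I ended up using to force reproducible runs, assuming a PyTorch setup (these are the standard torch/cuDNN reproducibility switches; note that torch.use_deterministic_algorithms can raise an error for ops that have no deterministic implementation):

import os
import random
import numpy as np
import torch

def set_deterministic(seed: int = 42) -> None:
    # Seed every RNG the training loop touches
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Force cuDNN to pick deterministic kernels (slower, but reproducible)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Needed by some cuBLAS ops on CUDA >= 10.2 when determinism is enforced
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
    torch.use_deterministic_algorithms(True)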

Dear Zhuojun,

Thanks for sharing your insights on CUDA’s non-deterministic behavior; I was not aware of this myself. I also wanted to follow up on your question about Bayesian sweeps and advise that we currently don’t offer a way to set a random state for the sweep, so there will be some variability in, for example, how it maximizes a particular metric.

I will ensure that your use case is added to a feature request for this.

Please let us know if there is anything else that you would like assistance with at this time.

Best,

Frida

Hi Frida, is there any workaround to prevent this in the public cloud version? I’m having agents repeat parameter combinations 5+ times after only 3-5 completed runs. This is in a search space of only 120 combinations, so using Bayes is currently seeming to be more pain than it’s worth.

Hi Hubert,

Thank you for messaging and sorry that you’re not getting the behavior that you are anticipating. I wonder if you would be able to share the config that you are using so I can spin it up on my side?

I think it is technically possible for a Bayesian sweep to arrive at repeated parameters if the best set of parameters is reached quickly, and I would be very curious/grateful if you could advise whether the parameters you are seeing repeated do reflect your most accurate models.

Best,

Frida

Hi Hubert,

Wanted to check in: I see that on the original thread this was marked as solved. I can look into this further if helpful, but it would be great if you could share the config that you are using, in either .yaml or Python dictionary format.

Look forward to hearing back from you.

Frida

Hi Hubert,

Going to go ahead and close this off for you as we’ve not heard back. Let me know if you need any further help now or in the future.

Best,

Frida
