How to distinguish resumed runs during sweeps?

I’m looking into WandB’s Sweep feature for my next project and am currently trying to implement the resume mechanism.

I use the following code to restore my model:

import wandb

wandb.init(resume=True)

if wandb.run.resumed:
    model = wandb.restore("last.ckpt")
else:
    model = ... # instantiate new model

However, wandb.run.resumed is apparently always True, since the wandb agent sets the WANDB_RUN_ID environment variable, so the restore fails for new runs. What is a good way to handle this?

Hi,
Sorry this was missed. I have forwarded it to support.

Hi @cschell,

I just tested this on my end: wandb.run.resumed is only True when the last run executed in the directory exited with a nonzero exit code. When the previous run exits with a zero exit code, wandb.run.resumed is False.

I suspect you might always be getting True because the previous run crashes on wandb.restore. Could you try instantiating a new run which creates “last.ckpt” and then try resuming?
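If a crash inside wandb.restore is indeed what keeps the flag at True, one defensive option is to treat wandb.run.resumed only as a hint and fall back to a fresh model whenever no checkpoint can actually be fetched. The following is a minimal sketch under that assumption, not an official W&B pattern. The helper name restore_or_create and its restore/create callables are hypothetical, and the broad except is deliberate because the exact exception raised for a missing file has not been verified here.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def restore_or_create(
    resumed: bool,
    restore: Callable[[], Optional[T]],
    create: Callable[[], T],
) -> Tuple[T, bool]:
    """Return (obj, was_restored).

    resumed -- the resume hint, e.g. wandb.run.resumed
    restore -- hypothetical callable that fetches the checkpoint; it may
               raise or return None when nothing is available
    create  -- callable that builds a fresh object
    """
    if resumed:
        try:
            restored = restore()
        except Exception:
            # Assumption: any failure here means "no usable checkpoint",
            # so the run should start fresh instead of exiting nonzero.
            restored = None
        if restored is not None:
            return restored, True
    # Either this is a new run or the checkpoint could not be fetched.
    return create(), False
```

With wandb, restore could be something like lambda: wandb.restore("last.ckpt"). Note that wandb.restore hands back a handle to the downloaded file rather than a model, so the actual model loading from that file is framework-specific and still has to happen afterwards.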

Thanks,
Ramit

Hi @cschell,

We wanted to follow up with you regarding your support request as we have not heard back from you. Please let us know if we can be of further assistance or if your issue has been resolved.

Best,
Weights & Biases

Hi Christian, since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!
