How to distinguish resumed runs during sweeps?

I’m looking into WandB’s Sweep feature for my next project and am currently trying to implement the resume mechanism.

I use the following code to restore my model:


    if wandb.run.resumed:
        model = wandb.restore("last.ckpt")
    else:
        model = ...  # instantiate new model

However, wandb.run.resumed is apparently always True, since the wandb agent sets the WANDB_RUN_ID environment variable, so restore fails for new runs. What is a good way to handle this?

Sorry this was missed; I have forwarded it to support.

Hi @cschell,

I just tested this on my end: wandb.run.resumed is only True when the last run in the directory exited with a nonzero exit code. When the previous run exits with a zero exit code, it is False.

I suspect you might always be getting True because the previous run crashes on wandb.restore. Could you try instantiating a new run which creates “last.ckpt” and then try resuming?
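In the meantime, one defensive option is to avoid relying on the resumed flag alone and also check that a checkpoint is actually present locally. Below is a minimal sketch of that idea, not an official wandb recipe. The helper load_or_init and its parameters are hypothetical names made up for illustration; in a real script the resumed argument would presumably come from wandb.run.resumed, and the load callback would wrap whatever restore and deserialize logic you use.

```python
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_init(
    resumed: bool,
    checkpoint: Path,
    load: Callable[[Path], T],
    init: Callable[[], T],
) -> T:
    """Hypothetical helper: restore only when both conditions hold.

    resumed    -- the run's resumed flag (assumed to come from wandb.run.resumed)
    checkpoint -- local path where the checkpoint is expected
    load       -- callback that builds the model from the checkpoint file
    init       -- callback that builds a fresh model
    """
    # A run flagged as resumed but without a checkpoint is treated as new,
    # so a fresh sweep run does not crash while trying to restore.
    if resumed and checkpoint.is_file():
        return load(checkpoint)
    return init()
```

With this guard, a new sweep run that is flagged as resumed but has no "last.ckpt" yet falls through to the fresh-model branch and exits cleanly instead of failing during restore.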


Hi @cschell,

