What is the correct way to resume a paused or crashed run?

Hi, I am new to using WandB. I have my project set up with TensorFlow and am logging to WandB by syncing my TensorBoard: wandb.init(project='my-project', sync_tensorboard=True).

Sometimes the run crashes, or I have to pause it to retrieve certain artifacts. When the run restarts, how do I ensure that it is logged as a continuation of the previous run rather than as a new run in WandB? The step counters also seem to reset when this happens, even though they are accurate in TensorBoard.

Hi @amnikhil, thanks for writing in! You can have a look at our docs about resuming runs, but basically you need to pass the id and resume arguments when calling the init function, e.g. wandb.init(id=run_id, resume="must"), where run_id is the ID of the run you want to continue. Please let me know if this is useful for you!
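To reuse the same ID across restarts, one option is to save it to a file the first time the script runs and read it back afterwards. Here is a minimal sketch of that idea, not an official recipe. The helper names (load_or_create_run_id, resume_kwargs) and the file name wandb_run_id.txt are made up for illustration. It also uses resume="allow" instead of resume="must", on the assumption that the same script handles both the very first launch (when no run with that ID exists yet) and later restarts.

```python
import uuid
from pathlib import Path


def load_or_create_run_id(path):
    """Return the run ID stored at `path`, creating and saving one on first use.

    Hypothetical helper: the ID is just a short random string kept on disk
    so that a restarted script can pass the same value to wandb.init.
    """
    id_file = Path(path)
    if id_file.exists():
        return id_file.read_text().strip()
    run_id = uuid.uuid4().hex[:8]  # any string that is unique within the project
    id_file.write_text(run_id)
    return run_id


def resume_kwargs(run_id):
    """Build keyword arguments for wandb.init (hypothetical helper).

    resume="allow" is an assumption here: it continues the run if the ID
    already exists and starts a new run with that ID otherwise.
    """
    return {
        "project": "my-project",
        "id": run_id,
        "resume": "allow",
        "sync_tensorboard": True,
    }


# Usage (needs wandb installed and a logged-in account):
# import wandb
# run_id = load_or_create_run_id("wandb_run_id.txt")
# run = wandb.init(**resume_kwargs(run_id))
```

Keep the ID file next to your checkpoints so both survive a crash together, and delete it when you want to start a fresh run.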

