How to override existing statistics in plot when resuming from a checkpoint?

chulabhaya · January 23, 2023, 11:33am

Hello all! I’m trying to set up proper training checkpointing and resuming for my code and thus far I’ve gotten things to work but there is still one thing I am trying to figure out, which is how to get the logs in wandb to get overwritten/replaced after I load a checkpoint.

For instance, right now in my code if I save a checkpoint at 5000 timesteps, let training run for a few more thousand timesteps, cancel it, and then load and resume training from that 5000 step checkpoint, a training plot will look like this:

This is because the built-in Wandb Step value didn’t also reset back to 5k for when I restarted training from the 5k checkpoint, it just kept going. What I would instead like to have happen is that the Step value is synced with when I save the checkpoint so that when I resume, the existing plot is overridden, rather than continued. Is it possible to do this? Thanks in advance!

luis_bergua · January 24, 2023, 10:59am

Hi @chulabhaya, thanks for your question! If I’m understanding you properly, you’re resuming your run with wandb.init(id='run_id', resume='allow') and then trying to override losses/alpha_loss with wandb.log({'losses/alpha_loss':value}, step=x) where x is an already logged step? If this is the case this isn’t possible now as when you resume a run you can only log from the latest step (or a higher one). I can create a new feature request for this if you want, just explain me your use-case to share the full context with our Product Team. Thanks!

luis_bergua · January 26, 2023, 2:01pm

Hi Chulabhaya,

We wanted to follow up with you regarding your support request as we have not heard back from you. Please let us know if we can be of further assistance or if your issue has been resolved.

Best,
Luis

system · March 25, 2023, 10:59am

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Resume logging and deleting specific steps W&B Help dashboard , wandb	4	904	January 8, 2024
Overwrite previously logged metrics when resuming a run W&B Help wandb	6	954	March 14, 2025
Wandb Resume Logging W&B Help dashboard , wandb , beginner-friendly	3	1989	February 12, 2023
How to continue a specific run after stopping? W&B Help wandb	7	6690	June 12, 2022
Bug: After resuming run at a given step, previously logged images disappear W&B Help wandb	0	57	February 28, 2025

How to override existing statistics in plot when resuming from a checkpoint?

Related topics