Resume logging and deleting specific steps


I am having trouble resuming my wandb logging from a specific earlier step.

[Problematic Situation]
For example,

  1. I have trained my network for 200K iterations, and logged everything up to this iteration on wandb, BUT without explicitly specifying the current step.
  2. I have found some wrong configurations in my scheduler.
  3. Thus, I resume the overall training, starting from 150K iterations.

[The problem]
In this case, since I have not explicitly passed the step to wandb_run.log(), the log continues from 200K while I'm training my network at 150K. As a result, the steps in the logs no longer match the steps of the actual training.

[What I want to do]
I want to manually delete the logs from 150K to 200K, so that I can resume logging at the appropriate step without explicitly passing the current step to wandb_run.log().

Thank you.

Hi @2minkyulee, please note that step must be monotonically increasing across calls; otherwise the step value is ignored in your call to log(). If you attempt to resume at an older step, we will ignore this and continue at the last known step of the run. You may want to look into a custom axis to see if this would help you define your x-axis increments; see the documentation on customizing log axes. If you are still running into problems, provide us with a toy script showing how you are currently training and resuming runs, highlight where you are running into logging issues, and we'll take a look. Thanks
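For readers following along, here is a minimal sketch of the custom-axis approach mentioned above. It is not the original poster's code: the project name, the run ID argument, the metric names (`train/step`, `train/loss`, `train/lr`), and the placeholder loss are all assumptions made for illustration. The idea is to log the trainer's own iteration counter as an ordinary metric and tell wandb to plot the other metrics against it, so the x-axis no longer depends on wandb's internal step.

```python
def make_payload(step, loss, lr):
    """Bundle metrics with the trainer's own iteration counter.

    The key names are illustrative assumptions, not required by wandb.
    """
    return {"train/step": step, "train/loss": loss, "train/lr": lr}


def resume_and_log(run_id, start_step, num_steps, project="my-project"):
    """Resume an existing run and log against a custom x-axis.

    `run_id` and `project` are hypothetical values supplied by the caller.
    """
    # Imported here so make_payload can be used without wandb installed.
    import wandb

    run = wandb.init(project=project, id=run_id, resume="allow")

    # Plot every train/* metric against train/step instead of the
    # internal, monotonically increasing wandb step.
    run.define_metric("train/step")
    run.define_metric("train/*", step_metric="train/step")

    for step in range(start_step, start_step + num_steps):
        loss = 1.0 / (step + 1)  # placeholder for the real training loss
        run.log(make_payload(step, loss, lr=1e-4))

    run.finish()
```

A call such as `resume_and_log("abc123", start_step=150_000, num_steps=50_000)` would then place new points at 150K on the custom axis. Note that this only changes how new points are plotted; it does not delete anything already logged to the run.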

Hello @2minkyulee ,

We would like to follow up on this case. Are you still running into logging issues?

Hi @2minkyulee, since we have not heard back from you, we are going to close this request. If you would like to re-open the conversation, please let us know!
