I’ve been using wandb sweeps and have found that after each run finishes, the following message shows up:
wandb: Waiting for W&B process to finish… (success)
but then 10 minutes pass with nothing happening.
Only after this long delay does wandb show the run history and summary and start a new run.
I’m using wandb in a Gradient Paperspace notebook, running it from the terminal.
I haven’t found anyone else with this issue, so it may be something wrong on my side.
Do you have any idea of what the problem might be?
Hi @ogait, in your project workspace, under the overview page for the sweep and the runs associated with it, what is the status of those sweeps/runs? Additionally, we can take a look at your debug bundles to verify whether anything there is causing issues. They are the debug-internal.log files located in the working directory of the project, inside the wandb folder of the runs. Please provide logs for the runs where you are seeing issues.
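For anyone unsure where those files ended up, here is a minimal sketch that searches a local wandb folder for debug-internal.log files and lists them by size. The function name `find_internal_logs` and the default root `"wandb"` are assumptions made for illustration; adjust the root to wherever your runs were launched from.

```python
from pathlib import Path


def find_internal_logs(root="wandb"):
    """Return (path, size_in_bytes) pairs for every debug-internal.log
    found under `root`, largest file first."""
    sizes = {}
    for path in Path(root).rglob("debug-internal.log"):
        real = path.resolve()  # collapse symlinks so each file is listed once
        if real.is_file():
            sizes[real] = real.stat().st_size
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for path, size in find_internal_logs():
        print(f"{size / 1e6:8.1f} MB  {path}")
```

Sorting by size is only a convenience: unusually large logs may point at the runs worth sharing first.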
Hello Mohammad, thanks for the response.
When this happens, both the sweep and the run status are “running”.
Here is an example debug-internal.log (unfortunately the files are too big to be included in this message): Easyupload.io - Upload files for free and transfer big files easily.
Hi @ogait, thank you for providing the files. After review, it appears this is due to a performance-related bug on our end where the run exit response hangs until all the run data has synced, for example:
[wandb_run.py:_on_finish():2221] got exit ret: None
This bug shows up in runs with a lot of .log calls. It is currently in Selected For Development. I will update you here when there has been movement.
Thanks for the update Mohammad!
Can I do something to reduce the number of log calls?
I am using fastai with the following callback: WandbCallback(log_model=False, log_preds=False), and at the end of training I use wandb.summary to save six simple variables.
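One general way to cut down on log calls, when logging is done by hand, is to forward only every N-th set of metrics. The sketch below illustrates that idea under stated assumptions: `ThrottledLogger` is a hypothetical helper, not part of wandb or fastai, and this thread does not confirm whether fastai's WandbCallback offers an equivalent setting.

```python
class ThrottledLogger:
    """Hypothetical helper: forward only every `every`-th metrics dict
    to `log_fn` and silently drop the rest."""

    def __init__(self, log_fn, every=50):
        if every < 1:
            raise ValueError("every must be >= 1")
        self.log_fn = log_fn  # e.g. wandb.log (assumption: manual logging)
        self.every = every
        self.calls = 0

    def log(self, metrics):
        """Count the call; return True if the metrics were forwarded."""
        self.calls += 1
        if self.calls % self.every == 0:
            self.log_fn(metrics)
            return True
        return False


# Usage sketch (requires an active wandb run):
#   logger = ThrottledLogger(wandb.log, every=50)
#   logger.log({"train_loss": loss})
```

Dropping intermediate values trades plot resolution for fewer calls; averaging the skipped values before forwarding would be an alternative design.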