Wandb takes too much time after each run ends

ogait · August 16, 2022, 9:43pm

I’ve been using wandb sweeps and I found that after each run is finished, the following message shows up

wandb: Waiting for W&B process to finish… (success)

but then 10 minutes pass with nothing happening.
Only after this long time wandb shows the run history and summary and starts a new run.
I’m using wandb in a Gradient Paperspace notebook, running it from the terminal.

I haven’t found anyone else with this issue, so it may be something wrong on my side.
Do you have any idea of what the problem might be?

mohammadbakir · August 18, 2022, 12:58am

Hi @ogait , in your project workspace, under the overview page for the sweep/runs associated with the sweep, what is the status of those sweeps/runs? Additionally, we can take a look at your debug bundles to verify if there is anything that is causing issues. They are the debug.log and debug-internal.log files located in the working directory of the project inside the wandb folder of the runs. Please provide logs for the runs where you are seeing issues.
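For anyone gathering the same files, here is a minimal sketch of collecting them with Python. It assumes the logs sit somewhere under a `wandb` folder in the project's working directory, as described above; the helper name `find_wandb_debug_logs` is hypothetical, and the exact subfolder layout may differ between wandb versions, so the search is recursive.

```python
from pathlib import Path


def find_wandb_debug_logs(project_dir="."):
    """Return debug.log / debug-internal.log files under <project_dir>/wandb.

    Results are de-duplicated (symlinks are resolved) and sorted newest first.
    """
    wandb_dir = Path(project_dir) / "wandb"
    if not wandb_dir.is_dir():
        return []
    wanted = {"debug.log", "debug-internal.log"}
    # Search recursively because the per-run folder layout is an assumption.
    found = {
        p.resolve()
        for p in wandb_dir.rglob("debug*.log")
        if p.name in wanted and p.is_file()
    }
    return sorted(found, key=lambda p: p.stat().st_mtime, reverse=True)
```

Calling `find_wandb_debug_logs()` from the project directory returns a list of paths that can then be zipped or uploaded.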

ogait · August 18, 2022, 1:28am

Hello Mohammad, thanks for the response.
When this happens, the status of both the sweep and the run is “running”.
Here is an example of debug.log and debug-internal.log (unfortunately the files are too big to be included in this message): Easyupload.io - Upload files for free and transfer big files easily.

mohammadbakir · August 26, 2022, 2:00am

Hi @ogait , thank you for providing the files. After review, it appears this is due to a performance-related bug on our end where the run exit response hangs until all the run data syncs, for example: [wandb_run.py:_on_finish():2221] got exit ret: None. This bug shows up in runs with a lot of .log calls. The bug is currently in Selected For Development. I will update you here when there has been movement.

ogait · August 26, 2022, 9:25am

Thanks for the update Mohammad!
Can I do something to reduce the number of log calls?
I am using fastai with the following callback WandbCallback(log_model=False, log_preds=False) and at the end of training I use wandb.summary to save six simple variables.
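The thread does not answer this question directly. As a generic illustration only, one way to reduce the number of log calls in a manual training loop is to forward every N-th call. The sketch below is not a fastai or wandb feature: `ThrottledLogger` is a hypothetical helper, and it assumes you call the logging function yourself rather than through `WandbCallback`, which does its own logging internally.

```python
class ThrottledLogger:
    """Forward only every `every`-th call to `log_fn` (for example wandb.log)."""

    def __init__(self, log_fn, every=10):
        if every < 1:
            raise ValueError("every must be >= 1")
        self.log_fn = log_fn
        self.every = every
        self.calls = 0

    def log(self, metrics, step=None):
        """Count the call; forward it only when the counter hits a multiple of `every`."""
        self.calls += 1
        if self.calls % self.every == 0:
            self.log_fn(metrics, step=step)
            return True
        return False


# Stand-in for wandb.log so the sketch runs without a W&B account.
forwarded = []
logger = ThrottledLogger(lambda metrics, step=None: forwarded.append((step, metrics)), every=10)
for batch in range(100):
    logger.log({"train_loss": 1.0 / (batch + 1)}, step=batch)
print(len(forwarded))  # 10 of the 100 calls were forwarded
```

In a real script the stand-in would be replaced by `wandb.log`; whether this shortens the wait at the end of a run depends on the bug described above and is not confirmed in this thread.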

system · October 25, 2022, 9:25am

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.

mohammadbakir · June 26, 2024, 6:02pm

Hi Tiago,

A few years ago, you reached out to us regarding wandb syncing data at a very slow rate. I am pleased to announce that our engineering team has worked hard to rebuild our SDK from the ground up with a focus on significant performance improvements with up to 88% gain when logging through multiple processes. To try out our new SDK, upgrade to wandb ≥ v0.17.3 and add wandb.require("core") to your scripts for improved logging performance. We would love to have you try it out and give us your thoughts on the impact it has had on your experiment runs. If you have any questions please let us know.

Regards,
Mohammad
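A minimal sketch of the opt-in described in the post above, assuming wandb is installed and that `wandb.require("core")` is accepted by the installed release. The helper names `parse_version` and `enable_core_if_available` are hypothetical; on releases newer than the one mentioned, the call may no longer be needed, so check the release notes for your version.

```python
import re

MIN_CORE_VERSION = (0, 17, 3)  # minimum version named in the post above


def parse_version(version):
    """Turn a string like '0.17.3' or '0.18.0rc1' into a (major, minor, patch) tuple."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def enable_core_if_available():
    """Call wandb.require('core') when wandb is installed and new enough.

    Returns True if the call was made, False otherwise.
    """
    try:
        import wandb
    except ImportError:
        return False
    if parse_version(wandb.__version__) < MIN_CORE_VERSION:
        return False
    wandb.require("core")
    return True
```

Call `enable_core_if_available()` once at the top of the training script, before `wandb.init()`.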
