When I call run.finish() in the Jupyter notebook, it doesn’t finish the run that was initialized by run = wandb.init(project="myproject", resume=True). Instead, it runs forever, until Ctrl+C. The run then cannot be finished correctly in any way.
I have found several similar issues here in the forums, but it always looked like some big artifacts were uploading. It’s not my case. I don’t have any big artifacts. Instead, I just did a few simple calls using LangChain to GPT.
Very rarely, it works ok. For example just now, I’ve upgraded from wandb 0.15.7 to 0.15.8, and suddenly the first run that I tried could be finished OK. But then any further run cannot be finished.
These are the last lines in debug.log:
2023-08-03 16:14:23,573 INFO MainThread:3124 [wandb_init.py:_pause_backend():418] pausing backend
2023-08-03 16:14:26,417 INFO MainThread:3124 [wandb_init.py:_resume_backend():423] resuming backend
2023-08-03 16:27:28,618 INFO MainThread:3124 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2023-08-03 16:27:28,618 INFO MainThread:3124 [wandb_init.py:_pause_backend():418] pausing backend
2023-08-03 16:27:49,824 INFO MainThread:3124 [wandb_init.py:_resume_backend():423] resuming backend
2023-08-03 16:27:49,825 INFO MainThread:3124 [wandb_run.py:_finish():1894] finishing run [the name of my run]
2023-08-03 16:27:49,825 INFO MainThread:3124 [jupyter.py:save_history():445] not saving jupyter history
2023-08-03 16:27:49,825 INFO MainThread:3124 [jupyter.py:save_ipynb():373] not saving jupyter notebook
2023-08-03 16:27:49,826 INFO MainThread:3124 [wandb_init.py:_jupyter_teardown():435] cleaning up jupyter logic
2023-08-03 16:27:49,826 INFO MainThread:3124 [wandb_run.py:_atexit_cleanup():2128] got exitcode: 0
2023-08-03 16:27:49,826 INFO MainThread:3124 [wandb_run.py:_restore():2111] restore
2023-08-03 16:27:49,827 INFO MainThread:3124 [wandb_run.py:_restore():2117] restore done
And this is in debug-internal.log:
2023-08-03 16:36:30,058 DEBUG SenderThread:3149 [sender.py:send_request():406] send_request: poll_exit
2023-08-03 16:36:31,058 DEBUG HandlerThread:3149 [handler.py:handle_request():144] handle_request: poll_exit
2023-08-03 16:36:31,058 DEBUG SenderThread:3149 [sender.py:send_request():406] send_request: poll_exit
2023-08-03 16:36:32,058 DEBUG HandlerThread:3149 [handler.py:handle_request():144] handle_request: poll_exit
Apparently it’s just growing with the same lines - now some 5 minutes after the run.finish() call, which can be seen from debug.log.
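To put a number on how long the poll_exit loop has been running, the two logs can be compared by timestamp. The following is only a rough sketch using the Python standard library: parse_ts and seconds_stuck are hypothetical helper names (not part of wandb), and the sample lines are copied from the log excerpts above.

```python
from datetime import datetime


def parse_ts(line: str) -> datetime:
    """Parse the leading timestamp of a wandb log line (hypothetical helper)."""
    # Lines start like "2023-08-03 16:27:49,825 INFO MainThread:3124 ..."
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def seconds_stuck(debug_lines, internal_lines) -> float:
    """Seconds between the last 'finishing run' entry and the last poll_exit entry."""
    finish = [parse_ts(l) for l in debug_lines if "finishing run" in l]
    polls = [parse_ts(l) for l in internal_lines if "poll_exit" in l]
    if not finish or not polls:
        raise ValueError("expected log entries not found")
    return (polls[-1] - finish[-1]).total_seconds()


# Sample lines taken from the excerpts above; in practice these would be
# read from debug.log and debug-internal.log in the run directory.
debug_lines = [
    "2023-08-03 16:27:49,825 INFO MainThread:3124 "
    "[wandb_run.py:_finish():1894] finishing run [the name of my run]",
]
internal_lines = [
    "2023-08-03 16:36:31,058 DEBUG SenderThread:3149 "
    "[sender.py:send_request():406] send_request: poll_exit",
    "2023-08-03 16:36:32,058 DEBUG HandlerThread:3149 "
    "[handler.py:handle_request():144] handle_request: poll_exit",
]

stuck = seconds_stuck(debug_lines, internal_lines)
print(f"poll_exit still looping {stuck:.0f} s after run.finish()")
```

This only measures the gap between the two log entries; it does not explain why the backend never reports the exit as complete.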
I’m running it on a simple default Azure VM with Ubuntu and Python 3.11.4.
Could you please give me some tips on where the problem might be? It’s a very annoying issue, almost blocking me from using WandB at all, because I then have to delete every run that I create. Otherwise, it remains in the “Running” state (and there is no way to finish it in the web interface).
Thank you very much