Online runs get stuck at “wandb: - 0.000 MB of 0.007 MB uploaded”.
Offline runs work well, but if I use “wandb sync ”, it gets stuck at "Syncing: Weights & Biases … "
I reproduced this on another computer with the latest version of wandb (0.17.4).
I can successfully ping wandb.ai and browse wandb pages with Chrome.
Hey @llk19, thank you for writing in. To confirm, you are running both offline and online runs, and neither of them is uploading to wandb. Are you able to see your metrics upload in real time, and is it only the files that fail to upload?
Are both of the devices you ran this on connected to the same network?
Hi there, I wanted to follow up on this request. Please let us know if we can be of further assistance or if your issue has been resolved.
Hi artsiom, thank you for your help. The error disappeared 6 days ago, and I haven’t run into it since.
Hi!
Today I am running into the same problem (wandb-0.17.0 and wandb-0.17.6).
When I train the model for 1 or 10 epochs, everything works fine.
However, when I train for more epochs (e.g. 40), wandb gets stuck at the upload:
It seems wandb cannot finish the upload procedure when it is almost complete.
What could be the reason here? This didn't happen during my last training runs, about 3 weeks ago.
Thanks in advance!
merged posts----------------------------------
I have been having the same issue as @volateq, yesterday and today. The training freezes (no failure and no progress) after 10 hours of training.