Encountering network error when running sweep

I’ve been running sweeps without issue for the past few weeks. This morning, however, I ran into the issue below.

(my_env) ➜  scripts git:(main) ✗ wandb agent usr/dir/xxxx
wandb: Starting wandb agent 🕵️
wandb: Network error (ReadTimeout), entering retry loop.

I don’t see any issues on https://status.wandb.com/. There also don’t seem to be any issues on my side connecting to other external resources (e.g., GitHub, or fetching files with wget).
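For anyone hitting the same ReadTimeout, it may help to time a request to the W&B backend itself rather than to unrelated sites. The following is a minimal sketch using only the Python standard library. The helper name `probe` is hypothetical, and the URL in the usage comment is an assumption: it is the default public API host, so substitute your own server if you use a self-hosted or dedicated deployment.

```python
import socket
import time
import urllib.error
import urllib.request


def probe(url, timeout=10.0):
    """Send one GET request and return (reachable, detail, elapsed_seconds).

    `probe` is a hypothetical helper for illustration, not part of wandb.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            detail = f"HTTP {resp.status}"
            reachable = True
    except urllib.error.HTTPError as exc:
        # The server answered, so the network path works even if the
        # status code itself is an error.
        detail = f"HTTP {exc.code}"
        reachable = True
    except (urllib.error.URLError, socket.timeout, OSError) as exc:
        # DNS failure, refused connection, or timeout: no usable response.
        detail = f"{type(exc).__name__}: {exc}"
        reachable = False
    return reachable, detail, time.monotonic() - start


# Example usage (assumed default host; adjust for your deployment):
# print(probe("https://api.wandb.ai", timeout=10))
# print(probe("https://github.com", timeout=10))
```

If the W&B host times out while other sites respond quickly, that points at the connection to W&B specifically (or at a proxy or firewall rule) rather than at general connectivity.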

Hi Ivan,

Could you send me the debug logs that were generated after you tried running this sweep?

Cheers,
Artsiom

Is there a specific place I should be looking for the related log? I have the wandb directory that contains the logs for each of my prior runs, but in this case, I can’t even get the run to start. It just hangs after displaying "wandb: Network error (ReadTimeout), entering retry loop."
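As a rough way to check whether anything was written at all, the sketch below lists the most recently modified debug logs under the local wandb directory. It assumes the usual layout, where files named debug.log and debug-internal.log sit in the wandb folder and in each run's logs subfolder. The function name `find_debug_logs` is hypothetical, and if the agent never starts a run there may be nothing new to find.

```python
from pathlib import Path


def find_debug_logs(wandb_dir="wandb", limit=5):
    """Return up to `limit` debug log paths under wandb_dir, newest first.

    Hypothetical helper for illustration; assumes logs match debug*.log.
    """
    root = Path(wandb_dir)
    if not root.is_dir():
        return []
    # Collect every matching file, including those inside run-* folders.
    logs = [p for p in root.rglob("debug*.log") if p.is_file()]
    # Sort by modification time so the latest attempt comes first.
    logs.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return logs[:limit]


# Example usage, run from the directory where the agent was launched:
# for path in find_debug_logs("wandb"):
#     print(path)
```

Comparing the newest modification time against the time of the failed agent launch shows whether the hang produced any log output.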

Edit, more info:

  1. The sweep I’m having issues with had already finished. A few runs in this sweep crashed, so I want to rerun them. I deleted the crashed runs, resumed the sweep in Sweep Controls, and launched my agents as usual before encountering this network error.

  2. I upgraded to wandb-0.15.0, but the issue persists.

  3. I’m able to run new sweeps from scratch.

Dang. Since there are no logs, do you happen to have some code that reproduces the issue, along with any logs or output produced when you run into the error?

Hi Ivan,

We wanted to follow up with you regarding your support request as we have not heard back from you. Please let us know if we can be of further assistance or if your issue has been resolved.

Hi Ivan,

Since we have not heard back from you, we are going to close this request. If you would like to re-open the conversation, please let us know! It seems that tickets don’t currently reopen when you write back in, so please start a new thread and link this one in it.
