I’ve been using WANDB for hyperparameter tuning sweeps and successfully set up my code to run them from my local machine, using Spyder and a GPU for each sweep. My goal was to conduct 50 sweeps, but after just two I encountered the error below, and all subsequent sweeps crashed. Initially, I suspected a Windows firewall issue, but I have ensured proper access, and the first two sweeps ran successfully, which they could not have done if the firewall were blocking the connection.
Could you advise on how to resolve this issue? Each sweep takes about one or two hours. Also, is there a limit on the time allowed for logging data to your database?
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
Can someone please help me with this? I am not able to run more than two sweeps, because all the remaining sweeps crash immediately. This is frustrating and I do not know how to fix it.
Hi @bestwayyy, Could you let us know what version of wandb you are using?
Also, could you search through the debug-internal.log file from the local run folder of a run that is hitting this and share any relevant errors?
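In case it helps with gathering that information, here is a minimal sketch that prints the installed wandb version and scans every debug-internal.log under the local wandb folder for suspicious lines. The helper names, the keyword list, and the default folder name "wandb" are assumptions made for illustration, so adjust them to your setup.

```python
from importlib import metadata
from pathlib import Path

# Substrings worth flagging. This list is an assumption; extend it as needed.
KEYWORDS = ("ERROR", "Traceback", "ConnectionResetError", "WinError")


def installed_wandb_version():
    """Return the installed wandb version string, or None if it is not installed."""
    try:
        return metadata.version("wandb")
    except metadata.PackageNotFoundError:
        return None


def find_log_errors(root="wandb", keywords=KEYWORDS):
    """Collect (path, line_number, text) for matching lines in debug-internal.log files."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    hits = []
    for log_path in sorted(root_path.rglob("debug-internal.log")):
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        with open(log_path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if any(key in line for key in keywords):
                    hits.append((str(log_path), number, line.rstrip()))
    return hits


if __name__ == "__main__":
    print("wandb version:", installed_wandb_version())
    for path, number, text in find_log_errors():
        print(f"{path}:{number}: {text}")
```

Run it from the directory that contains the wandb folder, and paste the lines closest to the time of the crash.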
There isn’t a time limit on API calls, so I don’t think this is coming from the server side. Do you go through a proxy or any other network infrastructure? I agree that a firewall would only make sense if you ran into this every time, but there might be other network infrastructure that is timing out and closing the connection.
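If something on the network path does turn out to be dropping idle connections, one generic mitigation for network calls in your own code is to retry with exponential backoff. The sketch below uses only the standard library; call_with_retries and its parameters are hypothetical names, not part of wandb, and it only helps where the exception surfaces in code you control.

```python
import time


def call_with_retries(func, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(); on ConnectionError, wait and retry with exponential backoff.

    ConnectionResetError (WinError 10054) is a subclass of ConnectionError,
    so it is covered by the except clause below.
    """
    for attempt in range(attempts):
        try:
            return func()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # Out of attempts: let the caller see the error.
            sleep(base_delay * 2 ** attempt)  # Waits 1s, 2s, 4s, ...
```

This does not address the root cause, so it is still worth checking the logs and the network setup first.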
Hi @bestwayyy, since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!