BrokenPipeError: [Errno 32] Broken pipe

Hi,

I get this error after a while when I run a sweep. It does not occur with single training runs, and the first few runs of the sweep always complete.
I use Keras in an .ipynb notebook in VS Code, on Ubuntu under WSL2.

Error in callback <function _WandbInit._resume_backend at 0x7f760b06d940> (for pre_run_cell):

BrokenPipeError Traceback (most recent call last)
File ~/miniconda3/envs/tf/lib/python3.9/site-packages/backcall/backcall.py:104, in callback_prototype…adapt…adapted(*args, **kwargs)
102 kwargs.pop(name)
103 # print(args, kwargs, unmatched_pos, cut_positional, unmatched_kw)
--> 104 return callback(*args, **kwargs)

File ~/miniconda3/envs/tf/lib/python3.9/site-packages/wandb/sdk/wandb_init.py:424, in _WandbInit._resume_backend(self)
422 if self.backend is not None and self.backend.interface is not None:
423 logger.info("resuming backend") # type: ignore
--> 424 self.backend.interface.publish_resume()

File ~/miniconda3/envs/tf/lib/python3.9/site-packages/wandb/sdk/interface/interface.py:672, in InterfaceBase.publish_resume(self)
670 def publish_resume(self) -> None:
671 resume = pb.ResumeRequest()
--> 672 self._publish_resume(resume)

File ~/miniconda3/envs/tf/lib/python3.9/site-packages/wandb/sdk/interface/interface_shared.py:344, in InterfaceShared._publish_resume(self, resume)
342 def _publish_resume(self, resume: pb.ResumeRequest) -> None:
343 rec = self._make_request(resume=resume)
--> 344 self._publish(rec)

File ~/miniconda3/envs/tf/lib/python3.9/site-packages/wandb/sdk/interface/interface_sock.py:51, in InterfaceSock._publish(self, record, local)
49 def _publish(self, record: "pb.Record", local: Optional[bool] = None) -> None:
50 self._assign(record)

--> 130 sent = self._sock.send(data)
131 # sent equal to 0 indicates a closed socket
132 if sent == 0:

BrokenPipeError: [Errno 32] Broken pipe
Output is truncated. View as a scrollable element or open in a text editor. Adjust cell output settings…
Error in callback <function _WandbInit._pause_backend at 0x7f760b06d8b0> (for post_run_cell):

BrokenPipeError Traceback (most recent call last)
File ~/miniconda3/envs/tf/lib/python3.9/site-packages/backcall/backcall.py:104, in callback_prototype…adapt…adapted(*args, **kwargs)
102 kwargs.pop(name)
103 # print(args, kwargs, unmatched_pos, cut_positional, unmatched_kw)
--> 104 return callback(*args, **kwargs)

File ~/miniconda3/envs/tf/lib/python3.9/site-packages/wandb/sdk/wandb_init.py:419, in _WandbInit._pause_backend(self)
417 if self.backend.interface is not None:
418 logger.info("pausing backend") # type: ignore
--> 419 self.backend.interface.publish_pause()

File ~/miniconda3/envs/tf/lib/python3.9/site-packages/wandb/sdk/interface/interface.py:664, in InterfaceBase.publish_pause(self)
662 def publish_pause(self) -> None:
663 pause = pb.PauseRequest()
--> 664 self._publish_pause(pause)

File ~/miniconda3/envs/tf/lib/python3.9/site-packages/wandb/sdk/interface/interface_shared.py:340, in InterfaceShared._publish_pause(self, pause)
338 def _publish_pause(self, pause: pb.PauseRequest) -> None:
339 rec = self._make_request(pause=pause)
--> 340 self._publish(rec)

File ~/miniconda3/envs/tf/lib/python3.9/site-packages/wandb/sdk/interface/interface_sock.py:51, in InterfaceSock._publish(self, record, local)
49 def _publish(self, record: "pb.Record", local: Optional[bool] = None) -> None:
50 self._assign(record)

--> 130 sent = self._sock.send(data)
131 # sent equal to 0 indicates a closed socket
132 if sent == 0:

Thanks for any help.

Hello @opt_peter !

It looks like the sweep had an issue communicating with our backend, as indicated by the error stemming from sent = self._sock.send(data). When running sweeps, each run in the sweep communicates with the wandb backend in order to get the next set of hyperparameters. Are you behind a VPN, load balancer, or proxy that could be blocking this connection? Or was your connection perhaps unstable at the time?
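
For readers unfamiliar with Errno 32, here is a minimal sketch, using only the Python standard library, of how send() can raise BrokenPipeError once the other end of a socket has closed. It does not use wandb and does not reproduce its internals; the function name send_after_peer_close is hypothetical, and the exact exception type can vary by platform (the comments assume Linux, as under WSL2).

```python
import socket


def send_after_peer_close():
    """Send on a socket whose peer has closed; return the exception raised, if any."""
    # socketpair() returns two already-connected sockets. They stand in for the
    # two ends of any connection (illustrative only).
    sender, peer = socket.socketpair()
    try:
        peer.close()  # the other end goes away
        try:
            sender.send(b"resume")
        except (BrokenPipeError, ConnectionResetError) as exc:
            # On Linux this is typically BrokenPipeError (errno 32, EPIPE).
            return exc
        return None
    finally:
        sender.close()


if __name__ == "__main__":
    print(repr(send_after_peer_close()))
```

The point of the sketch is that the error is raised on the sending side and only says that the receiving end is no longer there; it does not say why the receiver went away.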

Also, there have been reports of connection issues with WSL2, so could you try running your sweep from a Colab instead? That would help isolate whether this is a connection issue specific to WSL2.
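
One rough way to compare environments is to run the same reachability check in WSL2 and in Colab. The sketch below is an assumption-laden example, not an official diagnostic: can_connect is a hypothetical helper, and api.wandb.ai on port 443 is assumed to be the relevant endpoint (a self-hosted server would use a different host).

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Assumed endpoint; adjust the host for a self-hosted deployment.
    print(can_connect("api.wandb.ai", 443))
```

A successful TCP connection only shows that the host is reachable at that moment. It does not rule out a proxy interfering with later requests or a connection that drops intermittently during a long sweep.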

Hi Peter, since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!
