OSError: Input/output error

Hi everyone,
while using wandb to log the metrics of my model (written in PyTorch), I get an exception at seemingly random points during training. I have not been able to work out why or when it happens, but it stops my runs, which is quite annoying.

Any ideas? I really appreciate any help you can provide!

Traceback (most recent call last):
  File "/usr/local/anaconda3/lib/python3.8/logging/__init__.py", line 1085, in emit
    self.flush()
  File "/usr/local/anaconda3/lib/python3.8/logging/__init__.py", line 1065, in flush
    self.stream.flush()
OSError: [Errno 5] Input/output error
Call stack:
Exception in thread OutRawRd-stderr:
Traceback (most recent call last):
  File "/usr/local/anaconda3/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/local/anaconda3/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/homes/llumetti/alveolar_canal_base/venv/lib/python3.8/site-packages/wandb/sdk/internal/sender.py", line 1027, in _output_raw_reader_thread
  File "/homes/llumetti/alveolar_canal_base/venv/lib/python3.8/site-packages/wandb/sdk/internal/sender.py", line 1042, in _output_raw_flush
    self._output_raw_file.write(data.encode("utf-8"))
  File "/homes/llumetti/alveolar_canal_base/venv/lib/python3.8/site-packages/wandb/sdk/lib/filesystem.py", line 64, in write
  File "/usr/local/anaconda3/lib/python3.8/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
  File "/usr/local/anaconda3/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/homes/llumetti/alveolar_canal_base/venv/lib/python3.8/site-packages/wandb/sdk/internal/internal_util.py", line 49, in run
    self._run()
  File "/homes/llumetti/alveolar_canal_base/venv/lib/python3.8/site-packages/wandb/sdk/internal/internal_util.py", line 100, in _run
    self._process(record)
  File "/homes/llumetti/alveolar_canal_base/venv/lib/python3.8/site-packages/wandb/sdk/internal/internal.py", line 264, in _process
    self._hm.handle(record)
  File "/homes/llumetti/alveolar_canal_base/venv/lib/python3.8/site-packages/wandb/sdk/internal/handler.py", line 131, in handle
  File "/homes/llumetti/alveolar_canal_base/venv/lib/python3.8/site-packages/wandb/sdk/internal/handler.py", line 139, in handle_request
Message: 'handle_request: partial_history'
Arguments: ()
    super().write(b"\n".join(ret) + b"\n")
  File "/homes/llumetti/alveolar_canal_base/venv/lib/python3.8/site-packages/wandb/sdk/lib/filesystem.py", line 31, in write
    self.f.flush()
OSError: [Errno 5] Input/output error

Hi Luca!

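Could you please disable WandB and check whether you still run into the same issue? That would tell us whether the error comes from wandb itself or from the environment the script is running in.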
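
One way to do this, as a minimal sketch: set the `WANDB_MODE` environment variable to `disabled` before `wandb.init()` runs, so the wandb calls in the training script become no-ops and nothing else needs to change. The helper name `wandb_is_disabled` is only for illustration and is not part of wandb.

```python
import os

# Must be set before wandb.init() is called, e.g. at the very top of the
# training script (or exported in the shell that launches it).
os.environ["WANDB_MODE"] = "disabled"


def wandb_is_disabled() -> bool:
    """Hypothetical helper: report whether wandb has been switched off via the environment."""
    return os.environ.get("WANDB_MODE") == "disabled"


print("wandb disabled:", wandb_is_disabled())

# The rest of the training script stays as it is, for example:
#   import wandb
#   run = wandb.init(project="my-project")  # project name is a placeholder
#   wandb.log({"loss": 0.1})                # does nothing while disabled
```

Passing `mode="disabled"` to `wandb.init()` should have the same effect if you would rather not touch the environment.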

Cheers!
Artsiom

Hi Luca,

We wanted to follow up with you regarding your support request as we have not heard back from you. Please let us know if we can be of further assistance or if your issue has been resolved.

Best,
Weights & Biases

Hi Luca, since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!