Hi @artsiom, I'm getting this error on Lambda Labs right now:
wandb: Currently logged in as: olicg. Use `wandb login --relogin` to force relogin
Thread HandlerThread:
Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/internal/internal_util.py", line 49, in run
self._run()
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/internal/internal_util.py", line 100, in _run
self._process(record)
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/internal/internal.py", line 279, in _process
self._hm.handle(record)
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/internal/handler.py", line 136, in handle
handler(record)
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/internal/handler.py", line 146, in handle_request
handler(record)
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/internal/handler.py", line 683, in handle_request_run_start
self._tb_watcher = tb_watcher.TBWatcher(
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/internal/tb_watcher.py", line 126, in __init__
wandb.tensorboard.reset_state()
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/lib/lazyloader.py", line 58, in __getattr__
module = self._load()
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/lib/lazyloader.py", line 33, in _load
module = importlib.import_module(self.__name__)
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 848, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/integration/tensorboard/__init__.py", line 3, in <module>
from .log import _log, log, reset_state, tf_summary_to_dict # noqa: F401
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/integration/tensorboard/log.py", line 35, in <module>
Summary = pb.Summary if pb else None
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/util.py", line 209, in __getattribute__
state.load()
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/util.py", line 202, in load
self.module.__spec__.loader.exec_module(self.module)
File "/usr/lib/python3/dist-packages/tensorboard/compat/proto/summary_pb2.py", line 15, in <module>
from tensorboard.compat.proto import histogram_pb2 as tensorboard_dot_compat_dot_proto_dot_histogram__pb2
File "/usr/lib/python3/dist-packages/tensorboard/compat/proto/histogram_pb2.py", line 34, in <module>
_descriptor.FieldDescriptor(
File "/home/ubuntu/.local/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 561, in __new__
_message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
1. Downgrade the protobuf package to 3.20.x or lower.
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
wandb: ERROR Internal wandb error: file data was not synced
Problem at: train.py 75 <module>
wandb: ERROR transport failed
Traceback (most recent call last):
File "train.py", line 75, in <module>
wandb.init(
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/wandb_init.py", line 1189, in init
raise e
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/wandb_init.py", line 1170, in init
run = wi.init()
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/wandb_init.py", line 815, in init
run_start_result = run_start_handle.wait(timeout=30)
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/lib/mailbox.py", line 281, in wait
raise MailboxError("transport failed")
wandb.sdk.lib.mailbox.MailboxError: transport failed
Traceback (most recent call last):
File "train.py", line 75, in <module>
wandb.init(
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/wandb_init.py", line 1189, in init
raise e
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/wandb_init.py", line 1170, in init
run = wi.init()
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/wandb_init.py", line 815, in init
run_start_result = run_start_handle.wait(timeout=30)
File "/home/ubuntu/.local/lib/python3.8/site-packages/wandb/sdk/lib/mailbox.py", line 281, in wait
raise MailboxError("transport failed")
wandb.sdk.lib.mailbox.MailboxError: transport failed
wandb: While tearing down the service manager. The following error has occurred: [Errno 32] Broken pipe
I have only used pip to install wandb, like so:
python -m pip install wandb torch numpy
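In case it helps narrow this down: the traceback shows tensorboard being loaded from /usr/lib/python3/dist-packages while protobuf comes from ~/.local, and the error message suggests either downgrading protobuf to 3.20.x or lower, or setting PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python. Below is a rough sketch of how I could check which case I am in before calling wandb.init. The helper names (parse_major_minor, needs_downgrade, installed_protobuf_version, use_pure_python_protobuf) are my own and hypothetical, not part of wandb or protobuf, and I have not confirmed that this resolves the problem.

```python
import os
import re
from typing import Optional, Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '4.21.1'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def needs_downgrade(version: str) -> bool:
    """True if the version is newer than 3.20.x, the ceiling named in the error message."""
    return parse_major_minor(version) > (3, 20)


def installed_protobuf_version() -> Optional[str]:
    """Return the installed protobuf version, or None if it cannot be determined."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:  # importlib.metadata requires Python 3.8+
        return None
    try:
        return version("protobuf")
    except PackageNotFoundError:
        return None


def use_pure_python_protobuf() -> None:
    """Apply workaround 2 from the error message.

    This has to run before protobuf (and therefore wandb or tensorboard)
    is imported, and the message warns that parsing will be much slower.
    """
    os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"


found = installed_protobuf_version()
if found is None:
    print("protobuf version could not be determined")
elif needs_downgrade(found):
    print(f"protobuf {found} is newer than 3.20.x; workaround 1 or 2 may apply")
else:
    print(f"protobuf {found} is already 3.20.x or lower")
```

If the check reports a version newer than 3.20.x, I assume I could either call use_pure_python_protobuf() at the very top of train.py, or try workaround 1 from the message by pinning protobuf with pip.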