Hey,
I get this error when I try to train my model using wandb:
CommError: Run initialization has timed out after 90.0 sec. Please refer to the documentation for additional information: Frequently Asked Questions About Experiments
This is the content of debug.log:
2024-02-27 16:22:01,728 INFO MainThread:953563 [wandb_setup.py:_flush():76] Current SDK version is 0.16.2
2024-02-27 16:22:01,728 INFO MainThread:953563 [wandb_setup.py:_flush():76] Configure stats pid to 953563
2024-02-27 16:22:01,728 INFO MainThread:953563 [wandb_setup.py:_flush():76] Loading settings from /linkhome/rech/geniri01/ulf92ec/.config/wandb/settings
2024-02-27 16:22:01,750 INFO MainThread:953563 [wandb_setup.py:_flush():76] Loading settings from /gpfsdswork/projects/rech/aib/ulf92ec/DSI-QG-main/wandb/settings
2024-02-27 16:22:01,750 INFO MainThread:953563 [wandb_setup.py:_flush():76] Loading settings from environment variables: {}
2024-02-27 16:22:01,750 INFO MainThread:953563 [wandb_setup.py:_flush():76] Applying setup settings: {'_disable_service': False}
2024-02-27 16:22:01,750 INFO MainThread:953563 [wandb_setup.py:_flush():76] Inferring run settings from compute environment: {'program': ''}
2024-02-27 16:22:01,750 INFO MainThread:953563 [wandb_init.py:_log_setup():526] Logging user logs to /gpfsdswork/projects/rech/aib/ulf92ec/DSI-QG-main/wandb/run-20240227_162201-s27b6c1e/logs/debug.log
2024-02-27 16:22:01,750 INFO MainThread:953563 [wandb_init.py:_log_setup():527] Logging internal logs to /gpfsdswork/projects/rech/aib/ulf92ec/DSI-QG-main/wandb/run-20240227_162201-s27b6c1e/logs/debug-internal.log
2024-02-27 16:22:01,750 INFO MainThread:953563 [wandb_init.py:init():566] calling init triggers
2024-02-27 16:22:01,751 INFO MainThread:953563 [wandb_init.py:init():573] wandb.init called with sweep_config: {}
config: {}
2024-02-27 16:22:01,751 INFO MainThread:953563 [wandb_init.py:init():616] starting backend
2024-02-27 16:22:01,751 INFO MainThread:953563 [wandb_init.py:init():620] setting up manager
2024-02-27 16:22:01,752 INFO MainThread:953563 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
2024-02-27 16:22:01,753 INFO MainThread:953563 [wandb_init.py:init():628] backend started and connected
2024-02-27 16:22:01,763 INFO MainThread:953563 [wandb_run.py:_label_probe_notebook():1294] probe notebook
2024-02-27 16:22:01,763 INFO MainThread:953563 [wandb_run.py:_label_probe_notebook():1304] Unable to probe notebook: 'NoneType' object has no attribute 'get'
2024-02-27 16:22:01,763 INFO MainThread:953563 [wandb_init.py:init():720] updated telemetry
2024-02-27 16:22:01,765 INFO MainThread:953563 [wandb_init.py:init():753] communicating run to backend with 90.0 second timeout
2024-02-27 16:23:31,817 ERROR MainThread:953563 [wandb_init.py:init():779] encountered error: Run initialization has timed out after 90.0 sec.
Please refer to the documentation for additional information:
2024-02-27 16:23:33,832 ERROR MainThread:953563 [wandb_init.py:init():1194] Run initialization has timed out after 90.0 sec.
Please refer to the documentation for additional information:
Traceback (most recent call last):
  File "/linkhome/rech/geniri01/ulf92ec/.local/lib/python3.11/site-packages/wandb/sdk/wandb_init.py", line 1176, in init
    run = wi.init()
          ^^^^^^^^^
  File "/linkhome/rech/geniri01/ulf92ec/.local/lib/python3.11/site-packages/wandb/sdk/wandb_init.py", line 785, in init
    raise error
wandb.errors.CommError: Run initialization has timed out after 90.0 sec.
Please refer to the documentation for additional information:
Any ideas why I get this?