Hey,
I have been using wandb without problems, but today wandb.init() suddenly started failing with a wandb.errors exception: "Run initialization has timed out after 90.0 sec".
If I switch to another network or use other servers, everything works. The problem only occurs on the server and the PC host that are in this particular network environment. (On the PC, I can open the Weights & Biases home page, "Weights & Biases: The AI Developer Platform", but I can't reach my project.)
This is the debug log I get when wandb.init() fails:
2024-06-13 16:31:03,355 INFO StreamThr :5911 [internal.py:wandb_internal():85] W&B internal server running at pid: 5911, started at: 2024-06-13 16:31:03.353793
2024-06-13 16:31:03,359 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: status
2024-06-13 16:31:03,361 INFO WriterThread:5911 [datastore.py:open_for_write():87] open: /data/puchun.liu/BIOT/wandb/run-20240613_163103-q6udt10e/run-q6udt10e.wandb
2024-06-13 16:31:03,362 DEBUG SenderThread:5911 [sender.py:send():379] send: header
2024-06-13 16:31:03,370 DEBUG SenderThread:5911 [sender.py:send():379] send: run
2024-06-13 16:31:08,371 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:13,372 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:18,374 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:23,377 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:28,378 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:33,380 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:38,382 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:43,384 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:48,386 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:53,388 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:58,389 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:03,394 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:07,229 INFO SenderThread:5911 [retry.py:__call__():172] Retry attempt failed:
Traceback (most recent call last):
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/util/connection.py", line 95, in create_connection
raise err
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
sock.connect(sa)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
conn.connect()
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connection.py", line 363, in connect
self.sock = conn = self._new_conn()
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connection.py", line 179, in _new_conn
raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x7fe9c000de50>, 'Connection to api.wandb.ai timed out. (connect timeout=20)')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.wandb.ai', port=443): Max retries exceeded with url: /graphql (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fe9c000de50>, 'Connection to api.wandb.ai timed out. (connect timeout=20)'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/wandb/sdk/lib/retry.py", line 131, in __call__
result = self._call_fn(*args, **kwargs)
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/wandb/sdk/internal/internal_api.py", line 340, in execute
return self.client.execute(*args, **kwargs) # type: ignore
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/client.py", line 52, in execute
result = self._get_result(document, *args, **kwargs)
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/client.py", line 60, in _get_result
return self.transport.execute(document, *args, **kwargs)
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/wandb/sdk/lib/gql_request.py", line 58, in execute
request = self.session.post(self.url, **post_args)
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/requests/sessions.py", line 637, in post
return self.request("POST", url, data=data, json=json, **kwargs)
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/requests/adapters.py", line 507, in send
raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='api.wandb.ai', port=443): Max retries exceeded with url: /graphql (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fe9c000de50>, 'Connection to api.wandb.ai timed out. (connect timeout=20)'))
2024-06-13 16:32:08,396 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:13,398 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:18,400 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:23,401 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:28,403 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:33,405 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: cancel
2024-06-13 16:32:33,405 DEBUG SenderThread:5911 [sender.py:send():388] Record cancelled: run
2024-06-13 16:32:33,406 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: status_report
2024-06-13 16:32:33,407 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: cancel
2024-06-13 16:32:35,415 DEBUG HandlerThread:5911 [handler.py:handle_request():158] handle_request: shutdown
2024-06-13 16:32:35,415 INFO HandlerThread:5911 [handler.py:finish():882] shutting down handler
2024-06-13 16:32:36,407 INFO SenderThread:5911 [sender.py:finish():1608] shutting down sender
2024-06-13 16:32:36,408 INFO WriterThread:5911 [datastore.py:close():296] close: /data/puchun.liu/BIOT/wandb/run-20240613_163103-q6udt10e/run-q6udt10e.wandb
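The traceback above ends in a connect timeout to api.wandb.ai on port 443 (connect timeout=20), so the failure seems to happen before any HTTP request is sent. As a rough way to confirm that outside of wandb, here is a minimal sketch that only tries to open a TCP connection. The helper name can_connect is made up for illustration and is not part of wandb; the default port and timeout are simply copied from the log.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 20.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration only; it is not part of wandb.
    """
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # socket.timeout, connection refused, and DNS errors are all OSError subclasses.
        return False
```

Running can_connect("api.wandb.ai") and can_connect("wandb.ai") on the affected server would show whether only the API host is unreachable from this network. This checks plain TCP reachability only; it does not go through any HTTP proxy and does not test TLS or authentication.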