wandb.ai is reachable, but wandb.init() times out

Hey,
I have used wandb without problems before, but today wandb.init() suddenly started failing with a wandb.errors exception: Run initialization has timed out after 90.0 sec.
The problem only occurs on the server and the PC host in this particular network environment. If I switch to another network or use other servers, everything works. (On the PC, I can open the Weights & Biases home page, "Weights & Biases: The AI Developer Platform", but I can't get to my project.)
This is the debug log from the failing wandb.init() call:

2024-06-13 16:31:03,355 INFO    StreamThr :5911 [internal.py:wandb_internal():85] W&B internal server running at pid: 5911, started at: 2024-06-13 16:31:03.353793
2024-06-13 16:31:03,359 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: status
2024-06-13 16:31:03,361 INFO    WriterThread:5911 [datastore.py:open_for_write():87] open: /data/puchun.liu/BIOT/wandb/run-20240613_163103-q6udt10e/run-q6udt10e.wandb
2024-06-13 16:31:03,362 DEBUG   SenderThread:5911 [sender.py:send():379] send: header
2024-06-13 16:31:03,370 DEBUG   SenderThread:5911 [sender.py:send():379] send: run
2024-06-13 16:31:08,371 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:13,372 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:18,374 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:23,377 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:28,378 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:33,380 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:38,382 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:43,384 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:48,386 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:53,388 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:31:58,389 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:03,394 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:07,229 INFO    SenderThread:5911 [retry.py:__call__():172] Retry attempt failed:
Traceback (most recent call last):
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/util/connection.py", line 95, in create_connection
    raise err
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/util/connection.py", line 85, in create_connection
    sock.connect(sa)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
    conn.connect()
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connection.py", line 363, in connect
    self.sock = conn = self._new_conn()
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connection.py", line 179, in _new_conn
    raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x7fe9c000de50>, 'Connection to api.wandb.ai timed out. (connect timeout=20)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.wandb.ai', port=443): Max retries exceeded with url: /graphql (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fe9c000de50>, 'Connection to api.wandb.ai timed out. (connect timeout=20)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/wandb/sdk/lib/retry.py", line 131, in __call__
    result = self._call_fn(*args, **kwargs)
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/wandb/sdk/internal/internal_api.py", line 340, in execute
    return self.client.execute(*args, **kwargs)  # type: ignore
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/client.py", line 52, in execute
    result = self._get_result(document, *args, **kwargs)
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/client.py", line 60, in _get_result
    return self.transport.execute(document, *args, **kwargs)
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/wandb/sdk/lib/gql_request.py", line 58, in execute
    request = self.session.post(self.url, **post_args)
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/data/puchun.liu/anaconda3/envs/STEP/lib/python3.9/site-packages/requests/adapters.py", line 507, in send
    raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPSConnectionPool(host='api.wandb.ai', port=443): Max retries exceeded with url: /graphql (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7fe9c000de50>, 'Connection to api.wandb.ai timed out. (connect timeout=20)'))
2024-06-13 16:32:08,396 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:13,398 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:18,400 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:23,401 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:28,403 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: keepalive
2024-06-13 16:32:33,405 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: cancel
2024-06-13 16:32:33,405 DEBUG   SenderThread:5911 [sender.py:send():388] Record cancelled: run
2024-06-13 16:32:33,406 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: status_report
2024-06-13 16:32:33,407 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: cancel
2024-06-13 16:32:35,415 DEBUG   HandlerThread:5911 [handler.py:handle_request():158] handle_request: shutdown
2024-06-13 16:32:35,415 INFO    HandlerThread:5911 [handler.py:finish():882] shutting down handler
2024-06-13 16:32:36,407 INFO    SenderThread:5911 [sender.py:finish():1608] shutting down sender
2024-06-13 16:32:36,408 INFO    WriterThread:5911 [datastore.py:close():296] close: /data/puchun.liu/BIOT/wandb/run-20240613_163103-q6udt10e/run-q6udt10e.wandb
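
The traceback above fails at the TCP connect step to api.wandb.ai on port 443 (connect timeout=20), which is a different host from the wandb.ai site that opens in the browser. A minimal sketch for checking both hosts from the affected machine follows. The helper names can_connect and check_hosts are hypothetical and not part of wandb; the 20-second default only mirrors the timeout shown in the log.

```python
import socket


def can_connect(host, port=443, timeout=20.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # OSError covers connect timeouts, refused connections and DNS failures.
        return False


def check_hosts(hosts, port=443, timeout=20.0):
    """Map each host name to True/False depending on whether it accepts a connection."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Calling check_hosts(["wandb.ai", "api.wandb.ai"]) on the affected server would show whether only the API host is unreachable. If so, that would point at the network (firewall, DNS, or proxy) rather than at the wandb login state.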

Hello, this is a reply from our support bot, designed to assist you with your Weights & Biases related queries. To reach a human, please reply to this message.

-WandBot :robot:

Hi @daniel2001

Good day and thank you for reaching out to us! Happy to help you on this.

Can you try logging in again from your SDK client with this command:

wandb login --relogin --host=https://api.wandb.ai/
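
Since the command above pins the SDK to https://api.wandb.ai/, it may also be worth checking whether environment variables on the affected machine redirect the SDK or route it through a proxy that the browser does not use. The sketch below assumes the usual variables: WANDB_BASE_URL, which wandb reads, and the proxy variables honoured by the requests library that appears in the traceback. The helper name connection_env is hypothetical.

```python
import os

# Variables that can change where the SDK connects or how traffic is routed.
RELEVANT_VARS = (
    "WANDB_BASE_URL",
    "HTTPS_PROXY", "https_proxy",
    "HTTP_PROXY", "http_proxy",
    "ALL_PROXY", "all_proxy",
    "NO_PROXY", "no_proxy",
)


def connection_env(env=None):
    """Return the entries of `env` (default: os.environ) that affect routing."""
    env = os.environ if env is None else env
    return {name: env[name] for name in RELEVANT_VARS if name in env}
```

Running print(connection_env()) in the same shell that launches the training script lists any such overrides. An empty dict suggests the SDK is using its defaults.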

Let me know if this resolves the issue for you.

Thanks,
Paulo

Hi @daniel2001, since we have not heard back from you, we are going to close this request, assuming that the issue has been resolved on your end. If you would like to re-open the conversation, please let us know!