Hi, I recently started getting this error message:
wandb: Network error (ConnectionError), entering retry loop.
The script proceeds for a bit after this, but eventually it gets stuck in a retry loop permanently.
I am using wandb.Api to download runs from a sweep. When I forcibly killed the process, urllib3 reported a name-resolution error saying that a particular sentry.io host could not be resolved.
I also get errors like: 2023-11-27 15:52:38 - ERROR - Error on attempt 1 for run losp15bq: HTTPSConnectionPool(host=‘api.wandb.ai’, port=443): Max retries exceeded with url: /graphql (Caused by NameResolutionError(“<urllib3.connection.HTTPSConnection object at 0x2ec022950>: Failed to resolve ‘api.wandb.ai’ ([Errno 8] nodename nor servname provided, or not known)”))
and:
wandb: ERROR Error while calling W&B API: project not found (<Response [404]>)
and:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt:
2023-11-27 15:54:32 - WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘NameResolutionError(“<urllib3.connection.HTTPSConnection object at 0x290f61890>: Failed to resolve ‘o151352.ingest.sentry.io’ ([Errno 8] nodename nor servname provided, or not known)”)’: /api/4504800232407040/envelope/
2023-11-27 15:54:32 - WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘NameResolutionError(“<urllib3.connection.HTTPSConnection object at 0x290f61650>: Failed to resolve ‘o151352.ingest.sentry.io’ ([Errno 8] nodename nor servname provided, or not known)”)’: /api/4504800232407040/envelope/
2023-11-27 15:54:32 - WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘NameResolutionError(“<urllib3.connection.HTTPSConnection object at 0x290f3de90>: Failed to resolve ‘o151352.ingest.sentry.io’ ([Errno 8] nodename nor servname provided, or not known)”)’: /api/4504800232407040/envelope/
2023-11-27 15:54:32 - WARNING - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘NameResolutionError(“<urllib3.connection.HTTPSConnection object at 0x2c31afb90>: Failed to resolve ‘o151352.ingest.sentry.io’ ([Errno 8] nodename nor servname provided, or not known)”)’: /api/4504800232407040/envelope/
2023-11-27 15:54:32 - WARNING - Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘NameResolutionError(“<urllib3.connection.HTTPSConnection object at 0x2ec051d90>: Failed to resolve ‘o151352.ingest.sentry.io’ ([Errno 8] nodename nor servname provided, or not known)”)’: /api/4504800232407040/envelope/
2023-11-27 15:54:32 - WARNING - Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by ‘NameResolutionError(“<urllib3.connection.HTTPSConnection object at 0x2ec052850>: Failed to resolve ‘o151352.ingest.sentry.io’ ([Errno 8] nodename nor servname provided, or not known)”)’: /api/4504800232407040/envelope/
Are these just transient errors, or is there something more fundamental going on? It was working until recently. I have tried both with and without a VPN, thinking the VPN might help if the problem was DNS-related, but it made no difference and led to exactly the same errors.
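Since every failure in the logs is a NameResolutionError, one way to narrow this down is to check whether the two hostnames resolve at all from the same machine, independent of wandb. This is only a minimal sketch using the standard library; the helper name can_resolve is my own and not part of wandb or urllib3.

```python
import socket


def can_resolve(host: str, port: int = 443) -> bool:
    """Return True if the system resolver can resolve `host`."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        # Same class of failure as "[Errno 8] nodename nor servname provided".
        return False


def main() -> None:
    # Hostnames taken from the error messages above.
    for host in ("api.wandb.ai", "o151352.ingest.sentry.io"):
        status = "resolves" if can_resolve(host) else "does NOT resolve"
        print(f"{host}: {status}")


if __name__ == "__main__":
    main()
```

If both names resolve here but the download script still fails, the problem is more likely in how the script makes its requests (for example, too many lookups at once) than in the network itself.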
Interestingly, uploads are fine during all of this; only downloads are affected. I recently started caching the run_ids I need locally and using them to request the downloads, so perhaps that contributed to the error.
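For context, the download pattern is roughly the following: each cached run ID is fetched through the public API, with a retry and exponential backoff around each call. This is a simplified sketch rather than my exact script. The names with_retries and fetch_run_history are hypothetical, the "entity/project/run_id" path is a placeholder, and the timeout value is an arbitrary choice.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    fetch: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch`, retrying with exponential backoff on failure."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except retry_on as err:
            last_error = err
            if attempt < attempts:
                # Wait base_delay, then 2x, 4x, ... between attempts.
                sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


def fetch_run_history(path: str):
    """Fetch one run's logged history; `path` is 'entity/project/run_id'."""
    import wandb  # imported lazily so the helper above works without wandb

    api = wandb.Api(timeout=30)  # timeout in seconds, an assumed value
    return api.run(path).history()


# Usage (placeholder path, not a real run):
# history = with_retries(lambda: fetch_run_history("entity/project/run_id"))
```

Giving up after a fixed number of attempts at least turns the permanent hang into a visible error that names the run that failed.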
Let me know if you need any debug info.