Hey @qx66 , the errors you’re encountering, particularly the HTTPError and `429 encountered (Filestream rate limit exceeded, retrying in 32.7 seconds)`, indicate that your runs are hitting the rate limits set by Weights & Biases (W&B). The 429 error specifically means you are sending more file-stream requests than the allowed rate limit. This causes your runs to be throttled, leading to delays and potentially to crashes if the retry logic is overwhelmed.
What is your use-case? If you could provide more context about what you are trying to do in your training script, that would be really helpful for us to investigate this further and recommend best practices accordingly.
Consideration:
You could reduce the frequency of logging to avoid hitting rate limits. Instead of logging every step, aggregate data locally and log it less frequently, for example once per epoch or every few steps (see the sketch below). More info and examples on this can be found in our docs.
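As a rough sketch of what that could look like (assuming a standard Python training loop; `train_loader`, `train_step`, the project name, and the `log_every` interval are placeholders for your own setup):

```python
import wandb

run = wandb.init(project="my-project")  # hypothetical project name

log_every = 50   # assumed aggregation interval; tune to your needs
loss_buffer = []

for step, batch in enumerate(train_loader):   # train_loader is a placeholder
    loss = train_step(batch)                  # train_step stands in for your training logic
    loss_buffer.append(loss)

    # Log an aggregated value every `log_every` steps instead of every step
    if (step + 1) % log_every == 0:
        wandb.log({"train/loss": sum(loss_buffer) / len(loss_buffer)}, step=step)
        loss_buffer.clear()

run.finish()
```

This keeps the number of requests to the W&B backend roughly proportional to `num_steps / log_every` rather than `num_steps`, which usually keeps runs well under the file-stream rate limit.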
I have also figured out the reason myself. It seems there was a period when jobs shown as crashed/failed on wandb were still running on SLURM. I didn’t notice this, kept submitting more and more batches of jobs, and eventually hit wandb’s file limit. Once I realized this, I manually cancelled the SLURM jobs that didn’t match the wandb state, and the situation improved.