Hi W&B team,
I’m training several models concurrently and am encountering the following error:
wandb: 429 encountered (Filestream rate limit exceeded, retrying in 2.1 seconds.), retrying request
Could you please increase my user rate limit?
Also, I am running 8 jobs on our compute cluster (confirmed via squeue), but only 6 runs appear active in the W&B dashboard. For example, the job with run name "00:56:03 on 06/30/2024 (29437)" produces logs confirming that it is running. Its SLURM job ID is 38009820, and squeue shows it actively running with a runtime of 16:31:20. However, I don’t see this run on my W&B dashboard. Could you please advise how to resolve this issue?
Thanks!