Hi,
Like many others who have raised this before, I've run into the same issue. I'm running a sweep agent with around 250 parallel runs writing to the same entity. I tried reducing that number, but it doesn't seem to help much.
It would be great if you could increase the filestream rate limit.
Thanks!
I kept getting the same thing. For us it's a single run, just with a bunch of images. In the end I got my students to fix it for me for free. They said wandb, in its wisdom, routes everything through a single GraphQL server, so anything above a few MB/s gets blocked.
I got a bit annoyed, and my students felt the same, so they introduced me to mlop. Here's the link for those who just want to log some data quickly without losing metrics to a dumb rate limit. Their API is apparently compatible with wandb, so import mlop as wandb should work:
GitHub - mlop-ai/mlop: Next Generation Experimental Tracking for Machine Learning Operations
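For anyone who would rather stay under the limit on the client side, here is a rough sketch of a byte-rate throttle (a simple token bucket) you could call before each upload. It is only an illustration: the class name `ByteRateThrottle` and the budget you pass in are my own assumptions, not anything from wandb or mlop, and I don't know the actual server-side threshold beyond the "few MB/s" my students mentioned.

```python
import time


class ByteRateThrottle:
    """Hypothetical helper: token bucket capping bytes handed to a logger per second."""

    def __init__(self, max_bytes_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(max_bytes_per_sec)
        # Allow a burst of up to one second's worth of data.
        self.capacity = float(max_bytes_per_sec)
        self.tokens = self.capacity
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last check, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def wait_time(self, nbytes):
        """Seconds to wait before nbytes may be sent (0.0 if allowed now)."""
        self._refill()
        deficit = nbytes - self.tokens
        return max(0.0, deficit / self.rate)

    def acquire(self, nbytes):
        """Block until nbytes fit within the budget, then spend them."""
        delay = self.wait_time(nbytes)
        if delay > 0:
            self.sleep(delay)
            self._refill()
        # A payload larger than the burst size leaves a debt that later calls pay off.
        self.tokens -= nbytes
```

You would call something like throttle.acquire(len(image_bytes)) right before logging each image, with the budget set to whatever value turns out to be safe for your account. It slows the run down rather than dropping data, which was the trade-off we wanted.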