Artefact upload very slow

boscience · December 2, 2022, 2:44pm

I’m encountering a similar issue to the one reported here: Programmatically accessing artifact object very slow for first call for large artifacts

I’ve found artefacts to be an excellent way to store the full outputs of my models for later debugging. However, as I’m training information retrieval models, my artefacts are rather large (~300 MB). I’m only storing the titles of my documents, but even so, each evaluation example has around 300 titles as output.

At the end of each WANDB run it takes a couple of hours for the run to sync. I’m running the experiments on GCP VMs so internet speed should not be an issue.

Do you have any ideas on how I could speed up the sync time?

As I’m running multiple experiments sequentially, atm the experiments are blocked by WANDB upload time. As a quick workaround, I’m thinking of disabling automatic syncing in my scripts and running a wandb sync; sleep loop in a parallel process in the same directory. Does that sound like a reasonable way to go forward?
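A minimal sketch of the workaround described above, assuming the training scripts run with the environment variable WANDB_MODE=offline so that runs are written to local offline-run-* directories under ./wandb. The function names, the directory default and the 5-minute interval are illustrative choices, not anything confirmed in this thread, and whether wandb sync copes with a run that is still being written is something to verify before relying on it.

```python
import glob
import os
import subprocess
import time


def find_offline_runs(wandb_dir="wandb"):
    """Return offline run directories under wandb_dir, sorted by name."""
    # Assumption: offline runs are stored as <wandb_dir>/offline-run-<timestamp>-<id>
    return sorted(glob.glob(os.path.join(wandb_dir, "offline-run-*")))


def sync_once(wandb_dir="wandb"):
    """Call `wandb sync <run_dir>` for each offline run; return the dirs tried."""
    run_dirs = find_offline_runs(wandb_dir)
    for run_dir in run_dirs:
        # check=False: a failed upload is simply retried on the next pass
        subprocess.run(["wandb", "sync", run_dir], check=False)
    return run_dirs


def sync_loop(wandb_dir="wandb", interval_s=300):
    """Sync in a loop so uploads do not block the experiment process."""
    while True:
        sync_once(wandb_dir)
        time.sleep(interval_s)


if __name__ == "__main__":
    sync_loop()
```

Started in a second terminal (or as a background process) from the same directory as the experiments, this keeps uploading while the next experiment runs.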

mohammadbakir · December 2, 2022, 10:46pm

Hi @boscience, thank you for writing in and providing insight/feedback about artifact uploads. Our eng team is prioritizing improvements to artifact usage and the upload workflow that will significantly reduce upload times. These improvements will roll out early next year.

Regarding the problem you are facing, wandb writes artifacts through the cache. Files are uploaded asynchronously when you call log_artifact, so the upload shouldn’t be blocking your experiments.

Are you using the latest wandb client version?
Are you using artifact.wait() anywhere in your script?
Which method calls are you using to upload artifacts?
Are the large artifacts single models or model checkpoint versions being constantly uploaded?

boscience · December 5, 2022, 2:03pm

Hi @mohammadbakir,

Are you using the latest wandb client version?

Yes, I’m using Python client, version 0.13.5.

Are you using artifact.wait() anywhere in your script?

No

Which method calls are you using to upload artifacts?

I’m using only wandb.log statements, as the model only requires a single training step after setup. I make multiple calls to wandb.log with commit=False and then a single call with commit=True; that’s the call where I log the artifact, which is a wandb.Table.

Are the large artifacts single models or model checkpoint versions being constantly uploaded?

The large artifact is a wandb.Table.

boscience · December 6, 2022, 9:27am

At the end of my experiments, it hangs at the syncing step:

wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run hopeful-firebrand-60
wandb: ⭐️ View project at https://wandb.ai/boclips/search-eval
wandb: 🚀 View run at https://wandb.ai/boclips/search-eval/runs/2iid9htu

mohammadbakir · December 8, 2022, 11:06pm

Thank you for the update, @boscience. Could you provide us with the debug.log and debug-internal.log files for the runs that are hanging? They will provide additional clues as to what is occurring. Please send them to support@wandb.ai and include my name in the subject line, thank you.
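For anyone gathering these files, here is a rough sketch that bundles them into one zip. It assumes the usual local layout, in which each run directory under ./wandb has a logs/ subfolder holding debug.log and debug-internal.log; the function name and the output filename are made up for illustration.

```python
import zipfile
from pathlib import Path


def collect_debug_logs(wandb_dir="wandb", out_zip="wandb-debug-logs.zip"):
    """Zip the debug*.log files of every run directory; return the files found."""
    wandb_path = Path(wandb_dir)
    # Assumed layout: <wandb_dir>/run-<ts>-<id>/logs/debug.log (and debug-internal.log);
    # the pattern also matches offline-run-* directories.
    log_files = sorted(
        p for p in wandb_path.glob("*run-*/logs/debug*.log") if p.is_file()
    )
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in log_files:
            # Keep the run directory in the archive path so logs stay distinguishable
            zf.write(p, arcname=str(p.relative_to(wandb_path)))
    return log_files


if __name__ == "__main__":
    for path in collect_debug_logs():
        print(path)
```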

mohammadbakir · December 14, 2022, 6:38pm

Hi @boscience, since we have not heard back from you, we are going to close this request. If you would like to re-open the conversation, please let us know!

system · February 12, 2023, 6:38pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
