WandB syncing to local server?

When I run a script that has wandb logging active, it creates a ‘wandb’ folder in the working directory, where I’m assuming all run-related data is stored.

While running the online version, my script would finish in a few minutes, but uploading the data and having it show up on the dashboard took much longer (e.g. the script had finished running 10k iterations, but syncing/uploading was still going on, and the dashboard had not caught up: it was displaying data only up to 3k iterations and advancing very slowly).
Assuming the ‘upload’ of data was the bottleneck, I set up and ran a locally hosted server, and was surprised to see the exact same behaviour there as well…
So is there any way to make the locally hosted server read directly from the ‘wandb’ folder that already exists locally, instead of syncing with the local server again, so that the dashboard updates quicker…?
Coming from tensorboard, is there any tool/way to view the dashboard locally that loads files directly from the local ‘wandb’ directory? (something along the lines of tensorboard --logdir=)

(I would love to use wandb as an in-house alternative to tensorboard, since wandb.log is an order of magnitude faster than SummaryWriter.add_scalar. However, the dashboard syncing of tensorboard is way faster than that of wandb, or rather the dashboard syncing of wandb is absurdly slow for the small and light use cases I’m using it for…even on locally hosted servers : ( )

Hi @mohitjavale, thank you for reaching out with your questions.

Currently, it is not possible to have the locally hosted server read directly from the wandb run folders. However, the slow syncing to the dashboard that you are describing is not the expected behaviour.

Would you mind sharing some additional information such as:

  • What version of the wandb SDK you have installed?
  • A snippet of code showing the type of data you are logging to W&B
  • The URL for the workspace when you tried logging to the W&B cloud platform
  • The debug.log and debug-internal.log files, which you can find in the ./wandb/run__/logs folder

This info would help us investigate why the data is taking longer than expected to sync, both on the cloud and locally.
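
For anyone gathering the same files, here is a minimal sketch for listing them. It assumes the default layout, where each run directory under ./wandb contains a logs subfolder; the function name `find_wandb_debug_logs` is made up for illustration and is not part of the wandb SDK.

```python
from pathlib import Path


def find_wandb_debug_logs(root="."):
    """Return debug.log / debug-internal.log paths found under <root>/wandb/*/logs.

    Assumption: run directories sit directly under <root>/wandb and each
    holds a "logs" subfolder. Adjust the pattern if your layout differs.
    """
    wandb_dir = Path(root) / "wandb"
    # "debug*.log" matches both debug.log and debug-internal.log.
    return sorted(p for p in wandb_dir.glob("*/logs/debug*.log") if p.is_file())


if __name__ == "__main__":
    for path in find_wandb_debug_logs():
        print(path)
```

If the wandb folder also contains a link pointing at the most recent run, the same files may be listed twice, once per path.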

Hey @fmamberti-wandb, thanks for the quick reply.

After a bit more experimentation and digging around, I’ve come to realise that this slow syncing is a problem specific to the test script I was running (or, more generally, I suppose, to any case where data is logged and one iteration of the loop finishes in less than a certain threshold, somewhere around 0.002 seconds I think, which I’m guessing is about the time the sync takes to send one iteration’s data).

For other actual applications that I tried later, such as network training (iteration times usually greater than 0.002 secs), I’m happy to say this is not an issue at all :slight_smile: .
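
If a loop really does iterate faster than that threshold, one possible workaround (not something suggested in this thread) is to log less often. The sketch below averages metrics over a window of steps and makes one log call per window. `ThrottledLogger` and its `every` parameter are hypothetical names, and the code assumes every metric is numeric and present on each call.

```python
from collections import defaultdict


class ThrottledLogger:
    """Average metrics over `every` steps and emit one log call per window."""

    def __init__(self, log_fn, every=100):
        if every < 1:
            raise ValueError("every must be >= 1")
        self.log_fn = log_fn  # any callable taking (dict, step=...), e.g. wandb.log
        self.every = every
        self._sums = defaultdict(float)
        self._count = 0
        self._last_step = None

    def log(self, metrics, step):
        """Accumulate one step's metrics; flush once the window is full."""
        for key, value in metrics.items():
            self._sums[key] += float(value)
        self._count += 1
        self._last_step = step
        if self._count >= self.every:
            self.flush()

    def flush(self):
        """Emit the average of the accumulated metrics, if there are any."""
        if self._count == 0:
            return
        averaged = {k: total / self._count for k, total in self._sums.items()}
        self.log_fn(averaged, step=self._last_step)
        self._sums.clear()
        self._count = 0


# Usage sketch (assumes wandb.init() has already been called):
#   logger = ThrottledLogger(wandb.log, every=100)
#   for i in range(10_000):
#       logger.log({"loss": compute_loss(i)}, step=i)  # compute_loss is a placeholder
#   logger.flush()  # emit the final, possibly partial, window
```

The trade-off is resolution: the dashboard shows one averaged point per window instead of every iteration.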


Either way, this is the additional info for recreation in case it helps -

  • wandb version - 0.16.4 (obtained using ‘pip list’)

  • Code Snippet - (Just trying out wandb using some dummy code : )

  • Results -

    • Loop execution ~ 10 sec ; code execution ends ~ 143 sec later (after the loop, only wandb.finish() is called. I’m assuming the syncing of data continues for these 143 sec, since the dashboard is also only gradually updated over these 143 seconds)
    • With ‘wandb offline’ - Loop execution ~ 10 sec ; code execution ends ~ 24 sec later (wandb.finish() alone takes 24 sec, even with syncing completely disabled :confused: )
  • Comparison to Tensorboard - Loop execution ~ 22 sec. Code execution ends right after the loop finishes. (writer.add_scalar() is 10x slower than wandb.log(), which I confirmed accounts for the longer loop execution time.)

  • Local-hosted server vs Cloud server? - Surprisingly, not much difference in sync lag between the local and cloud servers. With the local server, no upload of data over the network takes place (the bottleneck on slow networks). It won’t affect many, but for the few ultra-efficient, the dashboard page loads and refreshes quicker than in the cloud version (1 sec instead of 3 sec) XD.
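
The original dummy script is not reproduced in this thread. As a rough sketch of how timings like the ones above could be measured, the helper below times the logging loop and the finishing call separately. `time_logging` is a hypothetical helper, the 10,000 default mirrors the iteration count mentioned in the first post, and the logged metric is dummy data.

```python
import time


def time_logging(log_fn, finish_fn, n_iters=10_000):
    """Time a tight logging loop and the finishing call separately.

    log_fn    : callable taking a dict of metrics, e.g. wandb.log
    finish_fn : callable with no arguments, e.g. wandb.finish
    """
    t0 = time.perf_counter()
    for i in range(n_iters):
        log_fn({"value": i})  # dummy metric, one call per iteration
    t1 = time.perf_counter()
    finish_fn()  # with wandb, any remaining syncing would be waited on here
    t2 = time.perf_counter()
    return {"loop_s": t1 - t0, "finish_s": t2 - t1}


# Usage sketch with wandb (assumes wandb is installed and configured;
# the project name is a placeholder):
#   import wandb
#   wandb.init(project="sync-timing-test")
#   print(time_logging(wandb.log, wandb.finish))
```

Passing the two callables in keeps the helper independent of any particular logger, so the same function could time a TensorBoard writer by wrapping its calls in small adapter functions.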


TL;DR for anyone else referring to this thread -
wandb syncing won’t lag behind your code execution as long as 1 iteration of your loop takes longer than ~0.002 seconds (shorter iterations usually only happen while trying out wandb for the first time with dummy data :slight_smile: ).

(Would still love it if the local server could read directly from the directory for some of the smaller-scale experiments, maybe as a future feature. Switching to wandb anyway :heart: )

Hi @mohitjavale, it’s great to know that you are not experiencing latency during training, and thank you for providing all the details to reproduce the issue you originally had.

I will raise your request to visualize an experiment locally loading files from W&B before the data is synced with the server as a Feature Request for our Product team.