How can I update custom plots in real-time?

I’m currently using a custom line-plot to plot some metrics on one graph. Here’s an example plot that tracks train and valid loss over a number of epochs:

[Image: line plot of train and valid loss vs. epoch]

Here’s my current procedure to create this plot:

  1. First I create an empty W&B Table (let’s call it loss_table)
  2. At the end of each epoch, I calculate the train and valid loss and add them to the Table with the loss_table.add_data() method.
  3. Then at the end of training, I log loss_table to W&B.
  4. Finally, I create the chart from a Vega spec (spec_name) with chart = wandb.plot_table(vega_spec_name=spec_name, data_table=loss_table, ...) and log it with wandb.log({"loss vs epoch": chart}).

This gives me the chart but doesn’t let me see how the metrics change in real time. Given that my training runs are long (on the order of days to weeks), it is pretty important to me to see this in real time.

My main problem is that W&B doesn’t support updating the rows of a table that has already been logged; a table can only be uploaded once. This prevents me from updating the metrics and logging them to W&B after each epoch, so the table can only be uploaded once training is finished. There is a GitHub issue that talks about this.

Any suggestions would be welcomed. Not sure how to get around this.

Hello! This flow might not be ideal (and we’re working on it), but the current best thing to do is probably what Carey’s suggesting in the GitHub issue:

Hi there, we can’t support adding new rows to existing tables that you’ve already logged, but here are two approaches:

  1. Keep the wandb.Table locally and add new rows to it. Once you’ve got all the rows in, call wandb.log.
  2. Keep logging the same table at each step, and just add new rows to it. The final table you log will have all the rows you want, and you’ll be able to see the latest table logged in the UI. This would be risky if you have large table sizes.
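
For what it’s worth, the second approach could look roughly like the sketch below. It assumes the losses are kept in a plain Python list and that a fresh wandb.Table is built from that list at every epoch, rather than adding rows to a table object that has already been logged. The helper names (record_epoch, log_snapshot), the column names, and the fields mapping are made up for illustration; spec_name is the Vega spec from the question.

```python
COLUMNS = ["epoch", "train_loss", "valid_loss"]  # assumed column names


def record_epoch(rows, epoch, train_loss, valid_loss):
    """Append one epoch's losses to the locally kept list of rows."""
    rows.append([epoch, float(train_loss), float(valid_loss)])
    return rows


def log_snapshot(run, rows, spec_name, fields):
    """Log a chart backed by a fresh table holding every row collected so far."""
    import wandb  # imported here so record_epoch also works without wandb installed

    # Build a new Table on every call instead of mutating one that was already logged.
    table = wandb.Table(columns=COLUMNS, data=[list(r) for r in rows])
    # `fields` maps the Vega spec's field names to table columns (assumption:
    # it depends entirely on how your own spec is written).
    chart = wandb.plot_table(vega_spec_name=spec_name, data_table=table, fields=fields)
    run.log({"loss vs epoch": chart})
```

In the training loop you would call record_epoch(...) and then log_snapshot(run, rows, spec_name, fields) at the end of each epoch, where run comes from wandb.init(). Every call uploads the whole table again, so this is only sensible while the table stays small.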

Let me know if you have more questions or feature suggestions for the product though. 🙂


Thanks for your reply.

  2. Keep logging the same table at each step, and just add new rows to it. The final table you log will have all the rows you want, and you’ll be able to see the latest table logged in the UI. This would be risky if you have large table sizes.

Sounds like this would be the option. If I did this, is there a way I could delete the old table after I create the new one (preferably via a Python command)?

Sure thing, happy to help! Because of how Artifacts work, the files within your Tables wouldn’t be duplicated, even if there are many, many versions of them. The actual files defining Tables are also usually really small (several kilobytes, or megabytes at most if your Table is really large), so you could easily keep all the versions of your Tables this way.

In terms of deleting the older versions of those Tables (which, again, probably wouldn’t make much of a difference in terms of storage), you can do that in the UI, or in Python with the code below, which is also covered in the docs.

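import wandb
run = wandb.Api().run("<entity>/<project>/<run_id>")  # placeholder path: fill in your own
for artifact in run.logged_artifacts():
    artifact.delete(delete_aliases=True)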
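
Note that the loop above deletes every artifact the run has logged. If you’d rather keep the newest version of each Table and only remove the older ones, something like this sketch might work. It assumes the version you want to keep carries the "latest" alias; delete_stale_versions is a hypothetical helper, not part of wandb.

```python
def delete_stale_versions(artifacts, keep_alias="latest"):
    """Delete every artifact version that does not carry `keep_alias`.

    Returns the names of the deleted versions so you can check what was removed.
    """
    deleted = []
    for artifact in artifacts:
        if keep_alias in artifact.aliases:
            continue  # keep the newest version
        artifact.delete(delete_aliases=True)
        deleted.append(artifact.name)
    return deleted
```

You would pass it run.logged_artifacts(), with run fetched through wandb.Api(). Keep in mind that this list covers everything the run has logged, not just Tables, so filter it first if the run also logged artifacts you want to keep in full.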


Also, here’s some more info on how Artifacts work: https://docs.wandb.ai/guides/artifacts/artifacts-core-concepts#storage-layout

Thanks for your help! I will give it a go.