Log multiple variables on the same plot - multi-GPU version

I have a multi-GPU training process. At each step, I want to log a certain metric for each GPU so that all the graphs (one per GPU) are presented on the same chart.
The only answer I found that addresses the problem was this: Log multiple variables at the same plot. However, it is less convenient when the number of GPUs changes from one training run to the next. Is there a more flexible way to display this information?

Hello @xenia-kra !

Since you want to log from each process, I believe you would like to log using many processes (as outlined here in our docs). The only difference between the docs and your situation is that you would not need to group the runs in the UI, since each GPU would be logging separately.

@raphael-sanandres thank you for your comment.
However, I'm still not sure how I should configure this properly.
Right now, we initialize a single wandb connector, which we use under the master process only. If I send logs outside the master condition, I see multiple runs in the wandb console, which unfortunately is not what I'm looking for. I would like to see a single chart comprising the metric graphs for every process under the same run (master only). Is that possible? Which configuration is responsible for it? Thanks

You would have two options.

  1. One Process

    • Right now, you are logging via one master process (which is this method). By logging via one process, you will see a single graph, the one that the rank 0 process logs; only one run will log to the chart in the UI.
  2. Many processes

    • You can log each process as its own run in the wandb UI and set a group in wandb.init() [for example: run = wandb.init(entity=args.entity, project=args.project, group="DDP")]. From there, you can group the runs in the UI to aggregate the runs of a single group.

However, we do not support graphing multiple different processes within the same run.
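
The "Many processes" option could be sketched roughly as follows. This is a minimal illustration, not an official recipe: it assumes the launcher (for example torchrun) exports a RANK environment variable, and the helper names build_init_kwargs and train, the "rank-N" run naming scheme, and the placeholder loss value are all hypothetical.

```python
import os


def build_init_kwargs(entity: str, project: str, group: str = "DDP") -> dict:
    """Build keyword arguments for wandb.init() for the current process."""
    # Assumption: the launcher (e.g. torchrun) exports RANK; fall back to 0.
    rank = int(os.environ.get("RANK", "0"))
    return {
        "entity": entity,
        "project": project,
        "group": group,          # same group for every process of one job
        "name": f"rank-{rank}",  # hypothetical naming scheme: one run per GPU
    }


def train(entity: str, project: str, num_steps: int = 3) -> None:
    """Log one metric per step from the current process into its own run."""
    # Imported here so the helper above can be used without wandb installed.
    import wandb

    run = wandb.init(**build_init_kwargs(entity, project))
    for step in range(num_steps):
        loss = 1.0 / (step + 1)  # placeholder for the real per-GPU metric
        run.log({"loss": loss})
    run.finish()
```

Every process calls train() with the same entity and project and logs the same key ("loss") into its own run, so the number of GPUs does not need to be known in advance. The runs can then be grouped in the UI by the shared group value, as described in option 2.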

Hi Xenia, since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!
