Distributed data parallel with PyTorch Lightning

Hi there,

I’m training over multiple GPUs using Lightning’s DDP strategy, and each GPU process creates a new experiment on W&B. Is it possible to track only the rank 0 process? I can see how to do this in plain PyTorch, but not while using Lightning’s WandbLogger.
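
For reference, this is roughly what I mean by the plain PyTorch approach (a minimal sketch; the project name is just a placeholder):

```python
import os

import wandb

# Only initialise a W&B run on the rank 0 process; all other ranks skip logging.
# RANK is set by torchrun / torch.distributed launchers.
if int(os.environ.get("RANK", 0)) == 0:
    run = wandb.init(project="my-project")  # placeholder project name
```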

Thanks!

Hi @chris-pedersen, thank you for reaching out with your question. While tracking only the rank zero process is not currently possible, when using WandbLogger with multiple GPUs only one run should be created in W&B.

Would you mind sharing the URL of the workspace where you see multiple runs being created, the versions of PTL and wandb you are currently running, and a snippet of code showing how the WandbLogger and PTL Trainer are configured?
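
For reference, a typical configuration we would expect looks roughly like the sketch below. The project name is a placeholder, and the exact Trainer arguments may differ depending on your PTL version (for example, older releases use `gpus=4` instead of `accelerator`/`devices`):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import WandbLogger

# A single WandbLogger instance is passed to the Trainer; under DDP the logger
# is expected to create a run on the rank 0 process only.
wandb_logger = WandbLogger(project="my-project")  # placeholder project name

trainer = Trainer(
    logger=wandb_logger,
    accelerator="gpu",
    devices=4,       # number of GPUs
    strategy="ddp",
)
```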

Thanks,
Francesco

Hi @chris-pedersen , I wanted to follow up on this request. Please let us know if we can be of further assistance or if your issue has been resolved.

Hi @chris-pedersen, since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!