Has anyone used wandb sweeps and torch.distributed before?

Hi! This is my first time posting here. One of my codebases uses torch.distributed for distributed training across multiple GPUs. Currently I write .sh scripts by hand to run hyperparameter sweeps. I was wondering if anyone has experience using wandb's sweep functionality to launch sweeps for torch.distributed training scripts.

Hi @kevin-miao,

We do not have a Sweeps example for torch.distributed, but an example of integrating wandb with torch.distributed can be found here. It should be fairly straightforward to extend that example to a sweep.

Please let me know if you face issues with this.
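To make that a bit more concrete, here is a minimal sketch of the usual pattern: the sweep agent launches one trial, the distributed launcher (e.g. torchrun) spawns one process per GPU, and only rank 0 calls `wandb.init` so each trial appears as a single run. Everything here is illustrative, not an official W&B example; the helper names (`use_wandb`, `main`) and the default config values are my own, and the `RANK` environment variable is the one torchrun sets for each worker.

```python
import os


def use_wandb(rank: int) -> bool:
    """Only rank 0 should call wandb.init, so each sweep trial
    shows up as a single run instead of one run per GPU."""
    return rank == 0


def main() -> None:
    # torch.distributed launchers such as torchrun set RANK
    # in each worker's environment.
    rank = int(os.environ.get("RANK", "0"))

    # Defaults; a sweep trial overrides these via wandb.config.
    config = {"lr": 1e-3, "batch_size": 32}

    if use_wandb(rank):
        import wandb  # imported lazily; only rank 0 needs it

        run = wandb.init(config=config)
        # When the script is launched by a sweep agent, run.config
        # holds the sweep-chosen hyperparameters.
        config = dict(run.config)

    # ... build the model, wrap it in DistributedDataParallel,
    # train, and have rank 0 call wandb.log({...}) ...


if __name__ == "__main__":
    main()
```

One caveat: by default a sweep agent runs `python your_script.py`, so to launch the script through torchrun instead you would override the launch command in the sweep YAML (the sweep config's `command` key supports this); check the Sweeps configuration docs for the exact macros. You may also want to broadcast the sweep-chosen config from rank 0 to the other ranks (e.g. with `torch.distributed.broadcast_object_list`) so all workers train with the same hyperparameters.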
