Hi! My first time posting here. One of my code bases uses torch.distributed for distributed training over different GPUs. Currently, I am writing .sh scripts to deal with hyperparameter sweeping. I was wondering if anyone had experience with using wandb sweep functionality to launch sweeps for torch.distributed training scripts.
We don't currently have an example that combines torch.distributed with Sweeps, but an example of integrating wandb with torch.distributed can be found here. Extending that example to a sweep should be fairly straightforward; a sketch is below.
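The usual pattern is to let the sweep agent launch your training script with hyperparameters as command-line flags, have every spawned rank parse those same flags (so all workers agree on the configuration), and have only rank 0 talk to wandb. Here is a minimal sketch of that pattern. The file name `sweep_train.py`, the `lr`/`batch_size` parameters, the port, and the two-process world size are all assumptions for illustration, and the `gloo` backend is used so the sketch runs on CPU; for real GPUs you would use `nccl` and pass `device_ids=[rank]` to DDP.

```python
# sweep_train.py -- a minimal sketch, not an official example.
# The wandb sweep agent invokes this script with hyperparameters as
# CLI flags (e.g. --lr=0.01 --batch_size=64). Every spawned worker
# parses the same flags, so all ranks see identical hyperparameters.
# Only rank 0 initializes and logs to wandb, so each sweep trial
# appears as a single run.

import argparse
import os

import torch
import torch.distributed as dist
import torch.multiprocessing as mp
import wandb


def train(rank, world_size, args):
    # Hypothetical single-node rendezvous; adjust for your cluster.
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "12355"
    # gloo so this runs on CPU; use "nccl" for GPU training.
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    if rank == 0:
        # The agent's environment variables are inherited by this
        # child process, so wandb.init attaches to the sweep trial.
        wandb.init(config=vars(args))

    model = torch.nn.Linear(10, 1)  # dummy model for illustration
    ddp_model = torch.nn.parallel.DistributedDataParallel(model)
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=args.lr)

    for step in range(100):
        optimizer.zero_grad()
        x = torch.randn(args.batch_size, 10)  # dummy data
        loss = ddp_model(x).pow(2).mean()
        loss.backward()
        optimizer.step()
        if rank == 0:
            wandb.log({"loss": loss.item()})

    if rank == 0:
        wandb.finish()
    dist.destroy_process_group()


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--batch_size", type=int, default=32)
    args = parser.parse_args()

    world_size = 2  # e.g. two processes / GPUs
    mp.spawn(train, args=(world_size, args), nprocs=world_size)
```

You would then point a sweep config at this script (a YAML with `program: sweep_train.py` and a `parameters` block for `lr` and `batch_size`), create the sweep with `wandb sweep sweep.yaml`, and run `wandb agent <sweep_id>`; each trial the agent launches spawns the distributed workers itself.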
Please let me know if you run into any issues with this.