Using wandb sweep with torch.distributed.launch


I am using wandb sweep to perform hyperparameter tuning.

Basically, when I launch the agent with “wandb agent <USERNAME/PROJECTNAME/SWEEPID>”, it automatically runs “/usr/bin/env python --param1=value1 --param2=value2” according to the sweep configuration.

However, my code is based on torch distributed data parallel, so it has to be launched with torch.distributed.launch rather than plain python.

How can I tackle this problem?

Hi @rash!

Thanks for writing in. You can change the command that the agent runs by specifying the command structure in your sweep config. Specifically, you can change the interpreter variable to switch to torch.distributed.launch. Here is a link to our docs regarding how this can be done.
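As a rough sketch of what that command structure looks like: the default command corresponds to the four macros below, and the list can be reordered or extended under the `command` key of the sweep config. The comments describe what each macro expands to; the exact values depend on your own config.

```yaml
# Default command structure (equivalent to: /usr/bin/env python <program> --param1=value1 ...)
command:
  - ${env}          # /usr/bin/env on Unix-like systems
  - ${interpreter}  # python
  - ${program}      # the script named by the top-level `program` key
  - ${args}         # hyperparameters in the form --param1=value1 --param2=value2
```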

Thanks Ramit

I have followed what you suggested but I am still unable to run with torch.distributed.launch.

Below is my configuration yaml file.

    method: random
    metric:
      name: total_mean_rank_sum
      goal: minimize
    command:
      - ${env}
      - torch.distributed.launch
      - ${program}
      - ${args}
      #- python
      #- python -m torch.distributed.launch --nproc_per_node=4 -m torch.distributed.launch --nproc_per_node=4
    parameters:
      coef_lr:
        min: 0.0
        max: 0.01
      lr:
        min: 0.0
        max: 0.01
      sim_header:
        values: ["meanP", "seqLSTM", "seqTransf"]


When I launch an agent, it runs /usr/bin/env torch.distributed.launch --coef_lr=0.0068455254534794605 --lr=0.008759887226936639 --sim_header=seqTransf

What I really need is /usr/bin/env python -m torch.distributed.launch --coef_lr=0.0068455254534794605 --lr=0.008759887226936639 --sim_header=seqTransf

I would like to find some descriptive examples.

Hey @raeh,

The following should work in this case then:

    - ${env}
    - ${interpreter}
    - "-m"
    - "torch.distributed.launch"
    - ${program}
    - ${args}
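For context, here is a sketch of how that list might sit inside the full sweep config from the question. Two things in it are assumptions and not from the thread: the script name `train.py` is hypothetical (the original config does not show a `program` key), and `--nproc_per_node=4` is borrowed from the commented-out line in the question.

```yaml
program: train.py              # hypothetical script name; replace with your own
method: random
metric:
  name: total_mean_rank_sum
  goal: minimize
command:
  - ${env}
  - ${interpreter}
  - "-m"
  - "torch.distributed.launch"
  - "--nproc_per_node=4"       # assumption: 4 processes per node, as in the commented-out line
  - ${program}
  - ${args}
parameters:
  coef_lr:
    min: 0.0
    max: 0.01
  lr:
    min: 0.0
    max: 0.01
  sim_header:
    values: ["meanP", "seqLSTM", "seqTransf"]
```

With a config like this, the agent should run something along the lines of /usr/bin/env python -m torch.distributed.launch --nproc_per_node=4 train.py --coef_lr=... --lr=... --sim_header=... Note that torch.distributed.launch also passes a --local_rank argument to the script (unless --use_env is given), so the script's argument parser needs to accept it alongside the sweep parameters.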

