Create a sweep and start an agent for it in a shell script

I want to build a replication package for a paper, where we are using W&B to generate and store our results.

Ideally I would just give people a Bash or PowerShell script which creates the sweeps and starts an agent to run each one. In all cases it is sufficient to run a single agent on a single computer.

My ideal script would look something like

wandb sweep --name MyExperiment1 sweep_1.yaml
wandb agent  XXXXXXXX
wandb sweep --name MyExperiment2 sweep_2.yaml
wandb agent  YYYYYYYYY

etc. In all cases the yaml would be a grid with a finite number of iterations before finishing. I guess I could also do things like wandb agent XXXXXXX & to have a child process.

But I am not sure how to get the sweep ID returned by wandb sweep so that I can pass it to wandb agent. Are there any tricks? I guess I could also use the Python calls to create a sweep and an agent directly, but in that case I am not sure how to tell it to use a YAML file.

Alternatively, is this the sort of thing that a job queue is best used for? If so, any templates on how to handle that? I guess my shell script would just create all of the jobs for a queue and then a single agent would run it?

In a Python script you can load your YAML file into a dictionary and use that as the sweep config.
I think you would only have to ignore the “program” key.

@jlperla does the solution of using Python work for you? Here is our documentation on how to define a sweep in Python. If you were to try to programmatically start a sweep via CLI you would probably have to parse stdout from the wandb sweep command to get the sweep id.

Thank you,
Nate

Thanks @nathank. I decided to hack on the command line. ChatGPT and a response from W&B led me to the following code, which works in Git Bash on Windows and refers to a subfolder called replication_scripts where I keep my sweeps.

PROJECT_NAME="my_project" # swap out globally

run_sweep_and_agent () {
  # Set the SWEEP_NAME variable
  SWEEP_NAME="$1"
  
  # Run the wandb sweep command and store the output in a temporary file
  wandb sweep --project "$PROJECT_NAME" --name "$SWEEP_NAME" "replication_scripts/$SWEEP_NAME.yaml" >temp_output.txt 2>&1
  
  # Extract the sweep ID using awk (the three-argument match() is a gawk extension)
  SWEEP_ID=$(awk '/wandb agent/{ match($0, /wandb agent (.+)/, arr); print arr[1]; }' temp_output.txt)
  
  # Remove the temporary output file
  rm temp_output.txt
  
  # Run the wandb agent command (quoted so the ID is passed as a single argument)
  wandb agent "$SWEEP_ID"
}

# list of sweeps to call
run_sweep_and_agent "my_sweep_1"
run_sweep_and_agent "my_sweep_2"

I think a simpler regex-based approach could replace awk on non-Windows platforms, but I would guess this works elsewhere with minor modifications.
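For anyone on a system where awk is not gawk (the three-argument match() used above is a gawk extension), here is a minimal sketch of the same extraction using sed. The helper name extract_sweep_id is hypothetical. The sample line is an assumption about the shape of the wandb sweep output, inferred from what the awk pattern above looks for, not captured output.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads `wandb sweep` output on stdin and prints
# whatever follows the first "wandb agent " as the sweep path.
# The character class stops at whitespace or terminal colour codes.
extract_sweep_id () {
  sed -n 's/.*wandb agent \([[:alnum:]_./-]*\).*/\1/p' | head -n 1
}

# Assumed sample line, shaped like the one the awk pattern above matches.
sample='wandb: Run sweep agent with: wandb agent my_entity/my_project/abc123'
printf '%s\n' "$sample" | extract_sweep_id
```

Inside run_sweep_and_agent this would probably let you drop the temporary file, by piping the combined output of wandb sweep (with 2>&1) straight into extract_sweep_id within a command substitution. I have not tried that on every platform, so treat it as a starting point.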

All of this is to say that a new feature to create a sweep and immediately run it would be nice in the wandb sweep command itself. I added a GitHub issue proposing that.

Hi @jlperla, glad you were able to make this work! Yes, I can make a feature request for this. Something like a --start-agent flag you can pass to wandb sweep seems like a good solution. I’ll pass this on to the engineering team and follow up once they have a chance to look into it.

Ah, I see the feature request you made, and it looks like Raphael has already captured it. We will follow up on the GitHub thread once the team has had a chance to work on this.
