I want to build a replication package for a paper, where we are using W&B to generate and store our results.
Ideally I would just give people a bash or powershell script which creates the sweeps and creates an agent to run a sweep. In all cases it is sufficient to run a single agent on a single computer.
In all cases the YAML would be a grid with a finite number of iterations before finishing. I guess I could also do things like wandb agent XXXXXXX & to run the agent as a child process.
But I am not sure how to get the sweep ID returned by wandb sweep so that I can pass it to wandb agent. Are there any tricks? I guess I could also use the Python calls to create a sweep and an agent directly, but in that case I am not sure how to tell it to use a YAML file.
Alternatively, is this the sort of thing that a job queue is best used for? If so, any templates on how to handle that? I guess my shell script would just create all of the jobs for a queue and then a single agent would run it?
In the Python script you can load your YAML file into a dictionary and use that as the sweep config.
I think you would only have to ignore the “program” key.
@jlperla does the solution of using Python work for you? Here is our documentation on how to define a sweep in Python. If you were to try to programmatically start a sweep via CLI you would probably have to parse stdout from the wandb sweep command to get the sweep id.
Thanks @nathank, I decided to hack on the command line. ChatGPT and a response from W&B led me to the following code, which works in Git Bash on Windows and refers to a subfolder called replication_scripts where I keep my sweeps.
PROJECT_NAME="my_project" # swap out globally

run_sweep_and_agent () {
    # Name of the sweep, which is also the base name of its YAML file
    SWEEP_NAME="$1"
    # Run the wandb sweep command and store the output in a temporary file
    wandb sweep --project "$PROJECT_NAME" --name "$SWEEP_NAME" "replication_scripts/$SWEEP_NAME.yaml" >temp_output.txt 2>&1
    # Extract the sweep ID using awk (the three-argument match() requires GNU awk)
    SWEEP_ID=$(awk '/wandb agent/{ match($0, /wandb agent (.+)/, arr); print arr[1]; }' temp_output.txt)
    # Remove the temporary output file
    rm temp_output.txt
    # Run the wandb agent command (quoted so an empty or odd ID is not word-split)
    wandb agent "$SWEEP_ID"
}

# list of sweeps to call
run_sweep_and_agent "my_sweep_1"
run_sweep_and_agent "my_sweep_2"
I think a simpler regex-based approach could replace awk on non-Windows platforms, but I would guess this script works elsewhere with minor modifications.
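For anyone who wants to avoid the awk dependency, a more portable variant of the extraction step might look like the sketch below. It uses sed with a basic regular expression instead of the three-argument match(), which is a GNU awk extension. The names extract_sweep_id and run_sweep_and_agent_sed are hypothetical, and the sketch assumes (as the awk pattern above does) that wandb sweep prints a line containing "wandb agent" followed by the sweep path.

```shell
# Hypothetical helper (not part of wandb): reads captured `wandb sweep` output
# on stdin and prints the token that follows "wandb agent " on the first
# matching line. Prints nothing if no such line exists.
extract_sweep_id () {
    sed -n 's/.*wandb agent \([^[:space:]]*\).*/\1/p' | head -n 1
}

# Sketch of a variant that pipes the output straight into the helper instead
# of using a temporary file. PROJECT_NAME and the replication_scripts/ layout
# are taken from the script above.
run_sweep_and_agent_sed () {
    local sweep_name="$1" sweep_id
    sweep_id=$(wandb sweep --project "$PROJECT_NAME" --name "$sweep_name" \
        "replication_scripts/$sweep_name.yaml" 2>&1 | extract_sweep_id)
    # Stop early if no sweep ID was found (e.g. the sweep could not be created)
    if [ -z "$sweep_id" ]; then
        echo "could not find a sweep ID for $sweep_name" >&2
        return 1
    fi
    wandb agent "$sweep_id"
}
```

The calls at the bottom of the script would then become run_sweep_and_agent_sed "my_sweep_1" and so on. The empty-ID check is an addition of this sketch, so that a failed wandb sweep does not end in a confusing wandb agent error.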
All of this is to say that a new feature to create a sweep and immediately run it would be nice in the wandb sweep command itself. I opened a GitHub issue proposing that.
Hi @jlperla, glad you were able to make this work! Yes, I can make a feature request for this. Something like a --start-agent flag that you can pass to wandb sweep seems like a good solution. I’ll pass this on to the engineering team and follow up once they have had a chance to look into it.
Ah, I see the feature request you made, and it looks like Raphael has already captured it. We will follow up on the GitHub thread once the team has had a chance to work on this.