Does a Bayes search use info from past sweeps?

I often perform several sweeps, for several reasons (limited computational resources, crashes, adjusting hyperparameter ranges, etc.).

If I start another Bayes sweep, will it take the results of runs from the already finished or stopped sweeps into account? Do I need to narrow down the ranges based on my interpretation of the results from the last sweep, or will the Bayes search do that automatically?

Also: does the run_cap influence how soon the search switches from exploration to optimization?

hey @tim-kuipers - if you start up a new sweep, it will be independent of the previous one. I will look into whether there's a way to circumvent this, or to add a condition that prevents duplicate parameters from being tried in the new sweep, but for now I would recommend narrowing the ranges in the new sweep.

run_cap solely controls the maximum number of runs per sweep.
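For reference, run_cap sits at the top level of the sweep configuration. A minimal sketch (the metric and parameter names here are made-up placeholders):

```python
# Minimal sweep configuration illustrating run_cap.
# "loss" and "lr" are placeholder names, not from the thread above.
sweep_configuration = {
    "method": "bayes",
    "metric": {"goal": "minimize", "name": "loss"},
    "parameters": {
        "lr": {"min": 0.0001, "max": 0.1},
    },
    # The sweep stops scheduling new runs once 50 runs exist;
    # it does not change how the search explores within those runs.
    "run_cap": 50,
}
```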

Thanks for your help.

I’m not really concerned with duplicate parameters being tried out; my concern is that subsequent bayes searches don’t learn from previous results. Doing more sweeps doesn’t make the result any better, because a new sweep doesn’t take the results of the last one into account.

I thought that Bayes search starts out doing more exploration and, toward the run_cap, switches to just trying to find the optimum. However, looking at the docs again, I seem to have been mistaken.

hey @tim-kuipers - here is an article that details the Bayesian hyperparameter optimization we have set up in our SDK, which is mostly based on this paper right here. Please let me know if you have any more questions about this!

Did you ever find this out? Did you make a feature request for this?

The article you just linked doesn’t say which surrogate model is actually used. Is that confidential information?

BOHB needs a couple of random trials before we can fit the surrogate model. Generally people start out with 3 or 5 random trials. How many random runs does wandb do before employing the surrogate model?

Hey @tim-kuipers! Apologies for the delay in response. I’ll take this ticket over since Uma is currently OOO. Unfortunately when you use sweeps for a Hyperparameter Search, the two different sweeps do not learn from each other.

I have submitted a feature request for you for this to be possible^

I have checked internally with our Sweeps team, and it takes 2 random runs before employing the surrogate model.
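To illustrate what that warm-up means in practice, here is a toy, pure-Python sketch (not W&B's actual implementation) of a Bayesian-style loop that samples randomly for the first two trials and only then switches to a model-guided suggestion:

```python
import random

N_WARMUP = 2  # random trials before any model-guided suggestions


def objective(x):
    # Toy one-dimensional function to minimize.
    return (x - 0.3) ** 2


def suggest(history):
    """Pick the next x: random during warm-up, model-guided afterwards."""
    if len(history) < N_WARMUP:
        return random.uniform(0.0, 1.0)
    # Trivial stand-in for a surrogate + acquisition function:
    # sample candidates and pick the one closest to the best point so far.
    best_x, _ = min(history, key=lambda h: h[1])
    candidates = [random.uniform(0.0, 1.0) for _ in range(20)]
    return min(candidates, key=lambda c: abs(c - best_x))


history = []  # list of (x, score) pairs observed so far
for _ in range(10):
    x = suggest(history)
    history.append((x, objective(x)))

best_x, best_score = min(history, key=lambda h: h[1])
```

The point of the sketch is only the control flow: the first N_WARMUP suggestions ignore past results entirely, and everything after them is conditioned on the history collected within that same sweep.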

Hi there, I wanted to follow up on this request. Please let us know if we can be of further assistance or if your issue has been resolved.

Thank you for the information.

You have submitted a feature request. Could you tell me whether it has been resolved?

Is it possible to continue a crashed sweep without starting a new one?
Is it possible to adjust the hyperparameter ranges of an existing sweep?
I start my sweeps from code - not from the command line.

No problem! Always happy to help :slight_smile:

Is it possible to continue a crashed sweep without starting a new one?

In this case are you interested in resuming a run that failed as a part of the sweep, or are you more interested in resuming a crashed sweep agent itself?

Is it possible to adjust the hyperparameter ranges of an existing sweep?

Unfortunately editing hyperparameters of a sweep that already exists is not possible.

In that case my feature request is still valid.

I meant resuming a crashed sweep.

I meant resuming a crashed sweep.

Thank you for elaborating, Tim!

Would it be possible for you to send me a link to a crashed sweep you are referring to?

I’m sorry. That wouldn’t be possible, but it’s easy to reproduce: just start a sweep and then force kill the sweep from Task manager / Monitor.

Interesting, I’ve been trying to get the sweep to a crashed state for a bit now, and I keep on having it end up in the “running” state. Could you please confirm that the status of your sweep itself is marked as crashed?

Hi there, I wanted to follow up on this request!

I’m sorry. I think that might just be it. It was quite a while back that I was using wandb intensively. I wouldn’t expect or require wandb to actually show “Crashed” somewhere - I’m just talking about what happens when you crash the application by killing it.

Gotcha! no worries at all, Tim!

When your sweep has crashed but is still marked as running, like in the picture above, you can go to that sweep's overview and resume the agent from where you left off, using the command provided in the sweep overview.

  • If you are using the bayes algorithm for the sweep, then you can also delete some of your old sweep runs that crashed / that you are interested in rerunning, and the agent will rerun those values right after.
  • If you are using any other algorithm, you are unable to rerun sweep runs that have already been started.
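If you want to find and delete the crashed runs of a sweep programmatically rather than through the UI, the public API can be used for that. A sketch, where the entity, project, and sweep ID are placeholders, and the filter keys follow the MongoDB-style filter syntax (so double-check them before relying on this):

```python
def delete_crashed_runs(entity, project, sweep_id):
    """Find and delete crashed runs of a sweep via the public API (sketch)."""
    import wandb  # imported lazily so the sketch stays self-contained

    api = wandb.Api()
    # "sweep" and "state" as filter keys are assumptions based on the
    # MongoDB-style filter syntax the public API accepts.
    runs = api.runs(
        f"{entity}/{project}",
        filters={"sweep": sweep_id, "state": "crashed"},
    )
    deleted = 0
    for run in runs:
        run.delete()
        deleted += 1
    return deleted
```

Deleting runs is irreversible, so it may be worth printing the matched run IDs and confirming before calling run.delete() in a real script.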

I am not using the wandb command line tool to start the sweep in the first place. I’m starting it from python code. I don’t think launching the agent will work in my case.


PS:

If you are using the bayes algorithm for the sweep, then you can also delete some of your old sweep runs that are crashed/ you are interested in rerunning, and the agent will rerun those values right after.

Only for Bayes? Are you sure? Seems to me that it only makes sense to rerun runs with some specific configuration when you are using the Grid method. Bayes and Random will have a very low chance to hit some exact configuration anyway.

Hi Tim!

Although we don’t have a direct way of resuming a sweep through a python file, you can resume it using a subprocess. It would look something like this:

import wandb

# 1: Define objective/training function
def objective(config):
    score = config.x**3 + config.y
    return score


def main():
    wandb.init(project="my-first-sweep")
    score = objective(wandb.config)
    wandb.log({"score": score})


# 2: Define the search space
sweep_configuration = {
    "method": "random",
    "metric": {"goal": "minimize", "name": "score"},
    "parameters": {
        "x": {"max": 0.1, "min": 0.01},
        "y": {"values": [1, 3, 7]},
    },
}

#------------------------------------------------------------------
# Resuming a running sweep
import subprocess

# Replace these with your actual entity, project, and sweep ID
entity = "your_entity"
project = "your_project"
sweep_id = "sweep_id_of_sweep_trying_to_resume"

# Mark the existing sweep as resumable using the CLI command
subprocess.run(["wandb", "sweep", "--resume", f"{entity}/{project}/{sweep_id}"])

# Attach an agent to the existing sweep ID instead of creating a new sweep
wandb.agent(sweep_id=sweep_id, entity=entity, project=project, function=main, count=15)

Only for Bayes? Are you sure? Seems to me that it only makes sense to rerun runs with some specific configuration when you are using the Grid method. Bayes and Random will have a very low chance to hit some exact configuration anyway.

That is a great catch ^, thank you for correcting me. I meant only for the Grid method and mistyped, my bad!

Thank you so much for putting this much effort into it.

If I understand the code correctly then wandb sweep --resume [sweep_id] doesn’t actually resume a sweep, but just tells the agent that this sweep is a sweep which should be continued - is that correct?

So to simplify, all I need to resume a sweep is the following:
In command line:
wandb sweep --resume [existing_sweep_id]

Then in code, we replace the existing line
wandb.agent(new_sweep_id, ...)
with
wandb.agent(existing_sweep_id, ...)

If that’s just it then it all is a lot easier than I had thought!

Always happy to help!
That’s exactly what we need to do! :slight_smile:

wandb sweep --resume [sweep_id]
You are right! The command above also makes sure that the sweep is currently in the Running state in the UI. In my experience, a sweep in the Running state has the fewest problems being resumed, compared to the Killed or Crashed statuses.