Wandb sweep not working

I have been using wandb sweeps for a long time, but I am getting a sweep bug now that I have not seen before.
This is my yaml file:

# Program to run
program: main_minatar.py

# Project name
project: jax_meta

# Method
method: grid

# Metric to optimize
metric:
  name: average_reward
  goal: maximize

# Hyperparameters
parameters:
  ENV_NAME:
    values: ["Breakout-MinAtar", "SpaceInvaders-MinAtar", "Freeway-MinAtar", "Asterix-MinAtar"]
  ACTIVATIONS:
    values: ["11111111_+00", "01010101_+00", "02020202_+00", "03030303_+00", "04040404_+00",
             "14141414_+00", "04040404_+02", "14141414_+02", "04040404_+05", "14141414_+05",
             "04040404_+10", "14141414_+10", "04040404_-02", "14141414_-02", "01040104_+05",
             "01140114_+05", "01040101_+05", "01010104_+05", "01040104_+02", "01140114_+02",
             "01040101_+02", "01010104_+02", "14141414_+15", "14141414_-10", "01010404_+05",
             "04010401_+05", "01010404_+02", "04010401_+02"]
  TOTAL_TIMESTEPS:
    values: [1e7, 2e7]
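For reference, a grid sweep is expected to visit each combination of parameter values once. The sketch below is plain Python and not part of the original script. It enumerates the combinations implied by the config above to show how many distinct runs the sweep should produce; the variable names are illustrative.

```python
from itertools import product

# Parameter values copied from the sweep config above.
env_names = ["Breakout-MinAtar", "SpaceInvaders-MinAtar",
             "Freeway-MinAtar", "Asterix-MinAtar"]
activations = [
    "11111111_+00", "01010101_+00", "02020202_+00", "03030303_+00", "04040404_+00",
    "14141414_+00", "04040404_+02", "14141414_+02", "04040404_+05", "14141414_+05",
    "04040404_+10", "14141414_+10", "04040404_-02", "14141414_-02", "01040104_+05",
    "01140114_+05", "01040101_+05", "01010104_+05", "01040104_+02", "01140114_+02",
    "01040101_+02", "01010104_+02", "14141414_+15", "14141414_-10", "01010404_+05",
    "04010401_+05", "01010404_+02", "04010401_+02",
]
total_timesteps = [1e7, 2e7]

# Cartesian product of all parameter values: one tuple per expected run.
grid = list(product(env_names, activations, total_timesteps))
print(len(grid))  # 4 * 28 * 2 = 224
```

With 224 distinct combinations, no (ENV_NAME, ACTIVATIONS, TOTAL_TIMESTEPS) triple should appear more than once in a completed grid sweep.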

For some reason, it keeps sweeping only the first 2 activation values, "11111111_+00" and "01010101_+00", over and over.
For example, it is now running TOTAL_TIMESTEPS: 1e7, Freeway-MinAtar, "11111111_+00" for the 4th time already.

The logging is done here after training a RL agent:

data = outs["metrics"]["returned_episode_returns"][0].mean(0).mean(-1).reshape(-1)
chunk_size = 500  # Or 1000, depending on your preference

# Calculate number of chunks
num_chunks = len(data) // chunk_size
time_per_chunk = args.TOTAL_TIMESTEPS / num_chunks

# Logging
for i in range(num_chunks + 1):
    start_idx = i * chunk_size
    end_idx = start_idx + chunk_size
    chunk = data[start_idx:end_idx]

    # Compute summary statistics for the chunk
    mean = np.mean(chunk)
    std = np.std(chunk)
    min_val = np.min(chunk)
    max_val = np.max(chunk)

    # Log summary statistics to wandb
    wandb.log({
        "returns_mean": mean,
        "returns_std" : std,
        "global_step": i*time_per_chunk
    })
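A side note on the logging loop, unrelated to the sweep scheduling itself: because it iterates over `num_chunks + 1` chunks, the last chunk is empty whenever `len(data)` is an exact multiple of `chunk_size`, and `np.min` raises on an empty array. Below is a minimal sketch of the same chunked summary that skips empty chunks. It uses only the standard library so it runs on its own. The helper name `chunk_summaries` is hypothetical, and it uses ceiling division for the chunk count, which changes the `global_step` spacing slightly compared with the original.

```python
from statistics import fmean, pstdev


def chunk_summaries(data, chunk_size, total_timesteps):
    """Return one summary dict per non-empty chunk of `data`."""
    # Ceiling division, so a partial trailing chunk is kept but an empty one is not.
    num_chunks = -(-len(data) // chunk_size)
    if num_chunks == 0:
        return []
    time_per_chunk = total_timesteps / num_chunks
    summaries = []
    for i in range(num_chunks):
        chunk = data[i * chunk_size:(i + 1) * chunk_size]
        summaries.append({
            "returns_mean": fmean(chunk),
            "returns_std": pstdev(chunk),  # population std, like np.std's default
            "global_step": i * time_per_chunk,
        })
    return summaries


# Toy data standing in for the per-step returns.
for summary in chunk_summaries(list(range(12)), chunk_size=5, total_timesteps=300):
    print(summary)  # in the training script this would be wandb.log(summary)
```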

Does anyone know what could be the problem? (Ubuntu 22.04, Wandb 0.16.4)

Update: It is still running, sweeping only the first two entries of ACTIVATIONS over and over without ever moving on.
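One way to confirm which combinations are being repeated is to count how often each configuration occurs across the sweep's runs. The sketch below works on a plain list of config dicts. The helper name `find_repeated_configs` and the sample data are hypothetical; in practice the configs could be collected from each run's `config` via the W&B public API, which is not shown here.

```python
from collections import Counter

SWEEP_KEYS = ("ENV_NAME", "ACTIVATIONS", "TOTAL_TIMESTEPS")


def find_repeated_configs(configs, keys=SWEEP_KEYS):
    """Map each parameter combination that occurs more than once to its count."""
    counts = Counter(tuple(cfg.get(k) for k in keys) for cfg in configs)
    return {combo: n for combo, n in counts.items() if n > 1}


# Hypothetical run configs mimicking the symptom described above.
configs = [
    {"ENV_NAME": "Freeway-MinAtar", "ACTIVATIONS": "11111111_+00", "TOTAL_TIMESTEPS": 1e7},
    {"ENV_NAME": "Freeway-MinAtar", "ACTIVATIONS": "11111111_+00", "TOTAL_TIMESTEPS": 1e7},
    {"ENV_NAME": "Freeway-MinAtar", "ACTIVATIONS": "11111111_+00", "TOTAL_TIMESTEPS": 1e7},
    {"ENV_NAME": "Freeway-MinAtar", "ACTIVATIONS": "11111111_+00", "TOTAL_TIMESTEPS": 1e7},
    {"ENV_NAME": "Freeway-MinAtar", "ACTIVATIONS": "01010101_+00", "TOTAL_TIMESTEPS": 1e7},
]

repeated = find_repeated_configs(configs)
for combo, n in repeated.items():
    print(combo, "ran", n, "times")
```

For a healthy grid sweep the returned dict should be empty.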

Hi @jkooi23, could you possibly send me a link to the sweep and I’ll take a look?

That certainly looks like unexpected behavior since you are using a grid search.

Thank you,
Nate

Hi @nathank,

Thanks for taking the time. I deleted that old sweep, but I just created a new one to reproduce the problem: It is still running certain activations over and over.

Please let me know if you need more information.

Hi @jkooi23, sorry for the delay on this. I’ve gone ahead and reported this to the engineering team, since your sweep config looks correct and the sweep shouldn’t be running only those first 3 activation values. I’ll follow up once I have an update from the team.

I think I’m having the same issue when running a sweep via W&B. For me, W&B starts the same run multiple times. The repeats are then also logged to the same run on W&B, so the logs are a mix of the previous and the current run. See this screenshot, where the relative wall time suddenly becomes negative.
Screenshot from 2024-04-18 09-47-54
I used W&B 0.16.3

Hi @claushofmann, I wanted to inform you that this is now fixed and will be released with the next version of our SDK. In the event you still encounter any issues, please let us know.