I have been using wandb sweeps for a long time, but I am getting a sweep bug now that I have not seen before.
This is my yaml file:
    # Program to run
    program: main_minatar.py
    # project name
    project: jax_meta
    # Method
    method: grid
    # metric to optimize
    metric:
      name: average_reward
      goal: maximize
    # Hyperparameters
    parameters:
      ENV_NAME:
        values: ["Breakout-MinAtar", "SpaceInvaders-MinAtar", "Freeway-MinAtar", "Asterix-MinAtar"]
      ACTIVATIONS:
        values: ["11111111_+00", "01010101_+00", "02020202_+00", "03030303_+00", "04040404_+00",
                 "14141414_+00", "04040404_+02", "14141414_+02", "04040404_+05", "14141414_+05",
                 "04040404_+10", "14141414_+10", "04040404_-02", "14141414_-02", "01040104_+05",
                 "01140114_+05", "01040101_+05", "01010104_+05", "01040104_+02", "01140114_+02",
                 "01040101_+02", "01010104_+02", "14141414_+15", "14141414_-10", "01010404_+05",
                 "04010401_+05", "01010404_+02", "04010401_+02"]
      TOTAL_TIMESTEPS:
        values: [1e7, 2e7]
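For reference, here is a minimal sketch of what a full grid over these three parameters should contain. It only enumerates the Cartesian product with `itertools.product`; the helper name `expand_grid` is made up for illustration, and neither the ordering nor the mechanism is claimed to match what wandb does internally.

```python
from itertools import product

# Values copied from the sweep config above.
ENV_NAMES = ["Breakout-MinAtar", "SpaceInvaders-MinAtar", "Freeway-MinAtar", "Asterix-MinAtar"]
ACTIVATIONS = [
    "11111111_+00", "01010101_+00", "02020202_+00", "03030303_+00", "04040404_+00",
    "14141414_+00", "04040404_+02", "14141414_+02", "04040404_+05", "14141414_+05",
    "04040404_+10", "14141414_+10", "04040404_-02", "14141414_-02", "01040104_+05",
    "01140114_+05", "01040101_+05", "01010104_+05", "01040104_+02", "01140114_+02",
    "01040101_+02", "01010104_+02", "14141414_+15", "14141414_-10", "01010404_+05",
    "04010401_+05", "01010404_+02", "04010401_+02",
]
TOTAL_TIMESTEPS = [1e7, 2e7]


def expand_grid(parameters):
    """Return every combination of the parameter values as a list of dicts.

    Hypothetical helper: a plain Cartesian product, not wandb's implementation.
    """
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


grid = expand_grid({
    "ENV_NAME": ENV_NAMES,
    "ACTIVATIONS": ACTIVATIONS,
    "TOTAL_TIMESTEPS": TOTAL_TIMESTEPS,
})
print(len(grid))  # 4 environments * 28 activations * 2 timestep settings = 224
```

So a complete grid sweep should consist of 224 distinct runs, each combination appearing exactly once.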
For some reason, the sweep keeps running only the first two ACTIVATIONS values, "11111111_+00" and "01010101_+00", over and over.
For example, it is now running the combination TOTAL_TIMESTEPS: 1e7, ENV_NAME: Freeway-MinAtar, ACTIVATIONS: "11111111_+00" for the fourth time.
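To quantify the repetition, one option is to count how often each parameter combination appears among the sweep's runs. The sketch below assumes the run configs are already available as a list of dicts (for example, exported from the sweep's run table); the function name `count_repeats` and the example data are hypothetical and only mirror the situation described above.

```python
from collections import Counter


def count_repeats(configs, keys=("ENV_NAME", "ACTIVATIONS", "TOTAL_TIMESTEPS")):
    """Return {combination: count} for every combination that ran more than once.

    Hypothetical helper: `configs` is a list of dicts, one per run.
    """
    counts = Counter(tuple(cfg.get(key) for key in keys) for cfg in configs)
    return {combo: n for combo, n in counts.items() if n > 1}


# Made-up example data: one combination run four times, another run once.
example_configs = [
    {"ENV_NAME": "Freeway-MinAtar", "ACTIVATIONS": "11111111_+00", "TOTAL_TIMESTEPS": 1e7},
    {"ENV_NAME": "Freeway-MinAtar", "ACTIVATIONS": "11111111_+00", "TOTAL_TIMESTEPS": 1e7},
    {"ENV_NAME": "Freeway-MinAtar", "ACTIVATIONS": "01010101_+00", "TOTAL_TIMESTEPS": 1e7},
    {"ENV_NAME": "Freeway-MinAtar", "ACTIVATIONS": "11111111_+00", "TOTAL_TIMESTEPS": 1e7},
    {"ENV_NAME": "Freeway-MinAtar", "ACTIVATIONS": "11111111_+00", "TOTAL_TIMESTEPS": 1e7},
]

repeats = count_repeats(example_configs)
for combo, n in repeats.items():
    print(combo, "ran", n, "times")
```

In a correctly behaving grid sweep, the returned dictionary would be empty.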
The logging is done here, after training an RL agent:
    data = outs["metrics"]["returned_episode_returns"][0].mean(0).mean(-1).reshape(-1)
    chunk_size = 500  # Or 1000, depending on your preference

    # Calculate number of chunks
    num_chunks = len(data) // chunk_size
    time_per_chunk = args.TOTAL_TIMESTEPS / num_chunks

    # Logging
    for i in range(num_chunks + 1):
        start_idx = i * chunk_size
        end_idx = start_idx + chunk_size
        chunk = data[start_idx:end_idx]

        # Compute summary statistics for the chunk
        mean = np.mean(chunk)
        std = np.std(chunk)
        min_val = np.min(chunk)
        max_val = np.max(chunk)

        # Log summary statistics to wandb
        wandb.log({
            "returns_mean": mean,
            "returns_std": std,
            "global_step": i * time_per_chunk
        })
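A side note on the loop itself, which may or may not be related to the sweep problem: because it iterates over `range(num_chunks + 1)`, the final slice is empty whenever `len(data)` is an exact multiple of `chunk_size`. With NumPy semantics, `np.min` and `np.max` raise a `ValueError` on an empty array, so the run could fail at the very end. The sketch below uses plain lists and two hypothetical helper names to show the slicing behavior and one way to avoid the empty chunk.

```python
def split_like_post(data, chunk_size):
    """Split data the way the logging loop does, with num_chunks + 1 iterations."""
    num_chunks = len(data) // chunk_size
    return [data[i * chunk_size:(i + 1) * chunk_size] for i in range(num_chunks + 1)]


def split_non_empty(data, chunk_size):
    """Alternative split that stops at the end of the data and never yields an empty chunk."""
    return [data[start:start + chunk_size] for start in range(0, len(data), chunk_size)]


exact = list(range(1000))    # length is an exact multiple of 500
ragged = list(range(1200))   # length leaves a partial chunk of 200

print([len(c) for c in split_like_post(exact, 500)])   # [500, 500, 0]
print([len(c) for c in split_non_empty(exact, 500)])   # [500, 500]
print([len(c) for c in split_like_post(ragged, 500)])  # [500, 500, 200]
print([len(c) for c in split_non_empty(ragged, 500)])  # [500, 500, 200]
```

Both versions agree when there is a partial final chunk; they differ only in the exact-multiple case, where the original loop produces a trailing empty slice.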
Does anyone know what could be the problem? (Ubuntu 22.04, Wandb 0.16.4)
Update: The sweep is still running and keeps repeating only the first two entries of ACTIVATIONS indefinitely.