I ran a sweep where some possible hyperparameter values were lists. For example, hyperparameter "A" had possible values of [], ["1"], ["2"], or ["1", "2"]. However, these list hyperparameters do not appear in the parameter importance panel. All other hyperparameters are displayed properly.
How can I get these list hyperparameters to appear in the parameter importance panel?
Hi @chiggins! Thank you for reaching out to us. Let me review this for you and I will get back to you with an update.
Hi @chiggins, good day. If you have set up those parameters in your sweep configuration, then they should be available in your importance panel. So that I can assist you further, could you share the following information to help me review your sweep:
- A sample or Toy Code snippet for your sweep
- A screen recording that shows your importance panel with missing hyperparameters
Hi @paulo-sabile
Here is a sample script to reproduce the problem.
main.py
import sys

import wandb


def main():
    wandb.init()
    cfg = wandb.config
    wandb.log({"metric": -cfg["hp1"] + cfg["hp2"] + sum(cfg["hp3"])})
    wandb.finish()


if __name__ == "__main__":
    wandb.agent(sys.argv[1], function=main, project=sys.argv[2])
config.yaml
program: main.py
project: bug
name: bug-sweep
method: grid
parameters:
  hp1:
    values: [0, -2, -4, -6, -8, -10]
  hp2:
    values: [10, 30, 50, 70]
  hp3:
    values:
      - []
      - [5]
      - [15]
      - [5, 15]
Here are the steps to reproduce the problem:
- Log in to WandB.
- Create a new project called “bug”.
- Create a new sweep in the project “bug”.
- Run python3 main.py SWEEP-ID bug.
- After the sweep completes, create a parameter importance panel.
The parameter importance panel will only display the hyperparameters hp1 and hp2 but not hp3. However, it should display all three hyperparameters. The hp3 hyperparameter should be treated as a categorical parameter by the parameter importance panel.
Here is a link to the “bug” project I created according to the above procedure.
Here is a screenshot of the parameter importance panel on the project. All three hyperparameters are marked visible but only hp1 and hp2 are actually visible.
Sorry about the single image but I am only allowed to upload one. You can see the actually visible parameter importance values in the background.
Hopefully, this is helpful. Please let me know if I can provide additional information.
Hi @paulo-sabile
Here is a script demonstrating the expected behavior.
main.py
import sys

import wandb


def main():
    wandb.init()
    cfg = wandb.config
    wandb.log({"metric": -cfg["hp1"] + cfg["hp2"] + sum(eval(cfg["hp3"]))})
    wandb.finish()


if __name__ == "__main__":
    wandb.agent(sys.argv[1], function=main, project=sys.argv[2])
config.yaml
program: main.py
project: exp
name: exp-sweep
method: grid
parameters:
  hp1:
    values: [0, -2, -4, -6, -8, -10]
  hp2:
    values: [10, 30, 50, 70]
  hp3:
    values:
      - "[]"
      - "[5]"
      - "[15]"
      - "[5, 15]"
Running this new script is similar to the prior script. Here is a link to the project.
Here is a screenshot of the parameter importance panel.
I believe these two examples should produce the same parameter importance panels.
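For anyone adapting the string-encoded variant, the eval call can be swapped for ast.literal_eval from the standard library, which only parses Python literals and does not execute code. The following is a minimal sketch and not part of the original scripts; the helper names decode_list_param and compute_metric are hypothetical.

```python
import ast


def decode_list_param(raw):
    """Turn a string such as "[5, 15]" back into a Python list.

    Hypothetical helper: literal_eval accepts literals only, so an
    arbitrary expression in the sweep config raises instead of running.
    """
    value = ast.literal_eval(raw)
    if not isinstance(value, list):
        raise ValueError(f"expected a list literal, got {raw!r}")
    return value


def compute_metric(hp1, hp2, hp3_raw):
    # Same formula as the sample scripts: -hp1 + hp2 + sum(hp3)
    return -hp1 + hp2 + sum(decode_list_param(hp3_raw))
```

Inside main(), the metric line would then read compute_metric(cfg["hp1"], cfg["hp2"], cfg["hp3"]) in place of the eval expression.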
Thank you @chiggins. I was able to reproduce the same results that you are getting. Let me investigate this further and discuss it with our team, and I will get back to you with an update.
Hi @chiggins. Good day! Upon reviewing this, it seems that the parameter is hidden because hp3 is summed before the metric is logged to wandb. As a test, I logged it without summing hp3, and the parameter did appear in the importance panel.
What I did was change sum(eval(cfg["hp3"])) to cfg["hp3"].
We will further investigate why wandb is producing a different output on the panel when a metric is being “summarized”. But may I also ask your use-case for your experiment? Like why do you need to sum “hp3” when logging it to wandb?
Hi @paulo-sabile. Thank you for taking a look into this! I really appreciate it!
I am only computing sum(eval(cfg["hp3"])) to provide a minimal reproducible example. It is not representative of my actual experiment. If I logged a random metric, the importance values would be meaningless. Instead, I compute the metric as a function of the hyperparameters (i.e. metric = -cfg["hp1"] + cfg["hp2"] + sum(eval(cfg["hp3"]))) such that the expected importance and correlation values are obvious. Since the expected values are known, the functionality of the parameter importance panel can be verified.
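To make "the expected values are known" concrete, here is a rough sketch that enumerates the full grid from the sweep configuration and computes a plain Pearson correlation between each hyperparameter and the synthetic metric. Two assumptions are baked in: hp3 is represented by the sum of its elements, and Pearson correlation is used as a stand-in, which is not necessarily how the panel computes its values.

```python
import itertools
import math

# Grid values copied from the sweep configuration.
HP1 = [0, -2, -4, -6, -8, -10]
HP2 = [10, 30, 50, 70]
HP3 = [[], [5], [15], [5, 15]]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Every combination a grid sweep would visit, with the synthetic metric.
grid = list(itertools.product(HP1, HP2, HP3))
metric = [-h1 + h2 + sum(h3) for h1, h2, h3 in grid]

corr = {
    "hp1": pearson([h1 for h1, _, _ in grid], metric),
    "hp2": pearson([h2 for _, h2, _ in grid], metric),
    # Assumption: a list value is summarized by the sum of its elements.
    "sum(hp3)": pearson([sum(h3) for _, _, h3 in grid], metric),
}

for name, value in corr.items():
    print(f"{name}: {value:+.3f}")
```

Under these assumptions hp2 correlates most strongly with the metric, hp3 less so, and hp1 is weakly and negatively correlated, which gives a reference point for checking what the panel displays.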
In a real experiment, hp3 might be the hidden layer sizes of an MLP. Although these hidden layer sizes are not explicitly aggregated, they are implicitly aggregated to compute the metric.
model = MLP(hidden_sizes=cfg["hidden_sizes"])
model.train(Xtr, ytr)
wandb.log({"metric": model.evaluate(Xte, yte)})
From this perspective, every hyperparameter is “summarized” as the metric is always computed from the hyperparameters.
I'm not sure that changing sum(eval(cfg["hp3"])) to cfg["hp3"] solves the problem. There is a type mismatch so the program crashes when computing the metric. This is because cfg["hp1"] and cfg["hp2"] are type int and cfg["hp3"] is type str. Even if this worked, it would not resolve the underlying problem. For example, an MLP sweep like I described above would still not work properly with the hyperparameter importance panel.
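The type mismatch can be reproduced without wandb. This is a small sketch that assumes the config values behave like a plain Python int and str; the function name naive_metric is hypothetical.

```python
# Stand-ins for the sweep config values in the string-encoded variant.
hp1, hp2 = -2, 10   # ints, as in the sweep configuration
hp3 = "[5, 15]"     # str, because the list is quoted in config.yaml


def naive_metric(hp1, hp2, hp3):
    # The suggested change: use hp3 directly instead of sum(eval(hp3)).
    return -hp1 + hp2 + hp3


try:
    naive_metric(hp1, hp2, hp3)
    crashed = False
except TypeError as exc:
    # Adding an int and a str is not defined in Python.
    crashed = True
    print(f"TypeError: {exc}")
```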
Hopefully, this is helpful. Let me know if I can provide any additional information!
@chiggins Thank you so much for sharing this detailed explanation! We will investigate this further and I will get back to you with an update.
Hi @chiggins
Thank you again for raising this with us. This is now logged with our product team for a feature request review.
Have a great weekend!