"None" is not a permitted value of the categorical hyperparameter in the current sweep

I am receiving this (new?) error when running a Bayesian sweep:

400 response executing GraphQL.
{"errors":[{"message":"None is not a permitted value of the categorical hyperparameter algo.train_config.loss.weight in the current sweep.","path":["agentHeartbeat"]}],"data":{"agentHeartbeat":null}}
wandb: ERROR Error while calling W&B API: None is not a permitted value of the categorical hyperparameter algo.train_config.loss.weight in the current sweep. (<Response [400]>)

As far as I can tell, this sweep has successfully launched runs with algo.train_config.loss.weight=None previously. I can link to them if desired.

Here is the config of the sweep:

command:
  - /home/ahalev/miniconda3/envs/eye-image-env/bin/python
  - ${program}
  - ${args}
method: bayes
metric:
  goal: maximize
  name: Evaluate/EMA/balanced_accuracy_T-T_best
  target: 1
parameters:
  algo.model.architecture:
    distribution: categorical
    values:
      - inception_v3
      - vit_base_patch16_224
      - inception_resnet_v2
      - retfound_vit
  algo.train_config.loss.weight:
    distribution: categorical
    values:
      - null
      - reciprocal
      - reciprocal_squared
  algo.train_config.optimizer.lr:
    distribution: log_uniform_values
    max: 0.01
    min: 1e-06
  algo.train_config.optimizer.opt:
    distribution: categorical
    values:
      - sgd
      - adamw
  algo.train_config.optimizer.weight_decay:
    distribution: log_uniform_values
    max: 0.001
    min: 1e-09
  algo.train_config.scheduler.type:
    distribution: categorical
    values:
      - cosine
      - cyclic_triangular
      - cyclic_triangular2
      - cyclic_exp_range
  algo.train_config.train_layers:
    distribution: categorical
    values:
      - all
      - 2
      - 0.5
  dataset.images.train.eye:
    value: both
  dataset.images.train.side:
    value: both
  dataset.retinal_genotype.gene:
    value: arms2
  preprocess.add_gaussian_noise_sigma:
    distribution: uniform
    max: 1
    min: 0
  preprocess.gaussian_laplace_sigma:
    distribution: uniform
    max: 1
    min: 0
  preprocess.random_flip_probability.horizontal:
    distribution: categorical
    values:
      - 0
      - 0.5
  preprocess.random_flip_probability.vertical:
    distribution: categorical
    values:
      - 0
      - 0.5
  preprocess.resize_crop.ratio:
    distribution: categorical
    values:
      - - 0
      - - 0.75
        - 1.33
  preprocess.resize_crop.scale:
    distribution: categorical
    values:
      - - 0
      - - 0.6
        - 1
  project:
    value: retinal_genotype
program: ../trainer.py
project: retinal_genotype
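
A possible workaround, offered only as a sketch and not confirmed anywhere in this thread: list a string sentinel such as `none` in place of `null` in the categorical values, and convert it back to `None` inside the training script. The sentinel value and the helper name `decode_sentinels` are hypothetical and not part of the W&B API. Whether this avoids the 400 error has not been tested.

```python
from typing import Any, Dict, Mapping

# Hypothetical sentinel: the sweep config would list "none" instead of null.
NONE_SENTINEL = "none"


def decode_sentinels(params: Mapping[str, Any], sentinel: str = NONE_SENTINEL) -> Dict[str, Any]:
    """Return a copy of params with the sentinel string replaced by None."""
    return {
        key: (None if isinstance(value, str) and value == sentinel else value)
        for key, value in params.items()
    }


if __name__ == "__main__":
    # Stand-in for the hyperparameters a sweep agent would pass to trainer.py.
    sampled = {
        "algo.train_config.loss.weight": "none",
        "algo.train_config.optimizer.opt": "adamw",
    }
    print(decode_sentinels(sampled))
```

Only exact string matches are converted, so numeric values such as `0` in other categorical parameters pass through unchanged.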

Any assistance would be appreciated.

Log files:

https://github.com/wandb/wandb/files/14015971/debug.log
https://github.com/wandb/wandb/files/14015970/debug-internal.log

hey @ahalev - few questions to help me dig into this:

  • I tried to reproduce this behavior on my end but have been unable to on both SDK version 0.16.0 and 0.16.2. would it be possible to send me a code snippet of your training script so I could try to mimic the behavior you’re seeing?
  • you mentioned that this sweep worked previously - do you still encounter this behavior when you run this in a fresh environment?
  • i took a look at the debug logs and they look clean - if you are able to set the WANDB_DEBUG env var to True and re-run the experiment, this would give me more verbose debug logs to investigate further
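
for the last point, here is a minimal sketch of one way to set the variable when launching the agent from Python. the helper name `with_wandb_debug` is hypothetical, and the sweep id in the comment is a placeholder you would fill in yourself; exporting the variable in your shell before running the agent should work equally well.

```python
import os
from typing import Dict, Mapping, Optional


def with_wandb_debug(base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with WANDB_DEBUG set to "True"."""
    env = dict(os.environ if base_env is None else base_env)
    env["WANDB_DEBUG"] = "True"
    return env


# Example launch (not executed here; replace <sweep-id> with your own):
# import subprocess
# subprocess.run(["wandb", "agent", "<sweep-id>"], env=with_wandb_debug())
```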

thanks a ton!

Hi @ahalev, we wanted to follow up with you regarding your request as we have not heard back from you. Please let us know if we can be of further assistance or if your issue has been resolved.

hey @ahalev - since we have not heard back from you we are going to close this thread. If you would like to re-open the conversation, please let us know!