wandb Sweeps and PyTorch LightningCLI

Hi!
I use PyTorch LightningCLI with YAML configuration files for my training runs. Here is a simple example of the model configuration I use (more information on Lightning's YAML configuration: Configure hyperparameters from the CLI (Advanced) — PyTorch Lightning 2.4.0 documentation):

LightningCLI YAML model configuration
model:
  class_path: models.zoo.LitGCN
  init_args:
    hidden_channels: 128
    gcn_num_layer: 4
    mlp_num_layer: 2
    dropout: 0.2
    learning_rate: 0.001

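For context, `class_path` and `init_args` map directly onto the constructor of the LightningModule. The `LitGCN` class itself is not shown in the post, so the stand-in below is an assumption about its signature, just to illustrate how LightningCLI turns the YAML into an instance:

```python
# Hypothetical stand-in for models.zoo.LitGCN (the real class would
# subclass pytorch_lightning.LightningModule); it only mirrors the
# init_args keys from the YAML above.
class LitGCN:
    def __init__(self, hidden_channels: int, gcn_num_layer: int,
                 mlp_num_layer: int, dropout: float, learning_rate: float):
        # Each key under init_args arrives here as a keyword argument.
        self.hidden_channels = hidden_channels
        self.gcn_num_layer = gcn_num_layer
        self.mlp_num_layer = mlp_num_layer
        self.dropout = dropout
        self.learning_rate = learning_rate

# Python equivalent of the YAML: class_path selects the class,
# init_args supplies the constructor keywords.
config = {
    "class_path": "models.zoo.LitGCN",
    "init_args": {
        "hidden_channels": 128,
        "gcn_num_layer": 4,
        "mlp_num_layer": 2,
        "dropout": 0.2,
        "learning_rate": 0.001,
    },
}
model = LitGCN(**config["init_args"])
```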
I'd like to use wandb Sweeps to find the best hyperparameters for my model, but so far the tests I've run haven't worked. Here's an example of a sweep configuration I've tested:

Sweep Configuration
program: training.py
method: grid
metric:
  goal: minimize
  name: val/loss.min
parameters:
  seed_everything:
    distribution: int_uniform
    max: 10
    min: 0
  model:
    class_path:
      distribution: categorical
      values:
        - models.zoo.LitGCN
    init_args:
      hidden_channels:
        distribution: q_log_uniform
        min: 32
        max: 256
        q: 2
      gcn_num_layer:
        distribution: int_uniform
        min: 1
        max: 8
      mlp_num_layer:
        distribution: int_uniform
        min: 1
        max: 8
      dropout:
        distribution: q_uniform
        min: 0.0
        max: 0.5
        q: 0.1
      learning_rate:
        distribution: q_log_uniform
        min: 1e-4
        max: 0.1
        q: 10

command:
  - .venv/bin/python
  - ${program}
  - fit
  - -c=configs/data/sulcalgraphs_32_all-features.yaml
  - -c=configs/trainers/usual.yaml
  - -c=configs/lr_scheduler/reduce_on_plateau.yaml
  - ${args}

Do you have any tips or recommendations for getting Sweeps to work with PyTorch Lightning in this way?

Thank you in advance for your reply and your help!


I found the solution here: Multi-level nesting in yaml for sweeps

I used ${args_json_file} as the argument macro, with the nested model keys flattened into dotted parameter names.

The resulting sweep file looks like this:

program: training.py
method: grid
metric:
  goal: minimize
  name: val/loss.min
parameters:
  seed_everything:
    distribution: int_uniform
    max: 10
    min: 0
  model.class_path:
    value: models.zoo.LitGCN
  model.init_args.hidden_channels:
    values: [32, 64, 128, 256]
  model.init_args.gcn_num_layer:
    distribution: int_uniform
    min: 1
    max: 8
  model.init_args.mlp_num_layer:
    distribution: int_uniform
    min: 1
    max: 8
  model.init_args.dropout:
    values: [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
  model.init_args.learning_rate:
    values: [0.0001, 0.001, 0.01, 0.1]

command:
  - .venv/bin/python
  - ${program}
  - fit
  - -c
  - ${args_json_file}
  - -c=configs/data/sulcalgraphs_32_all-features.yaml
  - -c=configs/trainers/usual.yaml
  - -c=configs/lr_scheduler/reduce_on_plateau.yaml
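To sketch why this works: with ${args_json_file}, wandb writes the run's parameters (under their flat, dot-separated names) to a temporary JSON file and substitutes that file's path into the command, and jsonargparse, which backs LightningCLI, resolves the dotted keys back into the nested config. The expansion is roughly equivalent to this (parameter values below are illustrative, not from an actual run):

```python
# Hedged sketch: expanding flat dotted sweep keys into the nested
# structure that the LightningCLI YAML uses.
def nest(flat: dict) -> dict:
    nested: dict = {}
    for dotted_key, value in flat.items():
        node = nested
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            # Descend, creating intermediate dicts as needed.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested

# Shape of the parameters a run of this sweep would produce
# (example values only):
flat_params = {
    "seed_everything": 7,
    "model.class_path": "models.zoo.LitGCN",
    "model.init_args.hidden_channels": 64,
    "model.init_args.dropout": 0.2,
}
nested = nest(flat_params)
# nested["model"]["init_args"] == {"hidden_channels": 64, "dropout": 0.2}
```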