Hyperparameter tuning combined with k-fold cross validation

I found this official example showcasing an implementation of k-fold cross validation using Sweeps. However, I am doing hyperparameter tuning with sweeps, so I am coming from a different angle: I want to do k-fold CV for one given set of parameters for each sweep run. I would imagine that there should be sub-groups for each CV fold in the sweep view of the web interface.

Is this possible to do with Wandb or should I look elsewhere?

Thank you!

EDIT: the rationale behind this is to prevent the hyperparameter optimization from overfitting the test set. If you have another means to reach this goal, I am open to it.


Thank you for contacting us! Yes, it is possible to perform k-fold cross-validation for a given set of hyperparameters with Wandb Sweeps. In fact, the example you found is a good starting point for implementing k-fold cross-validation in your own Sweep runs.

Thank you for your reply. Maybe I misunderstood, but as I understand it, the example I found does not perform hyperparameter tuning at all and instead just performs k-fold CV using Sweep runs. That is, the example uses one sweep run for each fold, resulting in a total of k runs. Can you confirm that?

Additionally, the example uses multiprocessing for some reason while at the same time joining each of the created processes immediately (see here). In my understanding, that means each process runs after the previous one has finished (no parallelism), so the use of multiprocessing seems to be due to other, non-obvious reasons.
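To make the sequencing point concrete, here is a standalone sketch (not taken from the example itself) of what starting and immediately joining a process inside a loop does:

import multiprocessing
import time

def train_fold(fold):
    time.sleep(0.2)  # stand-in for training one fold

if __name__ == "__main__":
    start = time.monotonic()
    for fold in range(3):
        p = multiprocessing.Process(target=train_fold, args=(fold,))
        p.start()
        p.join()  # blocks here, so the next fold only starts after this one finishes
    elapsed = time.monotonic() - start
    print(elapsed >= 0.6)  # True: the three folds ran sequentially, ~3 * 0.2 s

So the child processes run strictly one after another; the loop gains nothing in terms of parallelism over plain function calls.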

The problem for me seems to be that with a given set of hyperparameters, each call to wandb.init() refers to the very same run internally. So if I want to loop over the folds using the same hyperparameters, Wandb ends up overwriting the previous runs/folds every time.

Here is a minimal working example:

import wandb
import wandb.sdk
import randomname
import numpy as np
from sklearn.model_selection import KFold

SWEEP_CONFIG = {
    "method": "random",
    "name": "my_config",
    "metric": {"goal": "minimize", "name": "val_root_mean_squared_error"},
    "parameters": {
        "param1": {"values": [8, 16, 32]},
        "param2": {"values": [1, 2, 4]},
    },
}

class Experiment:
    def __init__(self) -> None:
        self.x_train = np.random.random((2048, 3, 1))
        self.y_train = np.random.random((2048, 1))

    def train(self) -> None:
        kf = KFold(n_splits=4, shuffle=True)
        cv_name = randomname.get_name()
        for fold, (ix_train, ix_val) in enumerate(kf.split(self.x_train)):
            x_fold_train, y_fold_train = self.x_train[ix_train], self.y_train[ix_train]
            x_fold_val, y_fold_val = self.x_train[ix_val], self.y_train[ix_val]

            run_name = f"{cv_name}-{fold:02}"
            run = wandb.init(group=f"cv_{cv_name}", name=run_name, reinit=True)
            assert run is not None
            assert type(run) is wandb.sdk.wandb_run.Run
            wandb.summary["cv_fold"] = fold
            wandb.summary["num_cv_folds"] = kf.n_splits
            wandb.summary["cv_random_state"] = kf.random_state

            param1 = wandb.config.param1
            param2 = wandb.config.param2
            # random result for MWE
            rmse = param1 * np.mean(y_fold_train) + param2 * np.mean(y_fold_val)
            score = rmse
            wandb.log({"val_root_mean_squared_error": score})

if __name__ == "__main__":
    exp = Experiment()

    sweep_id = wandb.sweep(sweep=SWEEP_CONFIG, project="my_proj")
    wandb.agent(
        sweep_id,
        function=exp.train,
        # count=40,
    )

Could you point out what needs to change in this example for it to work?

EDIT: Obviously I could just NOT log the training to Wandb and instead only return the average result score for all folds - however, this is not what I want. I want to be able to compare the loss graphs of different folds etc.

I just want to make sure I understand correctly: even though you said that this should be possible, I have shown with my MWE that it does not work. In my understanding this means that you misunderstood what I meant (or didn’t know), and it is indeed not possible to perform hyperparameter tuning combined with k-fold CV in W&B, so I will have to look elsewhere. Could you confirm this? Please let me know.


Thank you for your patience! Please give me a couple of minutes to get back to you.

I’m also looking for ways to do this. As @mbp wrote, it would be nice to have metric curves per run and then be able to group these per fold. Right now, if you just create a new run for every fold, it gets overwritten by the next sweep configuration.


Hey @MBP,
Thank you very much for your patience!
I mentioned that the example you found was a good starting point for implementing k-fold cross validation in your sweep runs. Meaning it was a good starting point to add the sweep configurations to launch the agents.
The example you provided does run one sweep run for each fold.
It seems I didn’t fully understand what you wanted. Are you trying to parallelize sweep agents in such a way that each sweep agent performs k-fold CV?

Hi @bill-morrisson, I am not sure what more I can do to explain this. I have even added source code above so that you can run it yourself and see the issue. I will try to rephrase it:

I want to run a hyperparameter study with W&B and I want to use Sweeps for it. This is well-documented and works on its own. For each run I will receive a set of parameters from run.config. So far so good!

Now, within one such run, I want to take that run’s parameters and perform k-fold cross-validation with them. That would be easy: I just run the training in a loop and train one model for each fold, right?
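Spelled out, the loop I have in mind looks roughly like this (a minimal sketch with dummy data and a dummy score in place of real training; `params` stands in for the values a sweep run would receive from run.config):

import numpy as np
from sklearn.model_selection import KFold

# dummy data standing in for the real training set
x = np.random.random((100, 3))
y = np.random.random(100)

# one fixed set of hyperparameters, as a sweep run would receive them
params = {"param1": 8, "param2": 2}

kf = KFold(n_splits=4, shuffle=True, random_state=0)
fold_scores = []
for fold, (ix_train, ix_val) in enumerate(kf.split(x)):
    # train a model on x[ix_train] with `params`, evaluate on x[ix_val];
    # a dummy score stands in for real training here
    score = float(np.mean(y[ix_val]))
    fold_scores.append(score)

print(len(fold_scores))  # one score per fold -> 4

The open question is only the logging part: how to report each iteration of this loop to W&B as its own run.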

But now, I want to log all those runs to W&B as well! How to do that? It seems it’s not possible because when I use the wandb.log and other functions in the loop, the previous value will just be overwritten. This is the problem I want to solve - how to do this without overwriting the previous values?

Additionally, one could think that if I call wandb.init once for each fold, then maybe the wandb.log values will not be overwritten and the folds will be logged separately. Alas, the values are still overwritten. A new run is created in W&B, but the previous run just disappears. For example, the first run/fold is called mysterious-sweep-1, and then the second fold starts as a new run with the name epic-sweep-2. At that point, the run mysterious-sweep-1 has disappeared completely, and all previously logged values are overwritten.

I hope this helps to clarify.

I understand your question.
I just want to point out that I’m having the same issue: previous runs are getting overwritten. I’m ending up with only 3 runs per job, not 1 per k-fold.
When I run the k-fold outside of a sweep agent, it works as intended (grouped in the UI, all runs there).
Anyone any idea?

I have the same problem: multiple runs for the same group, one per fold, are all collapsed into a single run for the entire k folds.
This means that all information from every fold except the last one is lost.

I think we don’t get any official replies because this is either

  • not possible and therefore we wait in vain, or
  • it is so simple and obvious that we are being ignored.

I hope it’s the latter and we can find out how to do it ourselves. :sweat_smile:

Hi @MBP,

Sorry for the time taken to get back to you. We haven’t yet dug into it specifically.
We’ll be looking into it with our engineering team and let you know.

I have posted a GitHub issue. I think they are referring to that.

It should additionally help to reproduce and find the error.


Hi @magenbrot , @mbp : I’ve left a reply here. Hoping to revive the conversation in the GitHub thread.

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.