Validation Loss reported as null during sweeps?

Hi there,

During sweeps, val_loss is reported as null, even though it is logged correctly in individual runs.

It still happens after adding a custom callback (which another user previously reported as a fix). For example:

class CustomWandbCallback(Callback):
    def on_epoch_end(self, epoch, logs=None):
        if logs is not None:
            wandb.log({"val_loss": logs.get("val_loss"), "epoch": epoch})
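One thing that may be worth ruling out with a callback like this: `logs.get("val_loss")` returns `None` whenever the key is absent from `logs`, and that `None` is then passed to `wandb.log` as is. A minimal sketch of a guard, assuming a hypothetical helper `drop_missing` (it is not part of wandb or Keras):

```python
def drop_missing(logs):
    """Return a copy of the Keras logs dict without entries whose value is None.

    `drop_missing` is a hypothetical helper for illustration only.
    """
    return {key: value for key, value in (logs or {}).items() if value is not None}


# Intended use inside on_epoch_end (commented out so the sketch runs without wandb):
# wandb.log({**drop_missing(logs), "epoch": epoch})
```

This only avoids logging an explicit `None`; if `val_loss` is never present in `logs` during a sweep, the cause lies elsewhere.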

Is there a way to fix this? I’ve tried with and without the custom callback, with no luck so far :(

Here is my full code (minus the data processing):

import wandb
import pandas as pd
import numpy as np
import tensorflow as tf
import random
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.callbacks import Callback, ModelCheckpoint

# Custom callback

class CustomWandbCallback(Callback):
    def on_epoch_end(self, epoch, logs=None):
        if logs is not None:
            # Log 'val_loss' and any other metrics you're interested in
            # You can add more metrics to log here as needed
            wandb.log({"val_loss": logs.get("val_loss"), "epoch": epoch})

… data processing…

# Start a run

wandb.init(project="Neural Network 1",
           config={
               "layer_1": 128,
               "activation_1": "relu",
               "dropout": random.uniform(0.01, 0.50),
               "layer_2": 64,
               "activation_2": "relu",
               "layer_output": y.shape[1],
               "optimizer": "adam",
               "loss": "mean_squared_error",
               "epoch": 30,
               "batch_size": 32
           })

config = wandb.config

print('Project configured successfully')

# Build the model

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(config.layer_1, activation=config.activation_1, input_shape=(X_train.shape[1],)),
    tf.keras.layers.Dropout(config.dropout),
    tf.keras.layers.Dense(config.layer_2, activation=config.activation_2),
    tf.keras.layers.Dense(config.layer_output)
])

model.compile(optimizer=config.optimizer, loss=config.loss)

print('Model built successfully')

model_checkpoint = ModelCheckpoint(
    filepath="models/best_model.h5",  # Path where to save the model
    save_best_only=True,              # Save only the best model
    monitor='val_loss',               # Criterion to monitor
    mode='min',                       # The smaller the monitored quantity, the better the model
    verbose=1                         # Log when the best model is updated
)

history = model.fit(x=X_train, y=y_train,
                    epochs=config.epoch,
                    batch_size=config.batch_size,
                    validation_data=(X_test, y_test),
                    callbacks=[
                        CustomWandbCallback(),
                        model_checkpoint  # Use the modified ModelCheckpoint
                    ])

wandb.finish()
print('Finished')

hey @katarinaa - would it be possible to send over a short snippet for how you’re setting up your sweep? (how you’re initializing your agent, defining your training function with the above snippet, etc)

Hi Katarina, since we have not heard back from you, we are going to close this request. If you would like to re-open the conversation, feel free to reply here.

Sure, here is the snippet I am using to initialise the sweep:

module load bear-apps/2021b/live
module load Python/3.9.6-GCCcore-11.2.0
python --version
python /rds/homes/k/kbp045/PYTHON/waste_data/Neural\ Network\ 1/train.py --activation

wandb agent es-and-d/"Neural Network 1"/[insert sweep ID here]
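For comparison, here is a minimal sketch of how a sweep is often wired up from Python. It may help to check that the metric named in the sweep configuration is exactly a key that the run logs with `wandb.log`. The parameter ranges, the helper `sweep_metric_is_logged`, and the `train_fn` argument are assumptions for illustration; none of them come from the snippet above.

```python
# Hypothetical sweep configuration: the parameter names mirror the config keys
# in the training script, but the ranges and values here are made up.
sweep_config = {
    "method": "random",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "dropout": {"distribution": "uniform", "min": 0.01, "max": 0.5},
        "layer_1": {"values": [64, 128]},
        "batch_size": {"values": [32, 64]},
    },
}


def sweep_metric_is_logged(config, logged_keys):
    """Return True if the sweep's target metric name is among the logged keys."""
    return config["metric"]["name"] in set(logged_keys)


def launch_sweep(train_fn, project="Neural Network 1", count=5):
    """Create the sweep and run an agent in this process.

    Requires wandb to be installed and logged in. `train_fn` is a hypothetical
    function that wraps the training script and calls wandb.init() itself.
    """
    import wandb  # imported lazily so the check above runs without wandb

    sweep_id = wandb.sweep(sweep=sweep_config, project=project)
    wandb.agent(sweep_id, function=train_fn, project=project, count=count)


# Keys logged by the custom callback in the original post:
print(sweep_metric_is_logged(sweep_config, ["val_loss", "epoch"]))
```

With this layout the agent starts each run, so `wandb.init()` is called inside `train_fn` and `wandb.config` carries the values chosen for that run. When using the `wandb agent` command line instead, the sweep configuration file normally names the training script in its `program` field, and the agent launches that script itself.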