I want to log validation and test losses during my epochs as well, because my epochs are very long and I would like to get some insight into training already while an epoch is running. I'm using the PyTorch Lightning WandbLogger and I tried the following:
construct the Trainer:
Trainer(logger=WandbLogger(), val_check_interval=0.1, log_every_n_steps=1)
log in the validation_step() and training_step() with
self.log(
    "train_loss",  # or "val_loss" in the validation case
    loss,
    batch_size=self._get_batch_size(train_batch),
    prog_bar=True,
    on_epoch=True,
    on_step=True,
    sync_dist=True,
)
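For context, this is roughly where those calls sit in my LightningModule. The model, loss function, and the body of _get_batch_size below are just minimal placeholders to make the sketch self-contained, not my actual code:

import torch
from torch import nn
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    """Placeholder module; the real model and loss are omitted."""

    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(8, 1)
        self.loss_fn = nn.MSELoss()

    def _get_batch_size(self, batch):
        # Placeholder helper: return the number of samples in the batch.
        features, _ = batch
        return features.size(0)

    def training_step(self, train_batch, batch_idx):
        features, target = train_batch
        loss = self.loss_fn(self.layer(features), target)
        self.log(
            "train_loss",
            loss,
            batch_size=self._get_batch_size(train_batch),
            prog_bar=True,
            on_epoch=True,
            on_step=True,
            sync_dist=True,
        )
        return loss

    def validation_step(self, val_batch, batch_idx):
        features, target = val_batch
        loss = self.loss_fn(self.layer(features), target)
        self.log(
            "val_loss",
            loss,
            batch_size=self._get_batch_size(val_batch),
            prog_bar=True,
            on_epoch=True,
            on_step=True,
            sync_dist=True,
        )
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

The Trainer is then constructed as shown above.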
The outcome I get in W&B looks like this:
with the dashed line being the val_loss. This was during one epoch with a batch size of 10 and 100 data samples in both the training and validation datasets. I thought that global_step would be a good metric to plot against, but global_step seems to behave strangely.
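For reference, this is what I would expect the step counts to be with those numbers (a back-of-the-envelope sketch; I'm assuming a float val_check_interval is interpreted as a fraction of the training batches per epoch):

n_train_samples = 100
batch_size = 10
train_batches_per_epoch = n_train_samples // batch_size  # 10

# With val_check_interval=0.1 validation should be triggered after every
# int(10 * 0.1) = 1 training batch, and global_step (which counts
# optimizer steps) should only run from 0 to 9 within that epoch.
val_check_interval = 0.1
val_every_n_train_batches = max(1, int(train_batches_per_epoch * val_check_interval))
print(train_batches_per_epoch, val_every_n_train_batches)  # 10 1

That expectation does not match what the plot above shows, which is why I am confused about global_step.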