WandB correctly logs media (videos, images, etc.) both locally and remotely until the first model artifact is saved. After that, media files are still saved locally but do not appear in the remote media folder in WandB.
I am using wandb == 0.13.9; the issue persists even after upgrading wandb.
I initialize the WandB logger through PyTorch Lightning like this:
logger = WandbLogger(
    project="my_project",
    name=f"{cfg.TAG}_{time.strftime('%d%B%Yat%H:%M:%S')}",
    save_dir=save_dir,
    log_model='all'
)
callbacks = [
    pl.callbacks.ModelSummary(-1),
    pl.callbacks.LearningRateMonitor(),
    ModelCheckpoint(
        save_dir,
        every_n_train_steps=int(cfg.VAL_CHECK_INTERVAL),
        filename="{epoch}-{step}",
        save_top_k=-1  # Save all checkpoints
    ),
]
trainer = pl.Trainer(
    gpus=cfg.GPUS,
    accelerator='gpu',
    strategy='ddp',
    precision=cfg.PRECISION,
    sync_batchnorm=True,
    max_epochs=None,
    max_steps=cfg.STEPS,
    callbacks=callbacks,
    logger=logger,
    log_every_n_steps=cfg.LOGGING_INTERVAL,
    val_check_interval=cfg.VAL_CHECK_INTERVAL,
    limit_val_batches=limit_val_batches,
    replace_sampler_ddp=replace_sampler_ddp,
    accumulate_grad_batches=cfg.OPTIMIZER.ACCUMULATE_GRAD_BATCHES,
    num_sanity_val_steps=0,
)
trainer.fit(model, datamodule=data)
I log metrics and videos like this:
self.logger.experiment.log({
    name: wandb.Video(video_np, fps=2, format="gif"),
    "global_step": self.global_step
})
self.logger.experiment.log({
    'val_acc': val_acc,
    'val_prec': val_prec,
    'val_rec': val_rec,
    'val_f1': val_f1,
    'val_auroc': val_auroc
}, step=self.global_step)
self.logger.experiment.log({'val_iou_' + key: value, 'global_step': self.global_step})
self.logger.experiment.log({'val_mean_iou': torch.mean(scores), 'global_step': self.global_step})
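
One thing that may be worth ruling out: the logging calls above mix two step conventions. The metrics call passes step=self.global_step explicitly, while the video and IoU calls put global_step inside the dict and let wandb advance its own internal step counter. wandb expects step values to increase monotonically and drops, with a warning, data logged at a step lower than its current one. Whether this has anything to do with the missing remote media is an assumption and has not been verified. The sketch below uses a hypothetical helper, build_val_payload (not part of wandb or PyTorch Lightning), to collect everything for one validation step into a single dict so that every call follows the same convention.

```python
from typing import Any, Dict, Mapping, Optional


def build_val_payload(
    global_step: int,
    metrics: Optional[Mapping[str, Any]] = None,
    per_class_iou: Optional[Mapping[str, Any]] = None,
    media: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge everything logged at one validation step into a single dict.

    Hypothetical helper for illustration only; it is not a wandb or
    PyTorch Lightning API.
    """
    # Carry the step as an ordinary key instead of the step= argument.
    payload: Dict[str, Any] = {"global_step": int(global_step)}
    payload.update(metrics or {})
    # Same 'val_iou_' + key naming as in the original calls.
    for key, value in (per_class_iou or {}).items():
        payload["val_iou_" + str(key)] = value
    # Media objects (for example wandb.Video instances) are passed through as-is.
    payload.update(media or {})
    return payload


# Stand-alone example with plain values, so it runs without wandb installed.
example = build_val_payload(
    100,
    metrics={"val_acc": 0.9},
    per_class_iou={"car": 0.5},
)
print(example)
```

Inside the LightningModule this would replace the separate calls with a single self.logger.experiment.log(build_val_payload(self.global_step, metrics=..., per_class_iou=..., media={name: wandb.Video(video_np, fps=2, format="gif")})), with no step= argument on any call.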