I’ve got several runs of a PyTorch training pipeline that log to WandB. I’m mostly just following the tutorials, not trying to do anything fancy. Generally, I log basic numeric metrics like loss every batch, and more complex metrics like mAP and images with bounding boxes every epoch. I’m training for 100 epochs.
However, when I try to look at the charts on wandb.ai, I’m seeing some truly awful performance from the dashboard. The browser’s CPU use is pegged at 100% for several minutes just trying to load the page. When the charts finally load, they are unresponsive, and CPU usage remains high.
Am I doing something wrong here? Am I logging too many images? (I’m logging 128 per epoch.) I couldn’t find any guidelines for this in the docs, but maybe I’m just missing them. FWIW, I used to do a similar amount of logging in TensorBoard without an issue. Also, I think that back when I was running YOLOv5 training sessions, it was also doing similar logging, and WandB never seemed to have a problem with that.
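
To put a number on the image volume, here’s a quick back-of-the-envelope sketch in plain Python. It also includes a helper I could use to log only an evenly spaced subset of the images, as a way to test whether image count is what’s slowing the dashboard down. The function names and the cap of 16 images are my own placeholders, not anything from the WandB API or docs, and I don’t know yet whether images are actually the cause.

```python
from __future__ import annotations


def images_logged(images_per_epoch: int, epochs: int) -> int:
    """Total number of images a single run logs over training."""
    return images_per_epoch * epochs


def subsample_indices(total: int, keep: int) -> list[int]:
    """Pick `keep` evenly spaced indices out of `total` items.

    Hypothetical helper: the idea is to log only these images each epoch
    instead of all of them.
    """
    if keep <= 0:
        return []
    if keep >= total:
        return list(range(total))
    step = total / keep
    return [int(i * step) for i in range(keep)]


# My current setup: 128 images per epoch for 100 epochs.
current_total = images_logged(128, 100)

# Experiment: cap at 16 images per epoch (arbitrary number, just a guess).
capped_total = images_logged(16, 100)
chosen = subsample_indices(128, 16)

print(current_total, capped_total, chosen)
```

So each run is pushing 12,800 images as things stand. If the dashboard behaves with the capped version, that would at least point at the images rather than the per-batch scalars.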