There are two things you might be running into here – can’t confirm because your code relies on the ultimate-utils package.
- `wandb.watch` will only start working once you call `wandb.log` after a backwards pass that touches the watched `Module` (docs).
- The frequency with which gradients/params are logged is controlled by the `log_freq` argument. If the number of logging calls is less than the value of `log_freq`, then no information will be logged. Here’s a short colab reproducing this behavior.
Also, if you want both params and gradients, you need to set the `log` kwarg to `"all"`. By default, we log only gradients.