Why are the weights for my model not logged while I can see the gradients?


I would like to track the weights and gradients for my model while it trains. I have three networks that are being trained jointly. I am calling ‘wandb.watch’ for each one of them individually. This allowed me to see the gradients, but for some reason, I can’t see the weights for one of these networks in the ‘parameters’ tab.

I would appreciate it if anyone could help me figure out what the issue is and how I can fix it.


Hi @ahof1704,

Have you set the log='all' parameter for wandb.watch?


Hi @ramit_goolry ,

Maybe I should provide more details about my model. It consists of 3 neural networks (NN):

Encoder = NN1()
Decoder = NN2()
model = NN3()

For training, I make a list of all the parameters and pass it to the optimizer as follows:

All_parameters = list(model.parameters())+list(Encoder.parameters())+list(Decoder.parameters())
optimizer = torch.optim.Adam(All_parameters, lr=args.lr, weight_decay=args.weight_decay)

Since wandb.watch takes just one network at a time, I call that command three times in my code:

wandb.watch(Encoder, log="all", log_freq=1)
wandb.watch(Decoder, log="all", log_freq=1)
wandb.watch(model, log="all", log_freq=1)

This allows me to see the gradients for all three networks, but not the weights. Only the weights for Encoder and Decoder are listed on the Parameters tab.

I hope this clarifies the problem I am having.

Thank you for helping!

I see. Could you share a link to a project workspace where you see this? I’ll look into this for you.

Yes, here it is: https://wandb.ai/ahof1704/ANIE/runs/n3qqff5n?workspace=user-ahof1704
Please let me know if you need any further information.



Let me know if my understanding of your pipeline is incorrect, but is there a reason you are not setting your model up as full_model = nn.Sequential(Encoder, model, Decoder)? You should then be able to watch the whole model with wandb.watch(full_model).

wandb.watch usually hooks into a PyTorch model. My suspicion here is that watch is only keeping track of the last model that is being “watched”, since I do see a set of parameters being tracked.
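If the three networks do not chain cleanly inside nn.Sequential, a small container module may serve the same purpose. The sketch below is only an illustration: FullModel, watch_full_model, the nn.Linear stand-ins, the learning rate, and the encoder -> model -> decoder data flow are all assumptions, not taken from the actual pipeline.

```python
import torch
from torch import nn


class FullModel(nn.Module):
    """Hypothetical container that registers the three networks as submodules."""

    def __init__(self, encoder, model, decoder):
        super().__init__()
        self.encoder = encoder
        self.model = model
        self.decoder = decoder

    def forward(self, x):
        # Assumed data flow: encoder -> model -> decoder.
        return self.decoder(self.model(self.encoder(x)))


# Stand-in layers; replace with NN1(), NN3() and NN2() from the post above.
Encoder = nn.Linear(8, 4)
model = nn.Linear(4, 4)
Decoder = nn.Linear(4, 8)

full_model = FullModel(Encoder, model, Decoder)

# One optimizer over every parameter of the three networks (lr is a placeholder).
optimizer = torch.optim.Adam(full_model.parameters(), lr=1e-3)

# Parameter names carry the submodule prefix, so each network stays distinguishable.
param_names = [name for name, _ in full_model.named_parameters()]


def watch_full_model():
    # Call this after wandb.init(...), since watch needs an active run.
    import wandb

    wandb.watch(full_model, log="all", log_freq=1)
```

With a single watch call on the container, the parameter names are prefixed with encoder, model and decoder, which should make it easier to tell in the workspace whether any network is missing.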

Hi Antonio,

