Interpret gradient graphs

thongnt · September 2, 2022, 1:25pm

Hi everyone,
Attached are the gradient histograms from my training. It seems that the gradients for lin1.weight and line2.weight are mostly zero everywhere. Does this mean the model isn't learning anything through these parameters, and should I exclude them from my optimizer?

Thank you very much

mohammadbakir · September 7, 2022, 8:41pm

Hi @thongnt, it’s difficult to say why your gradients are zeroed out. Assuming it’s not an error in your code, you may be encountering vanishing gradients, which could be leading to overflow / underflow issues. Here are some debugging steps I can suggest:
1) Ensure that you’re calling optimizer.zero_grad() before each batch.
2) Try normalizing the weights and inputs.
3) Try implementing gradient clipping.
Please let me know if any of these work for you.
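The debugging steps in this reply could be sketched roughly as follows in PyTorch. This is only an illustration, not the original poster's code: `TinyNet`, the layer sizes, the random data, the `grad_report` helper, and the `max_norm=1.0` threshold are all assumptions made up for the example. It resets gradients before each batch, standardises the inputs, records a per-parameter gradient norm plus the fraction of near-zero entries, and then clips the gradients.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TinyNet(nn.Module):
    """Hypothetical stand-in model; lin1/lin2 only mirror the names in the thread."""

    def __init__(self):
        super().__init__()
        self.lin1 = nn.Linear(8, 16)
        self.lin2 = nn.Linear(16, 1)

    def forward(self, x):
        return self.lin2(torch.relu(self.lin1(x)))


def grad_report(model, eps=1e-8):
    """Return {param_name: (grad_norm, fraction_of_entries_with_|g| < eps)}."""
    report = {}
    for name, p in model.named_parameters():
        if p.grad is None:
            # No gradient reached this parameter at all (e.g. detached branch).
            report[name] = (None, None)
            continue
        g = p.grad.detach()
        near_zero = (g.abs() < eps).float().mean().item()
        report[name] = (g.norm().item(), near_zero)
    return report


model = TinyNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Made-up data; inputs are standardised per feature (step 2).
x = torch.randn(32, 8)
x = (x - x.mean(dim=0)) / (x.std(dim=0) + 1e-6)
y = torch.randn(32, 1)

for step in range(3):
    optimizer.zero_grad()                      # step 1: reset gradients each batch
    loss = loss_fn(model(x), y)
    loss.backward()
    report = grad_report(model)                # inspect gradients before clipping
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # step 3
    clipped_norm = torch.sqrt(
        sum(p.grad.detach().pow(2).sum() for p in model.parameters())
    ).item()
    optimizer.step()

for name, (norm, frac) in report.items():
    print(f"{name}: grad norm={norm:.4g}, near-zero fraction={frac:.2%}")
```

If a parameter shows a gradient of `None`, or a near-zero fraction close to 100% across many steps, that points to a problem upstream of the optimizer (for example a detached tensor or a saturated activation), so it is usually worth finding the cause before removing the parameter from the optimizer.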

mohammadbakir · September 12, 2022, 11:12pm

Hi @thongnt , since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!
