Does W&B gradient logger work properly with gradient scaler?

Hi, I’m using the Hugging Face Trainer framework to train my model, logging everything to W&B. The W&B histogram logger for gradients shows me the following:

[screenshots of the gradient histograms]

The out_proj layer is the last one in my architecture, so it was very unusual to see such large gradient magnitudes. I took a look inside and found that the gradients are actually small, but they are large during the intermediate step of AMP gradient scaling and unscaling. So I guess that W&B just doesn’t play well with the gradient scaler: it seems to record the scaled gradients before they are unscaled.

Before unscaling:

model_ref.classifier.out_proj.weight.grad
tensor([[  354.6875, -1280.5000,   538.1250,  ...,  1188.5000,  -150.5625,
          1870.5000],
        [   93.0205, -1208.0000,   390.3750,  ...,   534.8750,   -38.1250,
           738.2500],
        [   -5.9375,  2244.5000, -1384.5000,  ...,  -503.1875,   -11.0625,
         -1256.3125],
        [  211.8281,   488.5000,    37.3750,  ..., -1205.5000,   494.1250,
         -1308.5000],
        [ -664.7500,   -67.0000,  -156.6250,  ...,   185.0000,   -59.9531,
           923.7500],
        [   11.0000,  -177.6250,   574.6250,  ...,  -199.1250,  -234.6953,
          -966.3750]], device='cuda:0')

After unscaling:

model_ref.classifier.out_proj.weight.grad
tensor([[ 5.4121e-03, -1.9539e-02,  8.2111e-03,  ...,  1.8135e-02,
         -2.2974e-03,  2.8542e-02],
        [ 1.4194e-03, -1.8433e-02,  5.9566e-03,  ...,  8.1615e-03,
         -5.8174e-04,  1.1265e-02],
        [-9.0599e-05,  3.4248e-02, -2.1126e-02,  ..., -7.6780e-03,
         -1.6880e-04, -1.9170e-02],
        [ 3.2322e-03,  7.4539e-03,  5.7030e-04,  ..., -1.8394e-02,
          7.5397e-03, -1.9966e-02],
        [-1.0143e-02, -1.0223e-03, -2.3899e-03,  ...,  2.8229e-03,
         -9.1481e-04,  1.4095e-02],
        [ 1.6785e-04, -2.7103e-03,  8.7681e-03,  ..., -3.0384e-03,
         -3.5812e-03, -1.4746e-02]], device='cuda:0')
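
For reference, this matches how torch.cuda.amp is designed: backward() runs on the scaled loss, so .grad holds scaled values until scaler.unscale_() is called, and any hook that fires during backward (which is how histogram logging of gradients typically works) sees the scaled values. Below is a minimal standalone sketch with a toy model (nothing from my actual setup) that reproduces the before/after difference:

import torch
from torch import nn
from torch.cuda.amp import GradScaler, autocast

# Toy model and data, purely illustrative; requires a CUDA device.
model = nn.Linear(16, 4).cuda()
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
scaler = GradScaler()

x = torch.randn(8, 16, device="cuda")
target = torch.randn(8, 4, device="cuda")

with autocast():
    loss = nn.functional.mse_loss(model(x), target)

# backward() on the scaled loss writes *scaled* gradients into .grad,
# so anything reading .grad (or hooked into backward) at this point
# sees values multiplied by the loss scale (65536 by default).
scaler.scale(loss).backward()
print(model.weight.grad.abs().max())  # large: scaled gradients

# unscale_() divides .grad in place by the current scale factor.
scaler.unscale_(opt)
print(model.weight.grad.abs().max())  # small: the true gradients

scaler.step(opt)
scaler.update()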

If that is correct, I think this should be fixed to avoid misunderstanding (e.g. I thought something was wrong with my model).

Hi there! Thanks so much for writing in - let me dig into this a bit and see what might be going on. I’ll follow up here as soon as I have something. In the meantime, don’t hesitate to reach out with anything else you might need 🙂

Hi there!

I wanted to follow up and let you know that I’ve reported this issue to our product team, and we’re looking into adding a fix to our roadmap. I really appreciate your patience as we work on improving this part of the platform.
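
In the meantime, if you have control over the training step, one possible workaround (just a sketch, assuming a custom AMP training loop with a GradScaler, rather than the stock Trainer) is to log gradient histograms manually after scaler.unscale_() instead of relying on hooks that fire during backward:

import wandb

# Sketch of one training step; `model`, `opt`, `scaler`, and `loss`
# are assumed to come from an ordinary AMP training loop.
scaler.scale(loss).backward()
scaler.unscale_(opt)  # .grad now holds the true, unscaled gradients
wandb.log({
    f"gradients/{name}": wandb.Histogram(p.grad.detach().cpu().numpy().flatten())
    for name, p in model.named_parameters()
    if p.grad is not None
})
scaler.step(opt)
scaler.update()

This logs the same kind of histograms, just computed from the unscaled values.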

Please don’t hesitate to reach out if you have any other questions or concerns in the meantime. Thanks again for bringing this to our attention - your feedback is invaluable in helping us make W&B better for all our users.