GPU Utilization Info Not Showing Up in System Panels

On a cluster with AMD GPUs managed by SLURM, I logged in to a compute node I had allocated and started some model evaluation jobs. A Wandb tracker is initialized with Accelerator.init_trackers(), where the Accelerator comes from the Hugging Face Accelerate framework.
However, on the monitor, the System charts contain only CPU information, such as Process CPU Threads in Use and Process Memory Available. There is no tracking information about GPU memory usage or GPU memory utilization. My coworker ran the same script on another cluster with AMD GPUs, and GPU tracking info does appear on his panels.
I need help troubleshooting this, since there isn’t any error or warning message from Wandb. I don’t know what is disabling it on my platform. The docs only say that it gets this information from rocm-smi -a --json, and I’ve checked that rocm-smi is functioning on that platform.
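Checking rocm-smi in an interactive shell does not prove that the job's Python process sees the same thing, because the PATH and environment can differ inside a SLURM allocation. The following is a minimal sketch of a check that can be run from the same environment as the evaluation script. The helper name probe_gpu_tool is hypothetical, and it assumes the only requirements are that rocm-smi is on PATH and prints parseable JSON. How Wandb actually invokes the tool may differ.

```python
import json
import shutil
import subprocess


def probe_gpu_tool(command):
    """Report whether `command` is resolvable and prints valid JSON.

    Hypothetical helper for diagnosis only; it does not reproduce
    Wandb's internal GPU monitoring logic.
    """
    report = {
        "executable": shutil.which(command[0]),  # None if not resolvable here
        "returncode": None,
        "valid_json": False,
    }
    if report["executable"] is None:
        return report
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return report
    report["returncode"] = result.returncode
    try:
        json.loads(result.stdout)
        report["valid_json"] = True
    except json.JSONDecodeError:
        # Output was empty or contained non-JSON text (e.g. warnings).
        pass
    return report


if __name__ == "__main__":
    # The command below is the one quoted from the docs in this post.
    print(probe_gpu_tool(["rocm-smi", "-a", "--json"]))
```

If "executable" is None or "valid_json" is False when this runs inside the job, the environment of the job is a likely place to look.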

Hi @yic033, good day and thank you for reaching out to us! Happy to help you with this.

Regarding your coworker, who is able to get GPU tracking with the same script: do you know whether your wandb versions differ? Could you please try running with the same wandb version he uses and see if this helps? Thank you!
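To compare versions, both of you could run a short script like the sketch below in the environment that launches the job. The function name installed_versions is hypothetical. It also reports accelerate as an assumption, since the tracker is created through it.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("wandb", "accelerate")):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed in this environment map to None.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Running it on both clusters and comparing the output shows whether the two environments differ.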

Hi @yic033, since we have not heard back from you, we are going to close this request. If you would like to re-open the conversation, please let us know!