Hi, I have a server with 8 GPUs and I am using only one of them for training. However, I found that Wandb logs the status of all GPUs rather than only the one designated by CUDA_VISIBLE_DEVICES. Is there a method or API that makes Wandb log only that specific GPU?
Hi @zengchang, thanks for reaching out with your question.
Logging system metrics only for the GPU set via CUDA_VISIBLE_DEVICES is currently not possible. However, I’d be happy to raise this as a feature request for our product team to review. Would you be happy to share your use case for this?
@fmamberti-wandb Hi, yes, I am glad to share. I have multiple GPUs and sometimes run several training processes simultaneously, with each process occupying only one card. I want to log the GPU status for each individual training process.
Thank you for sharing this. I’ve now raised a feature request with our product team to review. Feel free to add any further information to the feature request.
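In the meantime, one possible workaround for the use case described above is to read the metrics for the designated GPU yourself and log them as ordinary run metrics. The following is only a minimal sketch, not a built-in Wandb feature. The helper names `visible_gpu_indices` and `read_gpu_metrics` and the `my_gpu/...` metric keys are hypothetical, and it assumes the `pynvml` package is installed and that CUDA_VISIBLE_DEVICES holds plain integer indices (not GPU UUIDs). Note also that NVML numbers devices in PCI bus order, so the indices only match CUDA's when `CUDA_DEVICE_ORDER=PCI_BUS_ID` is set.

```python
import os


def visible_gpu_indices(env=None):
    """Parse CUDA_VISIBLE_DEVICES into a list of integer GPU indices.

    Assumes a comma-separated list of integers such as "2" or "0,3";
    UUID-style entries are not handled in this sketch.
    Returns None when the variable is unset (all GPUs are visible).
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    return [int(tok) for tok in raw.split(",") if tok.strip()]


def read_gpu_metrics(indices):
    """Query utilization and memory for the given physical GPU indices.

    Hypothetical helper: imports pynvml lazily so the parsing helper
    above stays usable on machines without NVML.
    """
    import pynvml

    pynvml.nvmlInit()
    try:
        metrics = {}
        for i in indices:
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            metrics[f"my_gpu/{i}/util_percent"] = util.gpu
            metrics[f"my_gpu/{i}/mem_used_mib"] = mem.used / 2**20
        return metrics
    finally:
        pynvml.nvmlShutdown()


# Inside the training loop (requires wandb.init() to have been called):
#   indices = visible_gpu_indices()
#   if indices:
#       wandb.log(read_gpu_metrics(indices))
```

The built-in system metrics panel will still show all GPUs; this only adds per-run charts for the card each process was assigned.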