I can’t find two important metrics in the Charts or in the Overview windows.
- GPU memory. I train on one GPU, on a machine with 8 GPUs. Where can I see which GPU the run used? I see the metric “Process GPU Memory Allocated (%)”, but it is a percentage, and I need absolute numbers. I also see, for each GPU, the metric “system/gpu.0.memoryAllocatedBytes”, but I can’t tell which GPU number the run actually used.
- Training time. Where can I find how long the training itself took (excluding evaluation)?
Thanks.