I need to compare four runs in detail and want to plot the training curves in matplotlib, so I'm interested in getting the exact per-step history for my metric `train/avg_log_like`. These are just floating-point numbers, but fetching the complete history for the first run is already taking about 6 min 39 sec. That's way too slow! It's only 394,999 floating-point numbers; that's not a lot.
This is my code:
```python
test = list(comparison[0].scan_history(keys=["train/avg_log_like", "_step"], page_size=500))
```
How can I speed this up?
Edit: it seems to me this is still subsampled? It's just a bunch of floats, that shouldn't be this expensive. Can I download it as a CSV or something?
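(For context, a self-contained version of the snippet above might look like the sketch below; the entity/project/run IDs are placeholders, and `comparison` is assumed to be a list of runs fetched via wandb's public `Api`.)

```python
import wandb

# Fetch the runs to compare via the public API
# (entity/project/run IDs below are placeholders).
api = wandb.Api()
comparison = [
    api.run("my-entity/my-project/run-id-1"),
    api.run("my-entity/my-project/run-id-2"),
    api.run("my-entity/my-project/run-id-3"),
    api.run("my-entity/my-project/run-id-4"),
]

# scan_history pages through the full, unsampled history;
# restricting `keys` avoids transferring every logged metric.
test = list(
    comparison[0].scan_history(
        keys=["train/avg_log_like", "_step"], page_size=500
    )
)
```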
Hi @leander-kurscheidt, thank you for reaching out.
Regarding the metrics still being subsampled: is `train/avg_log_like` logged with every `wandb.log` call? If this is not the case, it would be expected that some values are missing (`scan_history` only retrieves points that are present for all of the specified metrics). How many points are being retrieved in your case?
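For example, a quick way to check whether rows are being filtered out is to compare the counts with and without `_step` in `keys` (a sketch, assuming `run` is the `wandb.apis.public.Run` object in question):

```python
# Rows where *both* requested keys are present.
both = list(run.scan_history(keys=["train/avg_log_like", "_step"]))
print(f"rows with metric and _step: {len(both)}")

# Rows where only the metric itself is present; a larger count here
# means some logged rows were missing "_step" and got filtered above.
metric_only = list(run.scan_history(keys=["train/avg_log_like"]))
print(f"rows with metric only: {len(metric_only)}")
```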
Following our docs on limits, logging more than 100k points for a single metric is not advised, and slow performance is expected in that case.
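As a possible workaround (a sketch, not an official export path): if a downsampled view is acceptable for plotting, `run.history` returns a pandas DataFrame quickly, and either result can be written to CSV. Again, `run` here is assumed to be a `wandb.apis.public.Run` object:

```python
import pandas as pd

# Fast but downsampled: the server returns at most `samples` points.
df = run.history(keys=["train/avg_log_like"], samples=10_000, pandas=True)
df.to_csv("avg_log_like_downsampled.csv", index=False)

# Full history (slow for ~400k points): collect the scan_history rows.
full = pd.DataFrame(run.scan_history(keys=["train/avg_log_like", "_step"]))
full.to_csv("avg_log_like_full.csv", index=False)
```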
I'd be happy to raise a feature request on your behalf to improve performance when retrieving a metric's entire history via the API. Is there anything you'd like me to add to the feature request about your use case or its urgency, on top of what you've already mentioned?
Hey @leander-kurscheidt, since we have not heard back from you, we are going to close this request. I'll go ahead and make the feature request in the meantime; please let us know if you have any questions.