I am experiencing weird behaviour of the run.history() function in Python:
Calling run.history(samples=100) gives me sample sizes different from 100, and the sample size varies from call to call. For example, executing it 5 times gave me sample sizes of 98, 90, 88, 110, and 104.
However, when I execute run.history(keys=['my_key'], samples=100), I get a sample size of exactly 100 for every call. Why is this the case?
After investigating this further, I found more strange behaviour: Calling run.history(keys=['my_key'], samples=n_samples) yields a sample size of exactly n_samples, as long as n_samples <= 12493 (at least for my test run). If n_samples > 12493, smaller sample sizes (varying roughly between 12400 and 12490) are returned.
Am I misunderstanding something, or are these functions behaving in a way that they shouldn’t?
run.scan_history() by definition returns the full history, not a sampled version. It behaves as expected, but my use case is downsampling long runs to a specific number of samples, so it does not solve my issue on its own.
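One possible client-side workaround is to take the full history and pick evenly spaced rows from it. The following is only a minimal sketch under that assumption: downsample_rows is a hypothetical helper (not part of wandb), it expects any iterable of row dicts, and it loads all rows into memory, which may be slow for very long runs.

```python
from typing import Any, Dict, Iterable, List


def downsample_rows(rows: Iterable[Dict[str, Any]], n_samples: int) -> List[Dict[str, Any]]:
    """Hypothetical helper: return at most n_samples evenly spaced rows."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")

    all_rows = list(rows)  # materialises the full history in memory
    total = len(all_rows)

    # Nothing to drop if the history is already short enough.
    if total <= n_samples:
        return all_rows
    if n_samples == 1:
        return [all_rows[0]]

    # Evenly spaced indices from the first row to the last row, inclusive.
    # Integer arithmetic keeps the indices strictly increasing.
    indices = [(i * (total - 1)) // (n_samples - 1) for i in range(n_samples)]
    return [all_rows[i] for i in indices]
```

With a real run, this could be called as downsample_rows(run.scan_history(keys=['my_key']), 100). Unlike the server-side sampling in run.history(), the result is deterministic for a given history.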
Thanks for the explanation! run.history() is not behaving correctly here. I have filed a bug report so that this is tracked internally. For now, I would suggest passing the keys explicitly, as in run.history(keys=['my_key'], samples=100), to get your metrics.
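Since the thread reports that even the keyed call can return fewer rows than requested for large sample counts, it may be worth checking the row count after each call. A minimal sketch, assuming run is a run object from the wandb public API; history_with_check is a hypothetical wrapper, not a wandb function.

```python
import warnings
from typing import Any, List


def history_with_check(run: Any, keys: List[str], samples: int):
    """Hypothetical wrapper: call run.history with explicit keys and
    warn when the number of returned rows differs from `samples`."""
    result = run.history(keys=keys, samples=samples)
    n_rows = len(result)  # works for a DataFrame and for a list of dicts
    if n_rows != samples:
        warnings.warn(
            f"Requested {samples} samples but received {n_rows} rows"
        )
    return result
```

For example, history_with_check(run, ['my_key'], 100) returns the same result as the plain call and only adds a warning when the counts do not match.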
I will follow up here once this issue has been resolved.