I am trying to retrieve the observations and actions I logged while running RL in gym. I still get a lot of NaNs, even though I switched from run.history() to run.scan_history() (I learned about this from the thread "Run.history() returns different values on almost each call - #2 by jaeheelee"). I thought scan_history() would return all the logged values. Am I wrong?
Hi @xjygr08, apologies that you are running into this! Could you send me a link to the workspace where you've stored your values, as well as a script snippet showing how you are logging those values to wandb?
run.scan_history() does return every single value you logged. I think I found where your issue is coming from.
Inside your code you call:
wandb.log({"obs": obs[0][0]})
wandb.log({"action": action[0]})
wandb.log(info[0])
back to back. Every call to wandb.log is treated as a new step in your experiment.
So in this case, for a single iteration of your while True loop, you take three steps in wandb, and each step records a different set of keys. That is why:
Your obs and action values each appear only on every third step, and every other step holds NaN for that key:
Obs is recorded at steps 0, 3, 6, 9…
Action is recorded at steps 1, 4, 7, 10…
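To make the step pattern concrete, here is a small pure-Python sketch. It does not call wandb at all; it is a toy model of the "one row per log call" behaviour described above. The function name simulate_history and the values are made up for illustration, and None stands in for the NaN you see when the history is loaded.

```python
def simulate_history(num_iterations, combined):
    """Toy model: every log call creates one history row (one step)."""
    rows = []
    for i in range(num_iterations):
        obs, action, reward = float(i), i % 2, 1.0  # made-up values
        if combined:
            # One log call per iteration: one row holding every key.
            rows.append({"obs": obs, "action": action, "reward": reward})
        else:
            # Three log calls per iteration: three rows, one key each.
            rows.append({"obs": obs})
            rows.append({"action": action})
            rows.append({"reward": reward})
    keys = ["obs", "action", "reward"]
    # Keys missing from a row are filled with None (NaN in a real table).
    return [{k: row.get(k) for k in keys} for row in rows]


separate = simulate_history(2, combined=False)
together = simulate_history(2, combined=True)

for step, row in enumerate(separate):
    print(step, row)
```

With separate calls, two loop iterations produce six rows, and each key is filled in only one row out of three. With a combined call, the same two iterations produce two fully populated rows.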
To fix this, you can either log everything in a single wandb.log call, like this: wandb.log({"obs": obs[0][0], "action": action[0], "info": info[0]})
or pass the step explicitly to each wandb.log() call:
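A minimal sketch of the explicit-step approach, assuming you keep your own iteration counter and pass it as the step keyword argument of wandb.log. The helper log_iteration and the recorder fake_log are hypothetical names used only so the example runs without a wandb account. In a real script you would pass wandb.log itself (after wandb.init()) in place of fake_log.

```python
def log_iteration(log, step, obs, action, info):
    """Log three dicts under one shared step so they land in the same row.

    `log` is any callable with the shape log(data, step=...), e.g. wandb.log.
    """
    log({"obs": obs}, step=step)
    log({"action": action}, step=step)
    log(info, step=step)


# Hypothetical stand-in for wandb.log: merges data logged under the same step.
rows = {}


def fake_log(data, step):
    rows.setdefault(step, {}).update(data)


for step in range(3):
    # Made-up values; in the original script these come from the gym env.
    log_iteration(fake_log, step, obs=0.1 * step, action=step % 2,
                  info={"reward": 1.0})

for step, row in sorted(rows.items()):
    print(step, row)
```

Because all three calls in an iteration share the same step, obs, action, and the info keys end up side by side. Note that wandb expects the step value to never decrease between calls, so a counter that only goes up is the safe choice.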