I'm looking to scale up my project on a cloud service, and prices seem to be much cheaper for interruptible sessions.
How do I use W&B for an experiment (either a single training run or a hyperparameter sweep) in such an environment? Is anything special needed to restart a sweep where it left off?
I’m using PyTorch Lightning for the model/trainer and hoping to use AWS, Grid.ai, or another cloud service to scale up.