SLURM and Launch-agent

Run a wandb launch-agent on a Linux HPC that uses the SLURM job scheduler.

All Python files needed to execute my training run are uploaded to a wandb job artifact. I want to launch that job to a queue from which a remote Linux HPC will download all the job artifacts and begin the training run. I want to avoid Docker entirely by creating a permanent Python virtual environment on the remote Linux HPC and simply having SLURM activate it.

Can I set up and use a launch-agent without a builder? Since the Python virtual environment already exists on the remote HPC and SLURM activates it, can the launch-agent simply download the job artifacts and execute a Python file from them?

Thanks for the help!

Hi @19kdc3, thank you for reaching out! The Slurm integration for Launch is on our roadmap for after Q2. We have a feature request logged, and I can add you to the internal ticket to keep you updated on progress.

Regarding your question, a builder is not essential; you can use a no-op builder for prebuilt images, as documented here. Let me know if you have any more questions on this.
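Until a Slurm integration exists, the manual route described in the question (SLURM activates an existing virtual environment and runs a Python file from the downloaded job code) could be sketched roughly as below. This is only an illustration, not a W&B Launch feature: the function name `build_sbatch_script`, the paths, and the job name are all hypothetical, and it assumes the job's code has already been downloaded to a directory on the cluster.

```python
from pathlib import PurePosixPath
import shlex


def build_sbatch_script(venv_dir, job_dir, entry_point, job_name="wandb-train"):
    """Return the text of a SLURM batch script (hypothetical helper).

    Assumptions: venv_dir is an existing virtual environment on the HPC,
    and job_dir already contains the downloaded job code.
    """
    activate = PurePosixPath(venv_dir) / "bin" / "activate"
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "#SBATCH --output=slurm-%j.out",  # %j expands to the SLURM job ID
        "set -e",  # stop at the first failing command
        f"source {shlex.quote(str(activate))}",  # activate the permanent venv
        f"cd {shlex.quote(str(job_dir))}",  # move into the downloaded job code
        f"python {shlex.quote(entry_point)}",  # run the training entry point
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only; replace with real paths on the cluster.
    print(build_sbatch_script("/opt/venvs/train", "/scratch/job", "train.py"))
```

The resulting text can be written to a file and submitted with `sbatch`; any resource requests (partition, GPUs, time limit) would need to be added as further `#SBATCH` lines for the specific cluster.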

Hi @thanos-wandb, thanks for replying! Yes please, if you could add me to the internal ticket, that would be great!

I’ve decided to move forward using Docker for the time being, so you can close this ticket. Thanks!

Hi @19kdc3, thank you for the update. It’s great to hear you have a working solution with Docker for now! I have added you to the internal ticket and will put this one on hold until there are updates from the team, which I will share with you here.