SLURM and Launch-agent

19kdc3 · May 29, 2024, 4:39pm

Goal:
Run a wandb launch-agent on a Linux HPC cluster that uses the SLURM job scheduler.

Details:
All Python files needed to execute my training run are uploaded to a wandb job artifact. I want to launch that job to a queue from which a remote Linux HPC will download all the job artifacts and begin the training run. I want to bypass the need for Docker by creating a permanent Python virtual environment on the remote Linux HPC and simply having SLURM activate it.
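For illustration, the workflow described above (SLURM activates the virtual environment, the artifact is downloaded, a Python file from it is executed) could be sketched as a generated batch script. This is only a rough sketch that runs outside of Launch, not a Launch feature. The helper name `render_sbatch_script`, the paths, the artifact name, and the `train.py` entry point are all hypothetical, and it assumes the downloaded artifact actually contains the training script, which may not hold for every job artifact.

```python
import shlex
from pathlib import PurePosixPath


def render_sbatch_script(venv_dir, artifact_path, entry_point,
                         workdir="job_files", job_name="wandb-train"):
    """Return the text of a SLURM batch script (hypothetical helper).

    Assumptions: `venv_dir` is an existing virtual environment on the
    cluster with wandb installed, and the artifact at `artifact_path`
    contains `entry_point`.
    """
    activate = PurePosixPath(venv_dir) / "bin" / "activate"
    script = PurePosixPath(workdir) / entry_point
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "set -e",  # stop at the first failing command
        # Activate the permanent virtual environment instead of using Docker.
        f"source {shlex.quote(str(activate))}",
        # Download the artifact contents into a local directory.
        f"wandb artifact get {shlex.quote(artifact_path)} --root {shlex.quote(workdir)}",
        # Run the training script from the downloaded files.
        f"python {shlex.quote(str(script))}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_sbatch_script(
        "/opt/venvs/train",                    # hypothetical venv location
        "my-entity/my-project/my-job:latest",  # hypothetical artifact name
        "train.py",                            # hypothetical entry point
    ))
```

The printed script could be saved to a file and submitted with `sbatch`. In this sketch the queue is never polled, so jobs would have to be submitted by hand.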

Question:
Can I set up and use a launch-agent without a builder? Since I already have the Python virtual environment on the remote HPC and SLURM activates it, can I have the launch-agent simply download the job artifacts and execute a Python file from them?

Thanks for the help!

thanos-wandb · May 31, 2024, 9:36am

Hi @19kdc3 thank you for reaching out! The Slurm integration for Launch is on our roadmap for after Q2. We have a feature request logged, and I can add you to the internal ticket to keep you updated on progress.

Regarding your question, having a builder is not essential; it can be a no-op builder for prebuilt images as documented here. Let me know if you have any more questions on this.
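As a rough illustration of the no-op builder mentioned above, the snippet below renders a minimal agent configuration. The key names (`entity`, `max_jobs`, `queues`, `builder.type: noop`) follow my reading of the W&B Launch agent configuration docs and should be checked against the current documentation. The helper name, entity, and queue name are hypothetical. Note that a no-op builder only means the agent does not build images; it still expects prebuilt images, so it does not by itself replace Docker with a virtual environment.

```python
def render_agent_config(entity, queues, max_jobs=1):
    """Return launch agent config text with a no-op builder (hypothetical helper).

    The key names are assumptions based on the W&B Launch docs.
    """
    lines = [
        f"entity: {entity}",
        f"max_jobs: {max_jobs}",
        "queues:",
        *[f"  - {queue}" for queue in queues],
        "builder:",
        "  type: noop  # agent runs prebuilt images only and never builds",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical entity and queue names.
    print(render_agent_config("my-entity", ["hpc-queue"]))
```

The output would typically be saved as the agent's config file (by default, to my understanding, `~/.config/wandb/launch-config.yaml`) before starting `wandb launch-agent`.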

19kdc3 · May 31, 2024, 2:24pm

Hi @thanos-wandb thanks for replying! Yes please, if you could add me to the internal ticket that would be great!

I’ve decided to move forward using Docker for the time being, so you can close this ticket. Thanks!

thanos-wandb · May 31, 2024, 2:53pm

Hi @19kdc3 thank you for the update. It’s great to hear you have a working solution with Docker for now! I have added you to the internal ticket and will move this one to on-hold until there are updates from the team, which I will share with you here.

daniel-bogdoll · October 17, 2024, 5:28pm

@thanos-wandb Could you please also put me on the internal ticket? Support for using SLURM with Launch would be much appreciated.

thanos-wandb · October 30, 2024, 9:25am

Hi @daniel-bogdoll sure thing! I have +1'd the request and will keep you updated here on any progress.

thomas-gorman · March 13, 2025, 11:51am

Hi @thanos-wandb, can you please also add me to this ticket? Support for this would greatly simplify my org's setup.
