Parallelizing runs with multiple logical GPUs

With Google Colab (or similar large-GPU setups such as JupyterHub) you can create multiple logical/virtual GPUs and parallelize training runs, assuming your models are small enough to fit in each device's memory limit.

import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  # Create 2 virtual GPUs with 1GB memory each
  try:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=1024),
         tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPU,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)
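For context, here is a minimal sketch of how small jobs might be spread across those logical GPUs from inside one notebook process, using one thread per job and `tf.device` to pin each job. The helper names (`assign_devices`, `train_on`, `run_parallel`) and the toy linear-regression loop are hypothetical and only for illustration. It does not involve wandb at all, and it assumes the logical devices were already configured as above.

```python
import threading


def assign_devices(num_jobs, device_names):
    """Round-robin mapping: job i gets device_names[i % len(device_names)]."""
    if not device_names:
        raise ValueError("no devices available")
    return [device_names[i % len(device_names)] for i in range(num_jobs)]


def train_on(device_name, job_id, results, steps=10):
    """Toy training loop (hypothetical) pinned to one logical device."""
    import tensorflow as tf  # imported lazily so the helpers above need no TF

    with tf.device(device_name):
        # Variables and ops created in this scope are placed on device_name.
        w = tf.Variable(tf.zeros((4, 1)))
        x = tf.random.normal((64, 4))
        y = tf.random.normal((64, 1))
        for _ in range(steps):
            with tf.GradientTape() as tape:
                loss = tf.reduce_mean((tf.matmul(x, w) - y) ** 2)
            w.assign_sub(0.1 * tape.gradient(loss, w))
    results[job_id] = float(loss)


def run_parallel(num_jobs, device_names):
    """Start one thread per job and wait for all of them to finish."""
    results = {}
    threads = [
        threading.Thread(target=train_on, args=(dev, i, results))
        for i, dev in enumerate(assign_devices(num_jobs, device_names))
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With the two virtual GPUs above, this could be called as `run_parallel(4, [d.name for d in tf.config.list_logical_devices('GPU')])`, which would place two jobs on each logical device.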

Is it possible to train multiple sweep runs in parallel with logical GPUs within a Colab-like environment?

Hi Kevin,

Yes, it is possible. To do so, run CUDA_VISIBLE_DEVICES=2 wandb agent with your sweep ID, once for each of the GPUs on each of the machines, setting CUDA_VISIBLE_DEVICES to that GPU's index each time.
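As a rough sketch, the same thing can be scripted from Python by starting one `wandb agent` process per GPU with its own `CUDA_VISIBLE_DEVICES` value. The helper names (`agent_env`, `agent_command`, `launch_agents`) are hypothetical, and the sweep ID is whatever your sweep was created with. Each agent is a separate process with its own TensorFlow state.

```python
import os
import subprocess


def agent_env(gpu_index, base_env=None):
    """Copy the environment and expose only one physical GPU to the child."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def agent_command(sweep_id):
    """Command line for a single sweep agent."""
    return ["wandb", "agent", sweep_id]


def launch_agents(sweep_id, gpu_indices):
    """Start one agent process per GPU index and return the Popen handles."""
    return [
        subprocess.Popen(agent_command(sweep_id), env=agent_env(i))
        for i in gpu_indices
    ]
```

For example, `launch_agents(sweep_id, [0, 1])` would start two agents, one per physical GPU; call `.wait()` on the returned handles to block until they finish.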

More specifically, can this be done from a notebook environment? It looks like you’re referencing a shell command. I believe that would break the virtual devices created by TF, since the command runs outside the environment in which the virtual devices were created.

Yes, you can use multiple GPUs in both Jupyter and Google Colab with wandb.