Local controller seems to block

geyao · August 18, 2022, 6:17am

I created the following sweep (YAML) file:

program: train_mnist.py
method: grid
parameters:
  lr_schedule:
    values: [ step, cyclic ]
  epoch_total:
    values: [ 2, 4 ]
metric:
  goal: maximize
  name: test-result/accuracy
project: my-mnist-test-project
name: MNIST-Sweep-Test
description: test sweep demo

and I use the local controller to run the sweep locally. However, it seems to block here:

(pytorch) geyao@geyaodeMacBook-Air wandb_test % wandb sweep --controller sweep_config.yaml
wandb: Creating sweep from: sweep_config.yaml
wandb: Created sweep with ID: o2mzl569
wandb: View sweep at: https://wandb.ai/geyao/my-mnist-test-project/sweeps/o2mzl569
wandb: Run sweep agent with: wandb agent geyao/my-mnist-test-project/o2mzl569
wandb: Starting wandb controller...
Sweep: o2mzl569 (grid) | Runs: 0

# ------blocked here!------

When I turn off the network, the output is:

(pytorch) geyao@geyaodeMacBook-Air wandb_test % wandb sweep --controller sweep_config.yaml
wandb: Creating sweep from: sweep_config.yaml
wandb: Network error (ConnectionError), entering retry loop.

Why does the local controller try to connect to the network? What is the right way to run a local sweep, with or without a network connection?
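For reference, the grid in the sweep file above can be expanded by hand to see how many runs the sweep should eventually produce. The snippet below is a minimal sketch in plain Python, not W&B's own scheduling code; the helper name expand_grid is hypothetical, and the order in which W&B actually schedules the combinations is not something this thread confirms.

```python
from itertools import product

# Parameter grid copied from the sweep file above.
parameters = {
    "lr_schedule": ["step", "cyclic"],
    "epoch_total": [2, 4],
}


def expand_grid(params):
    """Return one dict per combination of parameter values (hypothetical helper)."""
    names = list(params)
    value_lists = [params[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


grid = expand_grid(parameters)
for config in grid:
    print(config)
print(f"{len(grid)} combinations in total")
```

With two values for each of the two parameters, the grid contains four combinations, so a completed sweep would be expected to show four runs rather than zero.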

mohammadbakir · August 22, 2022, 6:17pm

Hi @geyao, we will attempt to reproduce this on our end and reply soon. When the sweep hangs, do any errors eventually print to the terminal? Or are any debug logs for the run generated under wandb//logs that you can share with us?

geyao · August 23, 2022, 2:49am

Thanks for your reply! Unfortunately, I don’t have any log information in my local wandb directory. But I found the log below in the W&B cloud:

2022-08-23T02:43:06.098679 Created sweep z9kz3pzk
Using local controller...
 
2022-08-23T02:43:08.554205 Sweep configuration updated to: {"description":"test sweep demo","method":"grid","metric":{"goal":"maximize","name":"test-result/accuracy"},"name":"MNIST-Sweep-Test","parameters":{"epoch_total":{"values":[2,4]},"lr_schedule":{"values":["step","cyclic"]}},"program":"train_mnist.py","project":"my-mnist-test-project","controller":{"type":"local"}}

mohammadbakir · August 25, 2022, 9:50pm

Hi @geyao, you aren’t blocked.

The line Sweep: o2mzl569 (grid) | Runs: 0 is expected behavior: your sweep stays in a ‘Pending’ state after you start the local controller. Once you start an agent with wandb agent o2mzl569, the sweep will execute, and you can step through the sweep controller using your Python script.
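The division of labor described here can be pictured with a toy model. The sketch below is purely illustrative and is not how W&B implements its controller or agent: ToyController and toy_agent are hypothetical names, and the "training" function is a stand-in. It only shows why a controller on its own keeps reporting zero runs until something asks it for work.

```python
from collections import deque


class ToyController:
    """Holds pending configurations; it never runs training itself."""

    def __init__(self, configs):
        self.pending = deque(configs)
        self.runs = 0

    def status(self):
        return f"Runs: {self.runs} | Pending: {len(self.pending)}"

    def next_config(self):
        """Hand the next configuration to an agent, or None when none are left."""
        if not self.pending:
            return None
        self.runs += 1
        return self.pending.popleft()


def toy_agent(controller, train):
    """Keep pulling configurations from the controller and run each one."""
    results = []
    while (config := controller.next_config()) is not None:
        results.append(train(config))
    return results


controller = ToyController([
    {"lr_schedule": "step", "epoch_total": 2},
    {"lr_schedule": "cyclic", "epoch_total": 4},
])

status_before = controller.status()  # no agent has asked for work yet
results = toy_agent(controller, train=lambda cfg: cfg["epoch_total"])
status_after = controller.status()

print(status_before)
print(status_after)
```

In this model the status only changes once toy_agent starts pulling configurations, which mirrors the advice above: leave the controller running and start the agent separately.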

geyao · August 26, 2022, 1:22pm

Thanks for your answer! It works when I start a new terminal to run the agent. But I still want to know: is it necessary for the local controller to connect to the W&B cloud service?
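For anyone following along: when the agent in the second terminal picks up a run, it launches the program named in the sweep file. As far as I know, the default behavior is to pass each swept parameter as a --name=value command-line flag, so train_mnist.py needs to accept them. The thread never shows that script, so the following is only a sketch of what its argument handling might look like; parse_args and its defaults are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the two swept parameters (hypothetical layout of train_mnist.py)."""
    parser = argparse.ArgumentParser(description="MNIST training sketch")
    parser.add_argument("--lr_schedule", choices=["step", "cyclic"], default="step")
    parser.add_argument("--epoch_total", type=int, default=2)
    return parser.parse_args(argv)


# Simulate the flags an agent would be expected to pass for one grid point.
args = parse_args(["--lr_schedule=cyclic", "--epoch_total=4"])
print(args.lr_schedule, args.epoch_total)

# The real script would train here and log its accuracy under the metric
# name from the sweep file, test-result/accuracy.
```

In the actual script, calling parse_args() with no argument reads the flags from the command line instead of the hard-coded list used here.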

mohammadbakir · September 1, 2022, 12:39am

Hi @geyao, the local controller doesn’t have the full functionality of the W&B cloud and is not intended for actual hyperparameter optimization workloads. It’s intended for development and debugging of new algorithms for the Sweeps tool. You don’t need to connect to the W&B cloud service to use the controller.

system · October 31, 2022, 12:39am

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
