WandB and AWS Lambda

We’re trying to run WandB (0.13.4) in an AWS Lambda function with Python 3.9. We have set an environment variable for our API key, but we are getting this error message from Lambda:

{
  "errorMessage": "Error communicating with wandb process",
  "errorType": "UsageError",
  "requestId": "0b6c4576-adbe-4182-826a-eca4dd06bc6f",
  "stackTrace": [
    "  File \"/var/task/lambda_monitor.py\", line 73, in test_harness\n    run  = wandb.init(project= WB_PROJECT,\n",
    "  File \"/var/task/wandb/sdk/wandb_init.py\", line 1078, in init\n    run = wi.init()\n",
    "  File \"/var/task/wandb/sdk/wandb_init.py\", line 719, in init\n    raise UsageError(error_message)\n"
  ]
}

The calling code is this:

    run  = wandb.init(project= WB_PROJECT,
                      id     = WB_RUN_ID,  # We force the run to continue with the specific monitoring Run ID
                      resume = True,
                      settings=wandb.Settings(start_method="fork"))

And this error message from CloudWatch:

wandb: WARNING Path /var/task/wandb/ wasn't writable, using system temp directory.
OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
wandb: WARNING Path /var/task/wandb/ wasn't writable, using system temp directory
wandb: W&B API key is configured. Use `wandb login --relogin` to force relogin
Traceback (most recent call last):
File "/var/lang/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/var/lang/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/var/task/wandb/__main__.py", line 3, in <module>
cli.cli(prog_name="python -m wandb")
File "/var/task/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/var/task/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/var/task/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/var/task/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/var/task/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/var/task/wandb/cli/cli.py", line 97, in wrapper
return func(*args, **kwargs)
File "/var/task/wandb/cli/cli.py", line 282, in service
server.serve()
File "/var/task/wandb/sdk/service/server.py", line 142, in serve
mux.loop()
File "/var/task/wandb/sdk/service/streams.py", line 394, in loop
raise e
File "/var/task/wandb/sdk/service/streams.py", line 392, in loop
self._loop()
File "/var/task/wandb/sdk/service/streams.py", line 385, in _loop
self._process_action(action)
File "/var/task/wandb/sdk/service/streams.py", line 350, in _process_action
self._process_add(action)
File "/var/task/wandb/sdk/service/streams.py", line 203, in _process_add
stream = StreamRecord(action._data, mailbox=self._mailbox)
File "/var/task/wandb/sdk/service/streams.py", line 61, in __init__
self._record_q = multiprocessing.Queue()
File "/var/lang/lib/python3.9/multiprocessing/context.py", line 103, in Queue
return Queue(maxsize, ctx=self.get_context())
File "/var/lang/lib/python3.9/multiprocessing/queues.py", line 43, in __init__
self._rlock = ctx.Lock()
File "/var/lang/lib/python3.9/multiprocessing/context.py", line 68, in Lock
return Lock(ctx=self.get_context())
File "/var/lang/lib/python3.9/multiprocessing/synchronize.py", line 162, in __init__
SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
File "/var/lang/lib/python3.9/multiprocessing/synchronize.py", line 57, in __init__
sl = self._semlock = _multiprocessing.SemLock(
OSError: [Errno 38] Function not implemented
wandb: - Waiting for wandb.init()...
wandb: \ Waiting for wandb.init()...
wandb: | Waiting for wandb.init()...
wandb: / Waiting for wandb.init()...

I added the additional environment variables suggested by Akshey (here):

WANDB_CACHE_DIR=/tmp/
WANDB_CONFIG_DIR=/tmp/
WANDB_DIR=/tmp/
WANDB_SILENT=true
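
For reference, a minimal sketch of setting the same variables from the handler module instead of the Lambda console, assuming it runs before wandb is imported. The helper name configure_wandb_dirs is hypothetical; the variable names and values are the ones listed above.

```python
import os


def configure_wandb_dirs(base_dir: str = "/tmp/") -> dict:
    """Point W&B's writable directories at base_dir and silence console output."""
    settings = {
        "WANDB_CACHE_DIR": base_dir,
        "WANDB_CONFIG_DIR": base_dir,
        "WANDB_DIR": base_dir,
        "WANDB_SILENT": "true",
    }
    # Export the values so that a later `import wandb` picks them up.
    os.environ.update(settings)
    return settings


# Call this at the top of the handler module, before `import wandb`.
configure_wandb_dirs()
```

These variables only address the "wasn't writable" warnings; they do not change how multiprocessing behaves.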

And the CloudWatch log becomes:

Traceback (most recent call last):
File "/var/lang/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/var/lang/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/var/task/wandb/__main__.py", line 3, in <module>
cli.cli(prog_name="python -m wandb")
File "/var/task/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/var/task/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/var/task/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/var/task/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/var/task/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/var/task/wandb/cli/cli.py", line 97, in wrapper
return func(*args, **kwargs)
File "/var/task/wandb/cli/cli.py", line 282, in service
server.serve()
File "/var/task/wandb/sdk/service/server.py", line 142, in serve
mux.loop()
File "/var/task/wandb/sdk/service/streams.py", line 394, in loop
raise e
File "/var/task/wandb/sdk/service/streams.py", line 392, in loop
self._loop()
File "/var/task/wandb/sdk/service/streams.py", line 385, in _loop
self._process_action(action)
File "/var/task/wandb/sdk/service/streams.py", line 350, in _process_action
self._process_add(action)
File "/var/task/wandb/sdk/service/streams.py", line 203, in _process_add
stream = StreamRecord(action._data, mailbox=self._mailbox)
File "/var/task/wandb/sdk/service/streams.py", line 61, in __init__
self._record_q = multiprocessing.Queue()
File "/var/lang/lib/python3.9/multiprocessing/context.py", line 103, in Queue
return Queue(maxsize, ctx=self.get_context())
File "/var/lang/lib/python3.9/multiprocessing/queues.py", line 43, in __init__
self._rlock = ctx.Lock()
File "/var/lang/lib/python3.9/multiprocessing/context.py", line 68, in Lock
return Lock(ctx=self.get_context())
File "/var/lang/lib/python3.9/multiprocessing/synchronize.py", line 162, in __init__
SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
File "/var/lang/lib/python3.9/multiprocessing/synchronize.py", line 57, in __init__
sl = self._semlock = _multiprocessing.SemLock(
OSError: [Errno 38] Function not implemented
Problem at: /var/task/lambda_monitor.py 73 test_harness
[DEBUG]	2022-11-01T07:51:38.952Z	b1a9ad4c-e05a-40d4-b2cb-14eaa17be2c0	Starting new HTTPS connection (1): o151352.ingest.sentry.io:443
[ERROR] UsageError: Error communicating with wandb process
Traceback (most recent call last):
  File "/var/task/lambda_monitor.py", line 73, in test_harness
    run  = wandb.init(project= WB_PROJECT,
  File "/var/task/wandb/sdk/wandb_init.py", line 1078, in init
    run = wi.init()
  File "/var/task/wandb/sdk/wandb_init.py", line 719, in init
    raise UsageError(error_message)

Hi Leslie,
Any updates?

Kevin A. Shaw, Ph.D.
CTO / CoFounder | Algorithmic Intuition Inc
www.algorithmicintuition.com
kevin@algoint.com | Twitter: @kevinashaw

Hi Kevin, thank you for your patience! From your logs, the failure comes from multiprocessing: the W&B service process creates a multiprocessing.Queue, which needs a POSIX semaphore, and AWS Lambda doesn’t provide the shared memory folder (/dev/shm) that those semaphores rely on. That is why you see OSError: [Errno 38] Function not implemented. You can try start_method="spawn" instead of "fork" in your wandb.init call, but since the limitation comes from AWS Lambda itself, it might not help here.
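
A rough sketch of how the handler could check for this up front and fail fast rather than wait on wandb.init(). The function names semaphores_available and init_run are hypothetical, and the wandb.init arguments simply mirror the calling code in the question with the start method changed to "spawn"; this is not a confirmed fix.

```python
import multiprocessing


def semaphores_available() -> bool:
    """Return True if multiprocessing locks can be created in this environment."""
    try:
        lock = multiprocessing.Lock()
    except (OSError, ImportError):
        # Errno 38 from the traceback surfaces as OSError; ImportError
        # covers Python builds without semaphore support.
        return False
    with lock:  # exercise the lock once
        pass
    return True


def init_run(project: str, run_id: str):
    """Hypothetical wrapper: raise a clear error instead of waiting on wandb.init()."""
    if not semaphores_available():
        raise RuntimeError(
            "multiprocessing semaphores are unavailable in this environment; "
            "wandb.init() is unlikely to start its service process"
        )
    import wandb  # imported lazily so the probe works without wandb installed

    return wandb.init(
        project=project,
        id=run_id,
        resume=True,
        settings=wandb.Settings(start_method="spawn"),
    )
```

If the probe returns False inside Lambda, changing the start method alone will probably not be enough, since the queue is created regardless of how the process is started.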