Integrating Wandb and AWS Lambda - Multiprocessing

Hello!

When I try to download an artifact in an AWS Lambda, I get the following error:

[ERROR] OSError: [Errno 38] Function not implemented
Traceback (most recent call last):
  File "/var/task/on_lambda.py", line 149, in lambda_entrypoint
    generate_predictions(
  File "/var/task/on_lambda.py", line 75, in generate_predictions
    models_data = get_models_data()
  File "/var/task/on_lambda.py", line 156, in <lambda>
    get_models_data=lambda: get_customer_models_data(
  File "/var/task/models.py", line 38, in get_customer_models_data
    artifact_dir = artifact.download(root=CUSTOMER_DATA_DIR)
  File "/var/task/wandb/apis/public.py", line 3763, in download
    pool = multiprocessing.dummy.Pool(32)
  File "/var/lang/lib/python3.9/multiprocessing/dummy/__init__.py", line 124, in Pool
    return ThreadPool(processes, initializer, initargs)
  File "/var/lang/lib/python3.9/multiprocessing/pool.py", line 927, in __init__
    Pool.__init__(self, processes, initializer, initargs)
  File "/var/lang/lib/python3.9/multiprocessing/pool.py", line 196, in __init__
    self._change_notifier = self._ctx.SimpleQueue()
  File "/var/lang/lib/python3.9/multiprocessing/context.py", line 113, in SimpleQueue
    return SimpleQueue(ctx=self.get_context())
  File "/var/lang/lib/python3.9/multiprocessing/queues.py", line 341, in __init__
    self._rlock = ctx.Lock()
  File "/var/lang/lib/python3.9/multiprocessing/context.py", line 68, in Lock
    return Lock(ctx=self.get_context())
  File "/var/lang/lib/python3.9/multiprocessing/synchronize.py", line 162, in __init__
    SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
  File "/var/lang/lib/python3.9/multiprocessing/synchronize.py", line 57, in __init__
    sl = self._semlock = _multiprocessing.SemLock(

Is there a way around this? It seems to be an issue with multiprocessing. I tried setting WANDB_START_METHOD=thread, which I saw mentioned in a few places, but the error doesn’t change.

This is how I’m downloading the artifact:

    import wandb

    api = wandb.Api()
    artifact = api.artifact(artifact_name)
    artifact_dir = artifact.download(root=data_dir)

I saw two other threads on this topic, but neither had a solution.

Thanks!

Hi @chiara_mc,

AWS Lambda does not support the use of semaphores at the moment. See the Stack Overflow question "python multiprocessing - _multiprocessing.SemLock is not implemented when running on AWS Lambda".
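For anyone who wants to confirm this in their own runtime, here is a minimal sketch of a check. It only tries to create a multiprocessing lock, which is the same kind of SemLock call that fails at the bottom of the traceback in the original post. The function name semaphores_available is made up for illustration and is not part of wandb or the standard library.

```python
import multiprocessing


def semaphores_available() -> bool:
    """Return True if this runtime can create multiprocessing locks."""
    try:
        # Lock() constructs a SemLock internally, the step that raises
        # "[Errno 38] Function not implemented" in the traceback.
        multiprocessing.Lock()
    except (OSError, ImportError):
        # Errno 38 (ENOSYS) is raised as an OSError.
        return False
    return True


if __name__ == "__main__":
    print("multiprocessing semaphores available:", semaphores_available())
```

If this reports False inside the Lambda, any code path that builds a multiprocessing pool or queue (including multiprocessing.dummy.Pool) can be expected to fail the same way.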

Is there a specific use case you have in mind? We can see if there is any way to work around this issue since the direct usage of artifacts does not seem possible.

Thanks,
Ramit

Hi @chiara_mc, I wanted to follow up on this request. Please let us know if we can be of further assistance or if your issue has been resolved.

Hi @chiara_mc, since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!

Hey @ramit_goolry, thanks for your reply!

We need the lambda to be able to load models and run them, and we liked the idea of retrieving them directly from our wandb model registry using aliases, so that we can easily change which models are used in prod from the wandb website.

The workaround we are using is to store all model versions in our own S3 bucket, using artifact references so that we can still manage the model versions through wandb. The lambda can then call wandb to find out which versions to use (using the ETag) and load them from S3.

Actually, we realised that while artifact.download() causes issues, we can download individual files using artifact.get_path(model_path).download() without issues (at least when they are stored on S3, not sure if it would also work if they were stored on wandb).
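In case it helps others, here is a rough sketch of that per-file approach. It assumes artifact is the object returned by wandb.Api().artifact(...), and it uses only artifact.get_path(...).download(root=...), the call mentioned above. The helper name download_entries and the example file name are hypothetical.

```python
import os
from typing import Iterable, List


def download_entries(artifact, paths: Iterable[str], root: str) -> List[str]:
    """Download selected files one at a time instead of artifact.download().

    `artifact` is assumed to be a wandb artifact fetched through the public
    API; each entry is downloaded sequentially, so no thread pool is created
    by this helper.
    """
    os.makedirs(root, exist_ok=True)
    local_paths = []
    for path in paths:
        entry = artifact.get_path(path)
        local_paths.append(entry.download(root=root))
    return local_paths


# Hypothetical usage inside the Lambda handler (names are placeholders):
#   import wandb
#   artifact = wandb.Api().artifact(artifact_name)
#   files = download_entries(artifact, ["model.pt"], "/tmp/models")
```

On Lambda, root should point somewhere under /tmp, since that is the writable part of the filesystem. As noted above, this has only been seen to work for files stored on S3.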

If you have any thoughts on this I’d be keen to hear them!

Hey @chiara_mc,

I see. Since semaphores are not available on AWS Lambda, artifacts cannot be downloaded directly with artifact.download() on Lambda. If you are using artifact references linked to an S3 bucket, you should be able to use the W&B API to get the reference’s path and use that alongside boto3 to download them.
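A minimal sketch of that idea, under a few assumptions: the artifact's manifest entries expose the s3:// URI as entry.ref, the S3 client is created by the caller with boto3.client("s3"), and object versioning is ignored (this fetches whatever is currently stored at the key). The helper names parse_s3_uri and download_reference are hypothetical.

```python
import os
from typing import Tuple
from urllib.parse import urlparse


def parse_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def download_reference(artifact, entry_name: str, dest_dir: str, s3_client) -> str:
    """Download one reference entry with boto3 instead of artifact.download().

    Assumes artifact.manifest.entries[entry_name].ref holds the s3:// URI
    and that s3_client is a boto3 S3 client supplied by the caller.
    """
    entry = artifact.manifest.entries[entry_name]
    bucket, key = parse_s3_uri(entry.ref)
    os.makedirs(dest_dir, exist_ok=True)
    local_path = os.path.join(dest_dir, os.path.basename(key))
    s3_client.download_file(bucket, key, local_path)
    return local_path


# Hypothetical usage (names are placeholders):
#   import boto3, wandb
#   artifact = wandb.Api().artifact(artifact_name)
#   path = download_reference(artifact, "model.pt", "/tmp/models", boto3.client("s3"))
```

If the bucket is versioned and you need the exact object that wandb recorded, you would also have to pass the recorded version to S3; that part is left out here.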

Thanks,
Ramit

Hi Chiara,

We wanted to follow up with you regarding your support request as we have not heard back from you. Please let us know if we can be of further assistance or if your issue has been resolved.

Best,
Weights & Biases

Hi Chiara, since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!
