I have a local script, train.py, that calls:
# Create a dataset artifact and add a reference to the data in S3
artifact = wandb.Artifact("cifar10_dataset", type="dataset")
artifact.add_reference("s3://ml-models-cicd/wandb/datasets/cifar10")
When I run this command locally:
wandb launch --uri . --job-name wandb-training-demo --project sagemaker-demo --entry-point "python train.py" -q ec2-queue
It then runs as a Docker container on the remote machine whose launch agent is registered to ec2-queue.
The machine itself has valid AWS credentials, since it is able to push the image to ECR, and the same policy also includes S3 access. However, it seems the container itself doesn’t have access to the host's ~/.aws/credentials file, so I get the following error:
wandb: 🚀 View run still-sun-118 at: https://wandb.ai/dorw-lwf/sagemaker-demo/runs/epnoujrv
wandb: ⭐️ View project at: https://wandb.ai/dorw-lwf/sagemaker-demo
wandb: Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20241113_170057-epnoujrv/logs
Traceback (most recent call last):
File "train.py", line 104, in <module>
artifact.add_reference("s3://ml-models-cicd/wandb/datasets/cifar10")
File "/env/lib/python3.8/site-packages/wandb/sdk/artifacts/_validators.py", line 117, in wrapper
return method(self, *args, **kwargs)
File "/env/lib/python3.8/site-packages/wandb/sdk/artifacts/artifact.py", line 1350, in add_reference
manifest_entries = self._storage_policy.store_reference(
File "/env/lib/python3.8/site-packages/wandb/sdk/artifacts/storage_policies/wandb_storage_policy.py", line 170, in store_reference
return self._handler.store_path(
File "/env/lib/python3.8/site-packages/wandb/sdk/artifacts/storage_handlers/multi_handler.py", line 54, in store_path
return handler.store_path(
File "/env/lib/python3.8/site-packages/wandb/sdk/artifacts/storage_handlers/s3_handler.py", line 177, in store_path
objs[0].load()
File "/env/lib/python3.8/site-packages/boto3/resources/factory.py", line 565, in do_action
response = action(self, *args, **kwargs)
File "/env/lib/python3.8/site-packages/boto3/resources/action.py", line 88, in __call__
response = getattr(parent.meta.client, operation_name)(*args, **params)
File "/env/lib/python3.8/site-packages/botocore/client.py", line 569, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/env/lib/python3.8/site-packages/botocore/client.py", line 1005, in _make_api_call
http, parsed_response = self._make_request(
File "/env/lib/python3.8/site-packages/botocore/client.py", line 1029, in _make_request
return self._endpoint.make_request(operation_model, request_dict)
File "/env/lib/python3.8/site-packages/botocore/endpoint.py", line 119, in make_request
return self._send_request(request_dict, operation_model)
File "/env/lib/python3.8/site-packages/botocore/endpoint.py", line 196, in _send_request
request = self.create_request(request_dict, operation_model)
File "/env/lib/python3.8/site-packages/botocore/endpoint.py", line 132, in create_request
self._event_emitter.emit(
File "/env/lib/python3.8/site-packages/botocore/hooks.py", line 412, in emit
return self._emitter.emit(aliased_event_name, **kwargs)
File "/env/lib/python3.8/site-packages/botocore/hooks.py", line 256, in emit
return self._emit(event_name, kwargs)
File "/env/lib/python3.8/site-packages/botocore/hooks.py", line 239, in _emit
response = handler(**kwargs)
File "/env/lib/python3.8/site-packages/botocore/signers.py", line 105, in handler
return self.sign(operation_name, request)
File "/env/lib/python3.8/site-packages/botocore/signers.py", line 197, in sign
auth.add_auth(request)
File "/env/lib/python3.8/site-packages/botocore/auth.py", line 423, in add_auth
raise NoCredentialsError()
botocore.exceptions.NoCredentialsError: Unable to locate credentials
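To double-check that the container really sees no credential source, a small check like the one below could be run at the top of train.py. This is only a minimal sketch using the standard library: the helper name `describe_aws_credential_sources` is hypothetical, and it looks only at the standard AWS environment variables and the shared credentials file. It does not check instance metadata or container roles, which boto3 can also use.

```python
import os
from pathlib import Path


def describe_aws_credential_sources(env=None, home=None):
    """Report which common AWS credential sources are visible to this process.

    Hypothetical debugging helper. It checks only environment variables and
    the shared credentials file, not instance metadata or container roles.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # The shared credentials file defaults to ~/.aws/credentials but can be
    # relocated with AWS_SHARED_CREDENTIALS_FILE.
    creds_file = Path(
        env.get("AWS_SHARED_CREDENTIALS_FILE", home / ".aws" / "credentials")
    )

    return {
        # True only if both halves of a static key pair are set.
        "env_keys": bool(
            env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY")
        ),
        # True if the credentials file exists inside this container.
        "shared_credentials_file": creds_file.is_file(),
        # The named profile, if one was requested.
        "profile": env.get("AWS_PROFILE"),
    }


if __name__ == "__main__":
    print(describe_aws_credential_sources())
```

If both `env_keys` and `shared_credentials_file` come back false inside the container, that would be consistent with the NoCredentialsError above.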
I haven’t found any way to mount the AWS credentials into the Docker container that the wandb agent runs.
Any suggestions?