ERROR: Project does not contain artifact

I’m running the following code to initialize a run and use my dataset as an artifact:

    run = wandb.init(name=job_name, project=wandb_project_name, config=vars(args), save_code=True, job_type="training")
    wandb.run.log_code(".")
    print(wandb_dataset_name)
    dataset = run.use_artifact(wandb_dataset_name)

This code is in a SageMaker script, and when I run it, everything works as expected. However, when I run the exact same script in a SageMaker hyperparameter tuning job instead of a single training job, I get the following error:

    wandb: WARNING Calling wandb.login() after wandb.init() has no effect.
    distributedspectrum/RadioML-Experimentation/RadioML_tfrecords:v0
    wandb: WARNING Calling wandb.login() after wandb.init() has no effect.
    wandb: WARNING Calling wandb.login() after wandb.init() has no effect.
    wandb: ERROR Project distributedspectrum/RadioML-Experimentation does not contain artifact: "RadioML_tfrecords:v0"
    Traceback (most recent call last):
      File "/usr/local/lib/python3.8/site-packages/wandb/apis/normalize.py", line 26, in wrapper
        return func(*args, **kwargs)
      File "/usr/local/lib/python3.8/site-packages/wandb/apis/public.py", line 937, in artifact
        artifact = Artifact(self.client, entity, project, artifact_name)
      File "/usr/local/lib/python3.8/site-packages/wandb/apis/public.py", line 4151, in __init__
        self._load()
      File "/usr/local/lib/python3.8/site-packages/wandb/apis/public.py", line 4735, in _load
        raise ValueError(
    ValueError: Project distributedspectrum/RadioML-Experimentation does not contain artifact: "RadioML_tfrecords:v0"
    During handling of the above exception, another exception occurred:
    Traceback (most recent call last):
      File "radioml-training.py", line 306, in <module>
        main(args)
      File "radioml-training.py", line 175, in main
        dataset = run.use_artifact(wandb_database_name)
      File "/usr/local/lib/python3.8/site-packages/wandb/sdk/wandb_run.py", line 255, in wrapper
        return func(self, *args, **kwargs)
      File "/usr/local/lib/python3.8/site-packages/wandb/sdk/wandb_run.py", line 2575, in use_artifact
        artifact = public_api.artifact(type=type, name=name)
      File "/usr/local/lib/python3.8/site-packages/wandb/apis/normalize.py", line 62, in wrapper
        raise CommError(message, err).with_traceback(sys.exc_info()[2])
      File "/usr/local/lib/python3.8/site-packages/wandb/apis/normalize.py", line 26, in wrapper
        return func(*args, **kwargs)
      File "/usr/local/lib/python3.8/site-packages/wandb/apis/public.py", line 937, in artifact
        artifact = Artifact(self.client, entity, project, artifact_name)
      File "/usr/local/lib/python3.8/site-packages/wandb/apis/public.py", line 4151, in __init__
        self._load()
      File "/usr/local/lib/python3.8/site-packages/wandb/apis/public.py", line 4735, in _load
        raise ValueError(
    wandb.errors.CommError: Project distributedspectrum/RadioML-Experimentation does not contain artifact: "RadioML_tfrecords:v0"

Literally everything is exactly the same but I suddenly get this error. I’m also not sure why the warnings about wandb.login() appear as well. They don’t appear in the single training job and I don’t ever call wandb.login() in my code.

Hi @dspectrum , happy to help with the issue you are facing.

A few things come to mind.

  • Do you see this artifact in the UI?
  • Are you able to get the artifact via the command line with wandb artifact get <entity>/<project>/<artifact-name>?
  • Have you verified that the host environment variable is set correctly in SageMaker before the training executes? Run wandb status and check “base_url”. To set it, use export WANDB_BASE_URL=<HOST>:<PORT>
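One way to run the last check from inside the training script itself is to log the W&B-related environment variables the container actually sees, then compare the output of a single training job with that of a tuning job. This is only a minimal sketch: the helper name wandb_env_report is hypothetical, and it assumes the relevant settings come from the WANDB_BASE_URL, WANDB_ENTITY, WANDB_PROJECT, and WANDB_API_KEY environment variables rather than from a settings file.

```python
import os
from typing import Dict, Mapping, Optional

# Environment variables read by the wandb client that affect where artifacts are looked up.
WANDB_VARS = ("WANDB_BASE_URL", "WANDB_ENTITY", "WANDB_PROJECT", "WANDB_API_KEY")


def wandb_env_report(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the W&B-related environment variables, masking the API key.

    Hypothetical helper for debugging; not part of the wandb library.
    """
    env = os.environ if env is None else env
    report = {}
    for name in WANDB_VARS:
        value = env.get(name)
        if value is None:
            report[name] = "<unset>"
        elif name == "WANDB_API_KEY":
            # Never print the secret itself; its length is enough to spot a missing or truncated key.
            report[name] = f"<set, {len(value)} chars>"
        else:
            report[name] = value
    return report


if __name__ == "__main__":
    for name, value in wandb_env_report().items():
        print(f"{name}={value}")
```

If the two jobs print different values (for example a different entity, or an unset API key in the tuning job), that difference is a likely place to start looking.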

Yes, the artifact is in the UI! I am able to get everything to work perfectly when I run in Sagemaker with one instance. It’s just when I launch a hyperparameter tuning job that it doesn’t work.

Hi @dspectrum, following up on this. We ran some tests and were successful in utilizing artifacts within SageMaker. Are you still experiencing the same errors as before?

One thing you can try is to include the full entity/project path to the artifact name. For example, instead of

    artifact_name = "RadioML_tfrecords:v0"

use

    artifact_name = "<entity>/<project-name>/RadioML_tfrecords:v0"

It might be that in SageMaker, the entity and project that the run inherits are different from the ones where the artifact lives.
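To make the reference independent of whatever entity and project the run ends up with, the full path can be built explicitly before calling use_artifact. The helper below is a hypothetical sketch (qualify_artifact_name is not a wandb function), and falling back to the latest alias when none is given is an assumption of this example.

```python
def qualify_artifact_name(name: str, entity: str, project: str,
                          default_alias: str = "latest") -> str:
    """Return 'entity/project/name:alias', filling in whichever parts `name` lacks.

    Hypothetical helper; `default_alias` is an assumption of this sketch.
    """
    path, sep, alias = name.partition(":")
    if not sep:
        alias = default_alias
    parts = path.split("/")
    if len(parts) == 1:
        # Bare artifact name: prepend both entity and project.
        parts = [entity, project, parts[0]]
    elif len(parts) == 2:
        # 'project/name': prepend only the entity.
        parts = [entity, parts[0], parts[1]]
    elif len(parts) != 3:
        raise ValueError(f"Unexpected artifact reference: {name!r}")
    if not alias or not all(parts):
        raise ValueError(f"Unexpected artifact reference: {name!r}")
    return "/".join(parts) + ":" + alias


# Example with the names from this thread.
full_name = qualify_artifact_name(
    "RadioML_tfrecords:v0", "distributedspectrum", "RadioML-Experimentation"
)
print(full_name)
```

The resulting string can then be passed to run.use_artifact(full_name).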

Thanks for following up! To clarify, I have also been able to successfully use artifacts in SageMaker. It’s only when I switch from a normal training job to a hyperparameter tuning job (see the HyperparameterTuner page in the sagemaker 2.116.0 documentation) that this error occurs.

Also apologies if this was unclear but I am actually referring to the full path in my code.

When I run print(wandb_dataset_name), you can see that it prints distributedspectrum/RadioML-Experimentation/RadioML_tfrecords:v0

Thanks for the help so far, let me know what you think!

Hi @dspectrum , thank you for the clarifying remarks. I’ll run some additional tests on our end as I was still unable to replicate. In the meantime, could you try use_artifact("<entity>/<project>/<artifact>:latest")?
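A rough way to test several references in one tuning job, rather than one per launch, is to try them in order and record why each one failed. The sketch below assumes only that the run object has a use_artifact method, as in the original snippet; try_artifact_references is a hypothetical helper, and it catches Exception broadly because the failure in this thread surfaced as wandb.errors.CommError.

```python
from typing import Any, Iterable, List, Tuple


def try_artifact_references(run: Any, references: Iterable[str]) -> Tuple[Any, List[str]]:
    """Call run.use_artifact on each reference in order.

    Returns the first artifact that resolves (or None) together with the
    error messages collected from the references that failed before it.
    Hypothetical helper; not part of the wandb library.
    """
    errors: List[str] = []
    for ref in references:
        try:
            return run.use_artifact(ref), errors
        except Exception as exc:  # broad on purpose, for debugging only
            errors.append(f"{ref}: {exc}")
    return None, errors
```

With the names from this thread, the call could look like try_artifact_references(run, ["distributedspectrum/RadioML-Experimentation/RadioML_tfrecords:v0", "distributedspectrum/RadioML-Experimentation/RadioML_tfrecords:latest"]), followed by printing the returned error list.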

Hi @dspectrum , following up on this. Are you still experiencing issues with using artifacts with your hyperparameter training runs with sagemaker?

Hi @dspectrum since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!

Hi,
Very sorry for the delay here. I tried again with :latest instead of :v0 but got the same result.