I am training a model in a SageMaker pipeline. The pipeline consists of a training step (a training job) and an evaluation step (a processing job). I have integrated both these jobs into run groups for each run of the pipeline. The flow is as follows:
- In the training step, the model artefacts with the best performance on the validation set are optionally logged as an Artifact to wandb. This works as expected. There are two Artifacts.
- In the evaluation step, the model artefacts are loaded from S3 (not from wandb). If the test performance meets certain criteria, the model artefacts are logged to wandb again as Artifacts, this time for the evaluation run, and these Artifacts are then linked to the model registry.
The second step (evaluation) fails. I have attempted the following:
- Use the initialised wandb run to call link_artifact (Run | Weights & Biases Documentation). This worked twice once for one of the models (except for the fact that it created a new Model with the same name in the model registry). For the other model nothing is registered.
- Use the artifact directly to call link (Artifact | Weights & Biases Documentation). This has not been implemented…
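For concreteness, the evaluation step described above can be sketched roughly as follows. This is a minimal sketch under assumptions, not the actual pipeline code: the project, group, artifact name, registry path, local model directory, metric value and threshold are all hypothetical placeholders, and the helper names `should_register` and `log_and_link` are made up for illustration. It waits for the logged Artifact to finish uploading before calling `link_artifact` on the run.

```python
from typing import Any, Optional, Sequence


def should_register(test_metric: float, threshold: float) -> bool:
    """Gate: only register the model if the test metric meets the threshold."""
    return test_metric >= threshold


def log_and_link(run: Any, artifact: Any, target_path: str,
                 aliases: Optional[Sequence[str]] = None) -> Any:
    """Log an Artifact to the given run, wait for it, then link it to the registry."""
    logged = run.log_artifact(artifact)   # attaches the Artifact to this run
    logged.wait()                         # block until the Artifact is fully logged
    run.link_artifact(logged, target_path, aliases=list(aliases or []))
    return logged


def main() -> None:
    # Imported here so the helpers above can be used without wandb installed.
    import wandb

    # Hypothetical project/group names; the group ties this run to the pipeline run.
    run = wandb.init(project="my-project", group="pipeline-run-id",
                     job_type="evaluation")
    try:
        test_accuracy = 0.93  # placeholder; the real value comes from evaluation
        if should_register(test_accuracy, threshold=0.90):
            artifact = wandb.Artifact(name="model-a", type="model")
            # Hypothetical local path where the S3 model artefacts were downloaded.
            artifact.add_dir("/opt/ml/processing/model")
            # Hypothetical registry target; adjust to the actual registered model.
            log_and_link(run, artifact, "model-registry/model-a",
                         aliases=["candidate"])
    finally:
        run.finish()
```

In a processing job, `main()` would be called once per model, each with its own Artifact name and registry target.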
Any support is appreciated. This seems to me to be basic and core functionality.
PS this post was flagged as ‘spam’ because it is considered an ‘advertisement’. Please let me know why. How can I advertise something when it is not even working? Furthermore, I am only providing relevant facts, I am running into a serious limitation, there is no clear customer service and only very few (basic and superficial) examples exist.