Pulling metadata from run associated with artifact into report

Hi! I’m trying to pull metadata from a run associated with an artifact into a report table, and am not succeeding.

Here’s my use case: I have an artifact “artifact” generated by some runs (“train-run”), and then I use that artifact in a (separate) evaluation run (“evaluate-run”).
I then want to display in a table the scores for my evaluation run, together with the training data that was used in the “train-run”.

I want something like this:

| run | artifact | train set |
|---|---|---|
| evaluation-run-1 | artifact:v0 | “/data/train.csv” |
| evaluation-run-2 | artifact:v0 | “/data/train.csv” |
| evaluation-run-3 | artifact:v1 | “/data/train_filtered.csv” |

with the caveat that I don’t have the train set logged in the evaluation runs, but only in the training runs (that generated the artifact).

Here’s what I’m trying:

I have managed to get the artifact column as above using Weave with this query for the column

row.run.usedArtifactVersions.filter((row) => row.name.contains("artifact")).link

But I can’t get to the config of the run that generated that artifact.
The closest I’ve gotten to this output is to access the metadata of the artifact version in Weave, but this doesn’t really work. The UI (in artifacts->metadata) shows the metadata of the run that generated it, which would work for me, but it doesn’t seem to be available to Weave.

This is what I have tried to access the metadata, and isn’t working:

row.project.artifactVersion("artifact", "v0").metadata["train_set"]

Another thing I’ve tried and isn’t working is this:

row.run.usedArtifactVersions.filter((row) => row.name.contains("artifact")).map((row, index) => row.run.config)["train_set"].joinToStr("")

(unfortunately the row.run in the map function refers to the original run, instead of the run associated with generating the artifact)

Hi @dcferreira, quick clarification. Is the “train set” in the “train-run” run metadata or is it in the Artifact metadata?

Also, could you possibly share a link to the project and I can take a look?

Thank you,
Nate

Hey @nathank!

“train set” is run metadata. I didn’t log any specific metadata on the Artifact. I’m hoping that given a specific Artifact, I’d be able to access the “train set” for the run that first generated it.

I can’t share the exact project I’m having this problem in, but can try to make a minimal example later today.

I made this report here with a minimal example: Weights & Biases

It’s basically the table I showed in the first post, but it’s missing the last column.

You should have access to both the report and all the runs, but please let me know if I did something wrong regarding permissions.

Hello! I am taking over this ticket for Nate!

Firstly, your Artifact link query can be shortened to the following for succinctness:

row.usedArtifactVersions.link

Also, after looking into this further, I believe the query you are looking for is the one you mentioned:

row.project.artifactVersion("artifact", "v0").metadata["train_set"]

Unfortunately, I believe .metadata isn’t working as intended, so I will have to contact our team about it and file a report. It looks like artifactVersion does not have access to the metadata, so I will ask them whether there is a way to work around this.

Hi Raphael! Thanks for the tip.

Please let me know when you hear back then 🙂

Also just wanted to point out that I’m not married to the Weave approach: if there’s some other way to get the same outcome I’d be all for it!

Hello!

I looked into this a bit more, and the metadata shows up as empty because it truly is empty. It may look rather confusing, but “train_set” is actually stored under the run’s config, not in the Artifact metadata.

Given that, it would be better to start at the run that initially created the artifact and work from there to get the artifact, the runs that used it, the config, and the run itself. I made some additional edits here to show what I mean.
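For anyone who would rather script this than build it in Weave, the same traversal (training run, then its artifact, then the runs that used it) can be sketched in Python against the W&B public API. This is only a sketch under assumptions: it relies on `Run.logged_artifacts()` and `Artifact.used_by()` from the `wandb` public API, it assumes “train_set” is a key in the training run’s config, and the function name `rows_from_training_run` is made up for illustration.

```python
from typing import Any, Iterator, Tuple


def rows_from_training_run(
    train_run: Any, key: str = "train_set"
) -> Iterator[Tuple[str, str, Any]]:
    """Yield (consumer run name, artifact name, config[key]) rows.

    Starts at the run that logged the artifact and walks forward to the
    runs that consumed it. `train_run` is assumed to behave like a
    wandb.apis.public.Run: it exposes .config and .logged_artifacts(),
    and each artifact exposes .name and .used_by().
    """
    # The value lives in the training run's config, not in artifact metadata.
    train_set = train_run.config.get(key)
    for artifact in train_run.logged_artifacts():
        # used_by() lists the runs that declared this artifact version as input.
        for consumer in artifact.used_by():
            yield consumer.name, artifact.name, train_set
```

With a real project, `train_run` would come from something like `wandb.Api().run("<entity>/<project>/<run_id>")`, where the path is a placeholder. Each yielded tuple corresponds to one row of the table in the first post.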

Thanks for the help Raphael.

However, I believe it doesn’t really answer my original question. Please let me know if I misunderstood your post.

Indeed for this very simple example it’s relatively easy to look at the table and see which runs had data from where. But the example I’m dealing with has tens of runs, and a few different artifact versions, some of which share the “train_set”.

I’d really like to have the “train_set” side by side with my “evaluate” runs on the table. This would also enable me to group by “train_set” in plots!

Hey @dcferreira, I’ll be taking over this ticket temporarily. Based on my understanding, you are trying to attach the train set values to the top 3 evaluation runs in this report here, where that column is currently empty.

Unfortunately, there isn’t a way to do this in Weave, but you can edit the run config (config['train_set']) directly after the run has finished, and then load those values in the above table. Documentation on how to do this can be found here

Please let me know if this helps or if you have any follow up questions!
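A minimal sketch of that config backfill, assuming the W&B public API methods `Run.used_artifacts()`, `Artifact.logged_by()`, and `Run.update()`, and assuming the training runs store the value under `config["train_set"]`. The helper names `find_train_set` and `backfill_train_set` are hypothetical, and the substring match on the artifact name mirrors the `contains("artifact")` filter from the first post.

```python
from typing import Any, Iterable, Optional


def find_train_set(
    eval_run: Any, artifact_name: str, key: str = "train_set"
) -> Optional[Any]:
    """Return config[key] of the run that logged the artifact used by eval_run."""
    for artifact in eval_run.used_artifacts():
        if artifact_name not in artifact.name:  # names look like "artifact:v0"
            continue
        producer = artifact.logged_by()  # run that created this artifact version
        if producer is not None and key in producer.config:
            return producer.config[key]
    return None


def backfill_train_set(
    runs: Iterable[Any], artifact_name: str, key: str = "train_set"
) -> int:
    """Copy the producer's config[key] into each consuming run; return the count."""
    updated = 0
    for run in runs:
        value = find_train_set(run, artifact_name, key)
        if value is None or run.config.get(key) == value:
            continue  # run does not use the artifact, or is already up to date
        run.config[key] = value
        run.update()  # persists the edited config back to W&B
        updated += 1
    return updated
```

In practice `runs` would be something like `wandb.Api().runs("<entity>/<project>")`, with the path as a placeholder. Runs that never used the artifact are skipped, so passing every run in the project is safe, and rerunning the script makes no further changes. Once the config is filled in, “train_set” can be added as a regular column and used for grouping in plots.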

Hi @dcferreira, since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please feel free to write back in!
