Is there a way to pull artifact metadata into reports? Here is my use case:
I have several datasets that I use to evaluate my model. They are stored as input artifacts and have metadata associated with them, for example data type, source, and size. I would like to put a table in my report that outlines the size and type of each dataset and, ideally, perform Excel-like operations on that table (e.g. sum the sizes of the datasets to get a total, or compute a label distribution).
Right now I build these tables manually, but it is error-prone and time-consuming. Here is an example:
| Dataset | Size | Sources | % Positive Class |
|---|---|---|---|
| A | 1333 | ['type-1' 'type-2'] | 0.63 |
| B | 13308 | ['type-1' 'type-2' 'type-3'] | 0.63 |
| C | 1153 | ['type-2' 'type-3'] | 0.61 |
| D | 273 | ['type-1'] | 0.66 |
Is there any existing way to do this? Or a hack to get something like it?
Hi @elhutton thank you for writing in! You can do this using Weave with a couple of options:
i) get the metadata for a single artifact you have created with the following Weave expression (in your Report, press "/" and add Weave):
project("entity-name", "project-name").artifact("artifact-name").versions("v0").metadata
ii) get a table of artifacts for a specific artifact type, again with a Weave expression, such as:
project("entity-name", "project-name").artifactType("dataset").artifactVersions.metadata
iii) directly render the artifact metadata for a specific artifact type as follows:
project("entity-name", "project-name").artifactType("dataset").artifacts
Would any of these work for you? Please let me know if you have any further questions or issues with this!
project("entity-name", "project-name").artifactType("dataset").artifactVersions.metadata is getting me close to what I want. How would I then filter the table to only show the latest version of each artifact? Right now this view shows a different row for each version of each artifact, but I would like to show only the latest version of each artifact.
Also, is there a way to perform operations on the values within the columns of such a table? For example, if I have a field called “n_examples” in the metadata for each artifact of a certain type, could I then get the sum of the n_examples over all the artifacts?
Hi @elhutton, apologies for the late response to this one. Showing only the latest version of each artifact should be possible with the following expression: project("entity", "project").artifactType("dataset").artifacts.membershipForAlias("latest").artifactVersion.metadata
You can indeed add columns to the table, perform operations on it, or group it. Each column has a cell expression that allows you to do some operations. You can find some examples in this Report.
To get the sum of all rows you will need a new Weave expression as follows: project("entity", "project").artifactType("dataset").artifacts.membershipForAlias("latest").artifactVersion.metadata["n_examples"].sum
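For reference, here is a rough sketch in plain Python of what that expression is meant to compute: keep only the version of each artifact that carries the "latest" alias, pull out its metadata, and sum one field. This is not the Weave or wandb API. The record layout, the helper names `latest_metadata` and `total`, and the older-version values are all hypothetical, and the "latest" numbers are borrowed from the Size column of the table above purely for illustration.

```python
# Hypothetical stand-in records for artifact versions. This is NOT the
# wandb or Weave API; it only mirrors the logic of the expression above.
versions = [
    {"artifact": "A", "aliases": ["v0"], "metadata": {"n_examples": 1200}},
    {"artifact": "A", "aliases": ["v1", "latest"], "metadata": {"n_examples": 1333}},
    {"artifact": "B", "aliases": ["v0", "latest"], "metadata": {"n_examples": 13308}},
    {"artifact": "C", "aliases": ["v0"], "metadata": {"n_examples": 1000}},
    {"artifact": "C", "aliases": ["v1", "latest"], "metadata": {"n_examples": 1153}},
    {"artifact": "D", "aliases": ["v0", "latest"], "metadata": {"n_examples": 273}},
]


def latest_metadata(records):
    """Map each artifact name to the metadata of its 'latest'-aliased version."""
    return {
        r["artifact"]: r["metadata"]
        for r in records
        if "latest" in r["aliases"]
    }


def total(metadata_by_artifact, key):
    """Sum one metadata field across all artifacts."""
    return sum(m[key] for m in metadata_by_artifact.values())


latest = latest_metadata(versions)  # one entry per artifact
total_examples = total(latest, "n_examples")
print(total_examples)  # 1333 + 13308 + 1153 + 273 = 16067
```

Older versions (for example A at v0) are dropped before summing, which is why the total counts each dataset only once.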
Hope this helps, please let me know if you have any further questions.
Hi @elhutton as this seems to be resolved for you, I will close the ticket for now. Please let me know in case the above solution didn’t work for you and we will be happy to keep investigating!