I would like to use Artifacts to log and track the usage of my datasets. These datasets live on a local filesystem. I was able to create a reference artifact, but the problem I encounter is that the only way to access the original local filepath is to call `artifact.download()` or `artifact.get_path(name).ref`.
Calling `download()` doesn't work for me because the files are already local and very large; I definitely do not want to make a copy.
On the other hand, `artifact.get_path(name).ref` works, but this entails already knowing the path of the file, since that is what is used for `name` as far as I can tell. Even if I could set a custom `name` for each file in the directory (can you?), I don't think those names can be retrieved from the artifact itself, so anyone using the artifact would still need to know them in advance. Ideally, one would only need the artifact's name, and from there could see the local file paths for all of the files in that artifact.
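For what it's worth, once I do have a `ref` URI, turning it back into a local path is easy with the standard library (a small sketch; the helper name is mine), so the only piece I'm missing is a way to enumerate the refs themselves:

```python
from urllib.parse import urlparse, unquote


def ref_to_local_path(ref: str) -> str:
    """Convert a file:// reference URI back to a local filesystem path."""
    parsed = urlparse(ref)
    if parsed.scheme != "file":
        raise ValueError(f"expected a file:// URI, got {ref!r}")
    # unquote handles percent-encoded characters (spaces, etc.) in the path
    return unquote(parsed.path)


print(ref_to_local_path("file:///data/train/images.npy"))
# /data/train/images.npy
```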
In case it's helpful, I add these files to the artifact by doing:

```python
artifact.add_reference(name='data_folder', uri='file://path/to/directory')
```
When I use the artifact, I can do `files = artifact.files()`, which returns an iterable of all of the files, but these `File` objects do not have a way to get the path/URI either.
Is there any way to do this?
Thanks, and let me know if you have any questions that would help you understand or solve this.