Programmatically accessing an artifact object is very slow on the first call for large artifacts


I came across this issue recently, and I was wondering whether anything can be done to speed up this process. We are using W&B as the source of truth for versioning our datasets. Each dataset is an artifact in a specific project, and the files making up the dataset are added as references (everything is stored on S3).

We sometimes need to retrieve the path (including the version) to a specific file in the artifact. This is typically very fast (<1 s), but for larger artifacts (made up of >10K references) the process can slow down significantly and take up to 30 seconds. We realized that this happens whenever we access the artifact for the first time (e.g. getting its digest).

Is it expected that artifacts with a large number of files result in a long wait for the first operation when accessed programmatically in Python?

We typically use the public API to access the artifact (see below for example) but the same happens when using a run.

import wandb

api = wandb.Api()
artifact = api.artifact("my_org/my_project/my_artifact:latest")
file_info = artifact.get_path("example_file_in_artifact")  # first access is the slow step
s3_path = file_info.ref
s3_version = file_info.extra["versionID"]
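To pin down where the time goes, a small timing harness around those calls may help; a sketch, assuming the public API from the snippet above (benchmark_artifact_access and its arguments are hypothetical names, not part of wandb):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

def benchmark_artifact_access(artifact_path, file_name):
    """Time the initial artifact fetch, the first metadata access,
    and a follow-up lookup. Requires wandb and access to the artifact;
    artifact_path and file_name stand in for your own values."""
    import wandb  # imported here so the timing helper works without wandb
    api = wandb.Api()
    artifact, t_fetch = timed(api.artifact, artifact_path)
    _, t_first = timed(lambda: artifact.digest)        # first access triggers the slow step
    _, t_second = timed(artifact.get_path, file_name)  # should be fast once loaded
    return t_fetch, t_first, t_second
```

If the second lookup is fast while the first access is slow, that would confirm the cost is paid once per artifact object rather than per lookup.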



Hi Nicolas, if the artifact is large, the first operation will take longer because the artifact has to be fetched. As you already stated, indicating the path makes it faster, and you can also filter by artifact type in order to get a specific kind of artifact from a run, for example:

runs = api.runs(...)
for run in runs:
    for artifact in run.logged_artifacts():
        if artifact.type == "model":
            ...  # handle the model artifact

Hi @Leslie,

Thank you for your reply.

In my case, I am not trying to download the artifact, but merely to get the path to one of the files it includes. It is not obvious to me why the duration of this operation should be proportional to the size of the dataset.

Happy to provide more details if useful.



Thank you for the clarification. We are currently working on optimizing our artifacts to speed up artifact.get, but yes, currently the time to get these artifacts is correlated with their size.
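Until that optimization lands, one workaround is to pay the slow first access once per process and reuse the result. A minimal sketch, assuming the lookup from the original snippet (resolve_reference is a hypothetical helper, not part of the wandb API):

```python
import functools

@functools.lru_cache(maxsize=None)
def resolve_reference(artifact_path, file_name):
    """Resolve a file's S3 ref and versionID, memoized per process.

    The slow first access happens only once per (artifact_path, file_name)
    pair; repeated lookups are served from the cache.
    """
    import wandb  # deferred import so the module loads without wandb installed
    artifact = wandb.Api().artifact(artifact_path)
    entry = artifact.get_path(file_name)
    return entry.ref, entry.extra["versionID"]
```

Since, per the behavior described above, only the first access to an artifact object is slow, reusing a single artifact object across many get_path calls should achieve a similar effect.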


Hi again Nicolas, when our engineers tried to reproduce this with two artifacts containing S3 references (one with 10K references, one with 100K), the first access took 0.5 seconds and 2 seconds, respectively, using the code you provided. Is it possible for you to give us a more detailed script of what you are doing so we can see where the lag comes from?

Hi Nicolas,

Is it possible for you to give us a more detailed script so we can fix this issue?


Hi Nicolas,

Since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!