Unable to delete artifacts

Hi,
I have some runs that take up 20GB of storage, but when I go to the storage management page, I can only see a fraction of that size. I would like to keep the images and metrics logged by those runs and remove everything else. I assume the rest of the storage that is not displayed consists of artifacts.

I saw that artifacts can be removed using:

import wandb
api = wandb.Api()
run = api.run('mrna/NewSota/4i5o0gzh')
for artifact in run.logged_artifacts():
    artifact.delete()
    #   artifact.delete(delete_aliases=True)   # This also gives the same error

However, when I do so I get a non-descriptive error:


Traceback (most recent call last):
  File "/home/nil/miniconda3/envs/reg2/lib/python3.10/site-packages/wandb/apis/normalize.py", line 41, in wrapper
    return func(*args, **kwargs)
  File "/home/nil/miniconda3/envs/reg2/lib/python3.10/site-packages/wandb/apis/public.py", line 4629, in delete
    self.client.execute(
  File "/home/nil/miniconda3/envs/reg2/lib/python3.10/site-packages/wandb/sdk/lib/retry.py", line 212, in wrapped_fn
    return retrier(*args, **kargs)
  File "/home/nil/miniconda3/envs/reg2/lib/python3.10/site-packages/wandb/sdk/lib/retry.py", line 131, in __call__
    result = self._call_fn(*args, **kwargs)
  File "/home/nil/miniconda3/envs/reg2/lib/python3.10/site-packages/wandb/apis/public.py", line 252, in execute
    return self._client.execute(*args, **kwargs)
  File "/home/nil/miniconda3/envs/reg2/lib/python3.10/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/client.py", line 52, in execute
    result = self._get_result(document, *args, **kwargs)
  File "/home/nil/miniconda3/envs/reg2/lib/python3.10/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/client.py", line 60, in _get_result
    return self.transport.execute(document, *args, **kwargs)
  File "/home/nil/miniconda3/envs/reg2/lib/python3.10/site-packages/wandb/sdk/lib/gql_request.py", line 56, in execute
    request.raise_for_status()
  File "/home/nil/miniconda3/envs/reg2/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://api.wandb.ai/graphql
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/nil/miniconda3/envs/reg2/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3398, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-7-ae6f0e9f1222>", line 6, in <cell line: 4>
    artifact.delete()
  File "/home/nil/miniconda3/envs/reg2/lib/python3.10/site-packages/wandb/apis/normalize.py", line 51, in wrapper
    raise CommError(message, error)
wandb.errors.CommError: cannot delete system managed artifact (Error 400: Bad Request)

What does this error mean?
Alternatively, how do I remove the remaining storage associated with my jobs that is not displayed on the “storage managing” page on the browser?

Hi @nilstoltanso, happy to help. I agree the error is not descriptive enough to identify why the artifact can’t be deleted. When you log a run to wandb, we also log a run-history artifact, a parquet file that stores the run’s entire metric history. By design, we disallow users from deleting this artifact; the only way to remove it is to delete the entire run. To skip it, add a conditional statement:

for artifact in run.logged_artifacts():
    # Skip the system-managed run-history artifact, which cannot be deleted
    if artifact.type != "wandb-history":
        artifact.delete()
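If it helps, the same check can be wrapped in a small helper with a dry-run mode, so you can see what would be removed before anything is deleted. This is only a sketch: the helper name `delete_user_artifacts`, the `SYSTEM_MANAGED_TYPES` set, and the `dry_run` flag are illustrative and not part of the wandb API; only `logged_artifacts()`, `artifact.type`, `artifact.name`, and `artifact.delete(delete_aliases=True)` come from the SDK, and the run path is the one from the question.

```python
# Artifact type used for the run-history parquet file, per the reply above.
# Assumption: extend this set if other artifact types turn out to be undeletable.
SYSTEM_MANAGED_TYPES = {"wandb-history"}


def delete_user_artifacts(artifacts, dry_run=True):
    """Delete every artifact except system-managed ones.

    Returns (deleted_names, skipped_names). With dry_run=True nothing is
    deleted; the first list shows what would be removed.
    """
    deleted, skipped = [], []
    for artifact in artifacts:
        if artifact.type in SYSTEM_MANAGED_TYPES:
            skipped.append(artifact.name)
            continue
        if not dry_run:
            artifact.delete(delete_aliases=True)
        deleted.append(artifact.name)
    return deleted, skipped


def main():
    # Imported here so the helper above can be tried without wandb installed.
    import wandb

    api = wandb.Api()
    run = api.run("mrna/NewSota/4i5o0gzh")  # run path from the question
    deleted, skipped = delete_user_artifacts(run.logged_artifacts(), dry_run=True)
    print("would delete:", deleted)
    print("skipped:", skipped)
```

Call `main()` to list the candidates, then switch to `dry_run=False` once the list looks right.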

As for the second part of your question regarding the remaining storage, could you expand on this? Do you have a link to a project that shows the data you can’t delete? Thanks

Hi @mohammadbakir ,
Thanks for the reply. Unfortunately, we had to delete the two runs that were causing issues because of their size, so I can’t provide a link. In short, we kept getting a notification that our storage was full because two of our runs were 55 GB and 20 GB. However, when we went to the project’s storage manager page, expanding either of them only showed a “media” folder with <5 GB of images; the other 65+ GB were seemingly not discoverable from the storage manager. We had to settle for manually downloading a handful of the logged images we wanted to keep and deleting the runs altogether.

Also, thank you for the code snippet. I will make sure to give it a try the next time we face the issue.

Hi @nilstoltanso, thank you for the update. Please note we are reworking our storage page to better reflect the size of stored data. Currently, the storage page sizes parent folders based on the first 1000 files in each folder, which may be why you were seeing discrepancies. In addition to the API call above for deleting artifacts, here is an example script that deletes all media folders; as written, it loops over every project in the entity. You can tailor it to your specific requirements. I would recommend testing it on a single project first: when looping through projects, check the project name, or modify the script to query only the runs of that project. See also: Import & Export Data to W&B

import wandb

# Initialize the W&B API client
api = wandb.Api()

# Replace <entity> with your W&B entity
entity = "<entity>"

# Get all projects for the entity
projects = api.projects(entity)

# Loop through all projects
for project in projects:
    # Get all runs for the current project
    runs = api.runs(f"{entity}/{project.name}")
    
    # Loop through all runs
    for run in runs:
        # Get all files for the current run
        files = run.files()
        
        # Loop through all files
        for file in files:
            # Check if the file is in a Media folder
            if file.name.startswith("media/"):
                # Delete the file
                file.delete()
                print(f"Deleted {file.name} from run {run.name} in project {project.name}")

print("Finished deleting Media folders and their files from all runs.")
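Before running a deletion like this, it may be worth measuring how much the media files actually weigh in a single project, since the storage page can under-report folder sizes. The sketch below only reads and deletes nothing. The helper name `summarize_media` and the `<project>` placeholder are illustrative; it assumes each file object exposes `name` and `size` (in bytes), as the files returned by `run.files()` do.

```python
def summarize_media(runs, prefix="media/"):
    """Return (file_count, total_bytes) for files under `prefix` across runs."""
    count, total = 0, 0
    for run in runs:
        for file in run.files():
            if file.name.startswith(prefix):
                count += 1
                # Treat a missing size as 0 bytes rather than failing
                total += file.size or 0
    return count, total


def main(entity="<entity>", project="<project>"):
    # Imported here so the helper above can be tried without wandb installed.
    import wandb

    api = wandb.Api()
    # Query a single project only, as recommended above
    count, total = summarize_media(api.runs(f"{entity}/{project}"))
    print(f"{count} media files, {total / 1e9:.2f} GB")
```

Call `main()` with your own entity and project names. If the total is far below what the run occupies, the remaining storage is probably in artifacts or in other files rather than in `media/`.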

I will mark this as resolved, but please do reach out again anytime we can be of help.