Updating Tables/Artifacts in Lightning CLI with wandb

Hey, I am using this code to log the test results of my experiments. I want to update the table every time I call this function (across multiple runs), but the artifact is not updated. What do I need to do? (The tables are successfully retrieved and updated.)

    artifact_name = "results-table"
    table_name = "Test/Results"

    try:
        api = wandb.Api()

        project_name = "myproject"
        entity_name = "myentity"
        run_name = "myrun"
        run_path = f"{entity_name}/{project_name}/{run_name}"
        run = api.run(run_path)

        # Retrieve artifacts logged by this run
        artifacts = run.logged_artifacts(per_page=50)  # Adjust 'per_page' as needed

        # Find the latest version of a specific type of artifact
        latest_artifact = None
        for artifact in artifacts:
            if not latest_artifact or artifact.updated_at > latest_artifact.updated_at:
                latest_artifact = artifact

        table = latest_artifact.get(table_name)
        print("Retrieved Artifact")

    except Exception:  # e.g. run not found, or no artifacts logged yet
        print("Create new table")
        # If the artifact doesn't exist, create a new table
        columns = ["ckpt", "acc yt", "acc df", "auc df", "acc nt", "auc nt", 
                "acc f2f", "auc f2f", "acc fs", "auc fs", "acc fsh", "auc fsh", 
                "acc yr", "acc cr", "acc cs", "auc cdf"]
        table = wandb.Table(columns=columns)

    ckpt = self.trainer.ckpt_path.rpartition("-")[-1].rpartition("=")[0]
    self.test_row = [ckpt] + self.test_row

    table.add_data(*self.test_row)

    # Create a new artifact for logging
    new_artifact = wandb.Artifact(artifact_name, type='dataset')
    new_artifact.add(table, table_name)

    # Increment version or manage versions here
    self.logger.experiment.log_artifact(new_artifact)

Hello, this is a reply from our support bot designed to assist you with your Weights & Biases related queries. To reach a human please reply to this message.

To update an existing artifact (such as a table) across multiple runs in Weights & Biases, you need to follow a process that allows you to retrieve the existing artifact, update it, and then log the updated version. However, based on the code snippet you’ve provided, it seems like you’re trying to retrieve and update the artifact within the same run, which might not be the intended workflow for updating artifacts across runs.

Here’s a revised approach to update a table artifact across multiple runs:

  1. Retrieve the Latest Version of the Artifact: Use the wandb.Api() to retrieve the latest version of the artifact by its name.
  2. Download the Artifact’s Table: Once you have the artifact, download the table it contains.
  3. Update the Table: Add new data to the table.
  4. Log the Updated Table as a New Artifact Version: Create a new artifact, add the updated table to it, and log it to W&B.

Here’s how you can implement this:

    import wandb

    # Initialize your W&B run
    wandb.init(project="myproject", entity="myentity")

    artifact_name = "results-table"
    table_name = "Test/Results"

    api = wandb.Api()
    project_path = "myentity/myproject"

    # Try to retrieve the latest version of the artifact
    try:
        artifact = api.artifact(f"{project_path}/{artifact_name}:latest")
        table = artifact.get(table_name)
        print("Retrieved Artifact")
    except Exception:  # e.g. the artifact does not exist yet on the first run
        print("Artifact not found, creating a new table")
        # Define your columns here
        columns = ["ckpt", "acc yt", "acc df", "auc df", "acc nt", "auc nt",
                   "acc f2f", "auc f2f", "acc fs", "auc fs", "acc fsh", "auc fsh",
                   "acc yr", "acc cr", "acc cs", "auc cdf"]
        table = wandb.Table(columns=columns)

    # Assuming self.test_row is defined elsewhere and contains the data to add
    ckpt = "your_ckpt_value"  # Example checkpoint value
    # Replace "your_test_data" with actual data; the row needs one value per column
    test_row = [ckpt] + ["your_test_data"] * (len(table.columns) - 1)
    table.add_data(*test_row)

    # Create a new artifact for logging, using a new version
    new_artifact = wandb.Artifact(artifact_name, type='dataset')
    new_artifact.add(table, table_name)

    # Log the artifact
    wandb.log_artifact(new_artifact)

    # Finish the run so pending uploads are flushed before the script exits
    wandb.finish()

Key Points:

  • Ensure you have the correct project and entity names.
  • When you log the updated artifact, W&B automatically versions it for you. You don’t need to manually increment the version.
  • If the artifact doesn’t exist (e.g., on the first run), the code creates a new table and logs it as a new artifact.
  • Make sure to replace "your_ckpt_value" and "your_test_data" with actual data from your experiments.

This approach ensures that each time you run this code, it attempts to retrieve the latest version of the artifact, updates it with new data, and logs the updated version, maintaining a version history of your test results across runs.
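A note on the row that gets added: as far as I know, wandb.Table.add_data expects exactly one value per column, so a row of the wrong length raises an error instead of being logged. Here is a minimal sketch of a check to run before adding the row. COLUMNS and build_row are hypothetical names used for illustration only; they are not part of the W&B API.

```python
# Column layout taken from the snippet above (16 columns in total).
COLUMNS = ["ckpt", "acc yt", "acc df", "auc df", "acc nt", "auc nt",
           "acc f2f", "auc f2f", "acc fs", "auc fs", "acc fsh", "auc fsh",
           "acc yr", "acc cr", "acc cs", "auc cdf"]


def build_row(ckpt, metrics, columns=COLUMNS):
    """Prepend the checkpoint tag and verify the row matches the column count."""
    row = [ckpt, *metrics]
    if len(row) != len(columns):
        raise ValueError(
            f"expected {len(columns)} values (one per column), got {len(row)}"
        )
    return row


# Usage: 15 metric values plus the checkpoint tag give 16 entries.
row = build_row("your_ckpt_value", [0.0] * 15)
# table.add_data(*row)  # 'table' would be the wandb.Table from the snippet above
```

Failing early with a clear message makes it easier to tell a malformed row apart from an upload problem.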

-WandBot :robot:

Up until table.add_data(…) everything seems to work: I can retrieve the previous artifact, extract the table, and update it. After that, I cannot get the artifact to reflect the updated table. I tried logging either the table or the artifact, both with Lightning's wandb instance and by calling wandb.init again. None of the methods I tried results in the upload of an updated artifact / table.
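One thing that may be worth trying (an assumption on my part, not something confirmed in this thread): a table fetched with artifact.get(...) may still be tied to the artifact it came from, so re-adding that same object to a new artifact might not upload the modified rows. A possible workaround is to copy the columns and rows into a fresh wandb.Table and log that instead. The sketch below illustrates the idea; copy_rows and log_fresh_table are hypothetical helper names, and run stands for the active W&B run (in Lightning, presumably self.logger.experiment).

```python
def copy_rows(rows, new_row):
    """Return a new list of rows with new_row appended; the input is left untouched."""
    return [list(r) for r in rows] + [list(new_row)]


def log_fresh_table(run, old_table, new_row,
                    artifact_name="results-table", table_name="Test/Results"):
    """Rebuild the table from plain Python data and log it as a new artifact version."""
    import wandb  # imported lazily so copy_rows can be used without W&B installed

    # Build a brand-new Table object rather than mutating the retrieved one.
    fresh = wandb.Table(columns=list(old_table.columns),
                        data=copy_rows(old_table.data, new_row))

    artifact = wandb.Artifact(artifact_name, type="dataset")
    artifact.add(fresh, table_name)
    return run.log_artifact(artifact)


# Usage inside the test hook (names from the original post):
# log_fresh_table(self.logger.experiment, table, self.test_row)
```

If the new version still does not appear, the cause is probably elsewhere, and the SDK version requested below would help narrow it down.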

Hi @arianb Good day and thank you for reaching out to us. Happy to help you on this!

Let me investigate this and see what we can suggest.

I know that there have been some enhancements to artifacts in our recently released wandb versions, but I will check the behavior you are experiencing. In the meantime, could you please share your current SDK version with us? You can get this by running wandb --version
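If the command line is not convenient (for example inside a notebook or a cluster job), the installed version can also be read from Python. This is a small sketch using only the standard library; installed_version is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="wandb"):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the wandb version, or None if wandb is not installed in this environment.
print(installed_version())
```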