I see many examples in the documentation for logging actual files as datasets/artifacts, but how do I log datasets that aren’t files? For example, I am using tensorflow_datasets to download my dataset directly into train and validation splits and would like to log these directly. Is there an easy way to do this or can they only live in a table object?
Hi @stevencocke, the wandb.log() function accepts a variety of data types, including NumPy arrays, Python dictionaries, and other data structures. Are you looking to do something similar to this?
import tensorflow_datasets as tfds
import wandb
# Initialize WandB
wandb.init(project="tf-data-test")
# Load MNIST dataset
ds_train, ds_test = tfds.load('mnist', split=['train[:20]', 'test[:20]'], shuffle_files=True)
# Create a WandB Table
table = wandb.Table(columns=["image", "label"])
# Log examples to WandB & add data to table
for example in ds_train:
    image = example['image'].numpy()
    label = example['label'].numpy()
    wandb.log({"image": wandb.Image(image, caption=f"Label: {label}")})
    table.add_data(wandb.Image(image), label)
# Log the table to WandB
wandb.log({"mnist": table})
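If you would rather version the splits as a dataset artifact than keep them in a table, one possible approach is to materialize each in-memory split to a file first and then attach that file to an artifact. The snippet below is only a rough sketch under a few assumptions: the helper names save_split_to_npz and log_split_as_artifact are hypothetical (they are not part of wandb or tensorflow_datasets), each example is assumed to be a dict with 'image' and 'label' keys as in MNIST, and the split is assumed to fit in memory.

```python
import numpy as np


def save_split_to_npz(examples, path):
    """Stack an iterable of {'image', 'label'} examples into one .npz file.

    Hypothetical helper. `path` should end in ".npz"; otherwise NumPy
    appends that suffix to the file name it writes.
    """
    images, labels = [], []
    for example in examples:
        images.append(np.asarray(example["image"]))
        labels.append(np.asarray(example["label"]))
    np.savez_compressed(path, images=np.stack(images), labels=np.array(labels))
    return path


def log_split_as_artifact(path, artifact_name, project="tf-data-test"):
    """Upload a saved split as a W&B artifact of type "dataset".

    Hypothetical helper. wandb is imported here so the conversion step
    above can be used without it installed.
    """
    import wandb

    run = wandb.init(project=project, job_type="dataset-upload")
    artifact = wandb.Artifact(artifact_name, type="dataset")
    artifact.add_file(path)
    run.log_artifact(artifact)
    run.finish()
```

With the splits from the snippet above, this could be called as save_split_to_npz(tfds.as_numpy(ds_train), "mnist_train.npz") followed by log_split_as_artifact("mnist_train.npz", "mnist-train"), where the file and artifact names are placeholders. For datasets too large to stack in memory, writing several smaller files and adding them to the same artifact may be a better fit.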
Hi @stevencocke, since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!