I am trying to log example outputs from my LLM during training to understand how well it is doing.
Here is an overview of what I am doing:
import wandb

data = []  # create fake data per epoch
for i in range(5):
    data.append([[f"in1-{i}", f"out1-{i}", f"{i}"], [f"in2-{i}", f"out2-{i}", f"{i}"]])

wandb.init(project="Learn-Table", job_type="train", config={"seed": 1})
table = wandb.Table(columns=["Input", "Output", "Index"])
for i, d in enumerate(data):
    # add this epoch's rows to the same table, then log it again
    for x in d:
        table.add_data(*x)
    wandb.log({"table": table})
wandb.finish()
However, this does not work. In the local wandb logging folder, the file
run/files/media/table/table_0_.....table.json
contains only the rows from the first epoch. So the table does not seem to be re-logged when data is added:
{"columns": ["Input", "Output", "Index"], "data": [["in1-0", "out1-0", "0"], ["in2-0", "out2-0", "0"]]}
Online, the same table sometimes shows up twice, sometimes three times. Each copy contains only the first table.
Alternative Idea
Create a new table every time. This is NOT what I want. I want one table that gets appended to every epoch.
import wandb

data = []  # create fake data per epoch
for i in range(5):
    data.append([[f"in1-{i}", f"out1-{i}", f"{i}"], [f"in2-{i}", f"out2-{i}", f"{i}"]])

wandb.init(project="Learn-Table", job_type="train", config={"seed": 1})
for i, d in enumerate(data):
    # build a fresh table from this epoch's rows only
    table = wandb.Table(data=d, columns=["Input", "Output", "Index"])
    wandb.log({"table": table})
wandb.finish()
This doesn't work either. Locally, I now see 5 different tables in the logs. Online, I again see the same table two or three times. However, this time it contains the last entry.
Conclusion
This all seems odd. Why is the same table shown multiple times? Why do the local logs contain tables that do not appear online?
Ideally, I would like to have one table that gets appended to regularly. How do I do that?