Is there a way to only update specific parts of a Table?

carloshernandezp · October 27, 2021, 9:35am

Hello everyone! (Long-time user, first-time poster.)

I just started using WandB tables to log my predictions alongside their input images. It has been very useful so far. My problem arises when I run my code on a cluster we have at my university.

For simplicity, let’s say that every epoch, I am logging a table with the following columns: [id, Image, prediction] which is a list of [string, wandb.Image, int].

Every time wandb.Image() is called, it saves the image to disk using the PIL library (as the traceback below shows). After a certain number of epochs, I run out of space:

 Traceback (most recent call last):
  File "/mnt/gpid07/imatge/carlos.hernandez/Documents/PhD/2021/marato-derma/derma/sol/cnn_recommendations/processing/train_utils.py", line 206, in log_wandb_table
    row = [img_id, wandb.Image(image),
  File "/mnt/gpid07/imatge/carlos.hernandez/Documents/base/lib/python3.6/site-packages/wandb/sdk/data_types.py", line 1587, in __init__
    self._initialize_from_data(data_or_path, mode)
  File "/mnt/gpid07/imatge/carlos.hernandez/Documents/base/lib/python3.6/site-packages/wandb/sdk/data_types.py", line 1700, in _initialize_from_data
    self._image.save(tmp_path, transparency=None)
  File "/mnt/gpid07/imatge/carlos.hernandez/Documents/base/lib/python3.6/site-packages/PIL/Image.py", line 2102, in save
    save_handler(self, fp, filename)
  File "/mnt/gpid07/imatge/carlos.hernandez/Documents/base/lib/python3.6/site-packages/PIL/PngImagePlugin.py", line 900, in _save
    ImageFile._save(im, _idat(fp, chunk), [("zip", (0, 0) + im.size, 0, rawmode)])
  File "/mnt/gpid07/imatge/carlos.hernandez/Documents/base/lib/python3.6/site-packages/PIL/ImageFile.py", line 511, in _save
    fp.write(d)
  File "/mnt/gpid07/imatge/carlos.hernandez/Documents/base/lib/python3.6/site-packages/PIL/PngImagePlugin.py", line 748, in write
    self.chunk(self.fp, b"IDAT", data)
  File "/mnt/gpid07/imatge/carlos.hernandez/Documents/base/lib/python3.6/site-packages/PIL/PngImagePlugin.py", line 735, in putchunk
    fp.write(data)
OSError: [Errno 28] No space left on device
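
Errno 28 is raised by the filesystem, so it points at a full disk or quota rather than RAM. Which device filled up is not stated in the thread; the temp directory is only a guess based on the `tmp_path` name in the traceback. Under that assumption, a minimal standard-library sketch for checking the remaining space between epochs could look like this (`free_space_report` is a hypothetical helper name):

```python
import shutil
import tempfile


def free_space_report(path=None):
    """Return (free_bytes, total_bytes) for the filesystem holding `path`."""
    # Assumption: images are staged under Python's default temp directory.
    path = path or tempfile.gettempdir()
    usage = shutil.disk_usage(path)
    return usage.free, usage.total


if __name__ == "__main__":
    free, total = free_space_report()
    print(f"{tempfile.gettempdir()}: {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB")
```

Calling it once per epoch and watching the free figure shrink would confirm whether the per-epoch image writes are what fills the device.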

I was wondering whether Tables allow updating only specific columns. The id and Image remain constant for the entire training; the only values whose evolution I am interested in are the predictions.

Has anyone encountered a similar problem? Any ideas on how to work around my lack of storage space would be welcome!

cayush · October 27, 2021, 11:45am

Hi @carloshernandezp, thanks for asking this question. I think you can prevent this by logging the table with the actual data once and then using a reference to the already logged data to build the table for each later iteration.
Here’s a snippet for more clarity.


import wandb

# define the main table
evalset_table = None

def log_new_table():
    global evalset_table
    # initialize new table
    table = ...  # ["image", "id"]
    for i, img in enumerate(loader):
        if evalset_table is None:
            # add images if evalset table isn't initialized
            table.add_data(wandb.Image(img), i)
        else:
            # use reference to evalset table if it is already logged
            table.add_data(evalset_table.data[i], i)

    # log this table as an artifact if evalset is not logged already
    if evalset_table is None:
        eval_art = wandb.Artifact(_run.id + table_name, type="dataset")
        eval_art.add(table, "evalset")
        _run.use_artifact(eval_art)
        evalset_table = eval_art.get("evalset")

Let me know if something isn’t clear.

carloshernandezp · October 27, 2021, 1:48pm

Hello @cayush, thanks for the quick response.

I understand how using the reference to the already logged data will solve my issue. But two questions come to my mind.

Firstly, in the line table.add_data(evalset_table.data[i], i), did you mean to type ...(evalset_table.data[i], img), implying that at position i you add the data of img?

Secondly, how is the data in evalset_table updated in wandb? So far I have been updating the data through a wandb.Table that I then log with wandb.run.log. Is it uploaded automatically when the values change through table.add_data(...)?

Here is a simplified version of what I am doing so far:

table = wandb.Table(columns=columns)  # columns are [id, Image, prediction]

for i, (id, img, pred) in enumerate(...):  # omitting what I'm iterating over for simplicity
    row = [id, img, pred]
    table.add_data(*row)

So, as far as I understand I should change the iteration to:

table = wandb.Table(columns=columns)  # columns are [id, Image, prediction]

for i, (id, img, pred) in enumerate(...):  # omitting what I'm iterating over for simplicity
    row = [id, img, pred]
    table.add_data(evalset_table.data[i], *row)

cayush · October 27, 2021, 3:02pm

@carloshernandezp yes, you’re right about point 1. I was just using that as an example.
For point 2, you’ll still need to log the tables using run.log({name: table}). It’s just that the tables that use a reference to other tables won’t upload the images again. Just add run.log({name: table}) at the end of the loop.
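
A rough sketch of the row-building step described here, written in plain Python so it runs without a W&B account. `rows_for_epoch` is a hypothetical helper, `logged_rows` stands in for `evalset_table.data`, and the column order [id, Image, prediction] is taken from the question; none of this is confirmed as the exact code either poster ended up with.

```python
def rows_for_epoch(logged_rows, predictions):
    """Build this epoch's rows, reusing the id and image cells already logged."""
    if len(logged_rows) != len(predictions):
        raise ValueError("expected one prediction per logged row")
    # Columns follow the thread: [id, Image, prediction].
    return [[row[0], row[1], pred] for row, pred in zip(logged_rows, predictions)]


# Stand-in for evalset_table.data; the image cell is a placeholder object here.
image_ref = object()
logged_rows = [["img-0", image_ref, 0]]
new_rows = rows_for_epoch(logged_rows, predictions=[1])
print(new_rows[0][0], new_rows[0][2], new_rows[0][1] is image_ref)
```

With W&B available, the result could be passed as `wandb.Table(columns=columns, data=new_rows)` and logged with `run.log`. The point of the sketch is only that the image cell is carried over by reference and never rebuilt from pixel data.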

carloshernandezp · October 27, 2021, 4:35pm

@cayush Gotcha, I think I am almost there:

So far my code is looking like this:


def log_new_table():
    global evalset_table
    # initialize new table
    table = wandb.Table(columns=columns)  # columns are [id, Image, prediction]
    for i, (id, img, pred) in enumerate(loader):
        row = [id, img, pred]
        if evalset_table is None:
            # add images if evalset table isn't initialized
            table.add_data(*row)
        else:
            # use reference to evalset table if it is already logged
            evalset_table.data[i] = row
            table.add_data(*evalset_table.data[i])  # Mark1

    # log this table as an artifact if evalset is not logged already
    if evalset_table is None:
        eval_art = wandb.Artifact(wandb.run.id + table_name, type="dataset")
        eval_art.add(table, "evalset")
        wandb.run.use_artifact(eval_art)
        eval_art.wait()  # Without this line the code broke
        evalset_table = eval_art.get("evalset")
        wandb.run.log({'Evaluation table': evalset_table})  # Mark2

The change at Mark1 compared to your snippet is due to the size of the table: I needed to pass something of the same length as the columns. Also, at Mark2 I logged the evalset_table, but I am not sure this is needed.

I have run some training with logging for 30 minutes, and I have not had a space issue. However, I do see that every few epochs some errors pop up regarding the /tmp files, such as:

Traceback (most recent call last):
  File "/mnt/gpid07/imatge/carlos.hernandez/Documents/base/lib/python3.6/weakref.py", line 548, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File "/mnt/gpid07/imatge/carlos.hernandez/Documents/base/lib/python3.6/tempfile.py", line 938, in _cleanup
    _rmtree(name)
  File "/mnt/gpid07/imatge/carlos.hernandez/Documents/base/lib/python3.6/shutil.py", line 477, in rmtree
    onerror(os.lstat, path, sys.exc_info())
  File "/mnt/gpid07/imatge/carlos.hernandez/Documents/base/lib/python3.6/shutil.py", line 475, in rmtree
    orig_st = os.lstat(path)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmprh45rrv6'
Exception ignored in: <finalize object at 0x7f77a6a096d0; dead>

It does not stop execution, so I don’t expect to solve it on this thread, but it baffled me as I had never seen this error.
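
Reading the traceback: a finalizer for a temporary directory called `shutil.rmtree` on `/tmp/tmprh45rrv6` after that directory was already gone. What removed it first is not established in this thread. A minimal standard-library sketch that recreates the situation, assuming nothing about W&B internals (`cleanup_after_external_delete` is a hypothetical name):

```python
import os
import shutil
import tempfile


def cleanup_after_external_delete():
    """Delete a temporary directory behind tempfile's back, then clean up."""
    tmp = tempfile.TemporaryDirectory()
    path = tmp.name
    shutil.rmtree(path)  # simulate something else removing the directory first
    try:
        tmp.cleanup()
        outcome = "cleanup ignored the missing directory"
    except FileNotFoundError:
        outcome = "cleanup raised FileNotFoundError"
    return path, outcome


if __name__ == "__main__":
    path, outcome = cleanup_after_external_delete()
    print(outcome, "| still exists:", os.path.exists(path))
```

Python 3.6, the version in the traceback, lets the error surface; newer interpreters may swallow it, so the function reports whichever case it hits. Either way the directory is already gone, which fits the observation that training keeps running.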

cayush · October 27, 2021, 4:37pm

carloshernandezp:

    # log this table as an artifact if evalset is not logged already
    if evalset_table is None:
        eval_art = wandb.Artifact(wandb.run.id + table_name, type="dataset")
        eval_art.add(table, "evalset")
        wandb.run.use_artifact(eval_art)
        eval_art.wait()  # Without this line the code broke
        evalset_table = eval_art.get("evalset")
        wandb.run.log({'Evaluation table': evalset_table})  # Mark2

You’ll need to change the last line. You don’t need to call .log on evalset_table if you’ve already called use_artifact. Just call .log on the new table outside the scope of the if statement:


    if evalset_table is None:
        eval_art = wandb.Artifact(wandb.run.id + table_name, type="dataset")
        eval_art.add(table, "evalset")
        wandb.run.use_artifact(eval_art)
        eval_art.wait()  # Without this line the code broke
        evalset_table = eval_art.get("evalset")
    wandb.run.log({'Evaluation table': table})  # Mark2
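
The control flow of that fix, registering the image rows once and logging a table every epoch, can be sketched without W&B. `EvalTableLogger`, `register_rows`, and `log_rows` are hypothetical names; the two callables stand in for the artifact registration and for `wandb.run.log`, and are not part of any library.

```python
class EvalTableLogger:
    """Register the image rows once, then log a table every epoch."""

    def __init__(self, register_rows, log_rows):
        # Both callables are hypothetical stand-ins for the W&B calls:
        #   register_rows(rows) -> rows as stored in the logged artifact
        #   log_rows(rows)      -> e.g. a wrapper around wandb.run.log
        self._register_rows = register_rows
        self._log_rows = log_rows
        self._logged_rows = None

    def log_epoch(self, rows):
        if self._logged_rows is None:
            # First epoch only: register the rows that carry the images.
            self._logged_rows = self._register_rows(rows)
        # Every epoch: log the table built for this epoch.
        self._log_rows(rows)


registered, logged = [], []
logger = EvalTableLogger(
    register_rows=lambda rows: registered.append(rows) or rows,
    log_rows=logged.append,
)
for epoch in range(3):
    logger.log_epoch([["id-0", "image-0", epoch]])

print(len(registered), len(logged))
```

After three epochs the registration has run once and the logging three times, which mirrors moving the `.log` call outside the `if` block.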

carloshernandezp · October 27, 2021, 4:42pm

I forgot to add the wandb.run.log({'Evaluation table' : table}) # Mark2 line on my last reply.

As far as I understand, the solution comes from adjusting the information inside evalset_table as it is linked to our table by means of adding it to the artifact.

Thank you very much for the time you took to solve my problem.

cayush · October 27, 2021, 4:45pm

Sure no problem :). Were you able to solve this?

carloshernandezp · October 27, 2021, 4:47pm

I think so!

I have left some CNN training running for a while to see if it breaks at some point while logging. Normally it stopped sooner than the current run has lasted, so I am confident it is solved.

cayush · October 27, 2021, 4:51pm

Awesome. If this is a public project, I’d love to see the dashboard.

system · April 20, 2022, 6:02pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
