Some runs are not logging my images when re-running the same code

Hey there, I encounter a problem on a regular basis that I could not yet debug successfully. Sometimes, with no regularity, my images are not logged to the wandb cloud even though they are created and saved locally in the wandb/media/images directory. All other metrics are working as expected, it is just the images!

I have made sure that they really are not logged in the cloud. I tried to re-add a media image panel to my charts, didn’t work. I looked into files in the wandb dashboard, nothing shows up except config files.

If I re-run the same code, it is like a 90%/10% chance that it will work. I suspected maybe it’s happening when I am starting two runs on the same GPU? Maybe it’s happening because of the way I call the wandb.log? Not sure.

I have quite a complex setup, which prevents me from posting a minimal working example right now. This is the code I log the images with, though:

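# Imports assumed from the calls below; get_rank() is defined elsewhere and not shown.
from typing import List

import torch
import wandb
from torch import Tensor
from torchvision.transforms import Resize
from torchvision.utils import make_grid


def log_all_images(images: List[Tensor], log_key="validation", caption="Captions not set"):
    """
    Logs all images of a list as grids to wandb.

    Args:
        - images (List[Tensor]): List of images to log
        - log_key (str): key for wandb logging
        - caption (str): caption for the images
    """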
    if get_rank() != 0:
        return

    assert len(images) > 0, "No images to log"

    common_size = images[0].shape[-2:]
    resizer = Resize(common_size, antialias=True)

    image_result = make_grid(images[0], nrow=4, padding=5, pad_value=0.2)
    for image in images[1:]:
        image_result = torch.concat((image_result, make_grid(resizer(image), nrow=4, padding=5, pad_value=0.2)), dim=-1)

    wandb.log({log_key: wandb.Image(image_result, caption=caption)})

I use the wandb logger with PyTorch Lightning but occasionally call wandb.log directly instead of self.log_dict (in the pl.LightningModule).

I cannot really ask for a solution to this, because I am probably not providing you with enough information. I can paste the code snippets here if you want, but the underlying issue is that the SAME code produces different logging outcomes when run multiple times. So what I would like help with is whether there are any gotchas, or whether this has happened to someone before. What should I look out for in this case?

Also, why are these images locally saved but not pushed to the cloud? Does not make sense to me.

Hi @mfeuer, here’s a couple things I would look into.

  1. Even once the main training script exits, the backend wandb process may still be uploading images. Is it possible that this training environment gets shut down, killing this upload process?
  2. Related to the above, are some images there for a run or none at all?
  3. Have you looked through the logs/debug-internal.log file from the local run folder? I’m happy to take a look if you want to post it here.
  4. It looks like you might be using DDP or some other distributed training framework. Are you starting a run on each rank or only on the rank 0 process?

Hey @nathank,

  1. This should not happen. Even when I am stopping the training (ctrl + c), the wandb process cleans up and finishes correctly.
  2. none at all…
  3. Interestingly, for the latest run where this happened there is no log folder. There is just the files directory in the run directory… Edit: the files directory also contains only images, nothing more.
  4. it is in the code but I currently do not actively use this. It is just implemented for when I scale the model in the future

I have two parallel runs currently. I looked into their folder structure and noticed something that is different, maybe this helps debugging.

The run that is correctly logging images has links for logs and latest run. The run not logging any images does not have anything of the sort.

They are both at the time of the screenshot actively training. Is this normal?
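For comparing two run folders without a screenshot, here is a minimal sketch that reports which entries exist under a run directory. The entry names (files, files/media/images, logs, logs/debug-internal.log) are assumptions taken from the paths quoted in this thread, and inspect_run_dir and count_images are hypothetical helper names, not part of wandb.

```python
from pathlib import Path

# Entries assumed from the paths quoted in this thread; adjust for your setup.
EXPECTED = ("files", "files/media/images", "logs", "logs/debug-internal.log")


def inspect_run_dir(run_dir, expected=EXPECTED):
    """Return {relative_path: exists} for each expected entry under run_dir."""
    root = Path(run_dir)
    return {rel: (root / rel).exists() for rel in expected}


def count_images(run_dir):
    """Count PNG files written locally under files/media/images."""
    media = Path(run_dir) / "files" / "media" / "images"
    if not media.is_dir():
        return 0
    return sum(1 for _ in media.glob("*.png"))
```

Calling inspect_run_dir on the working run folder and on the failing one, then comparing the two dictionaries, shows at a glance whether the logs folder is the only difference. count_images confirms how many images were written locally.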

Hey @nathank,

here is some more information I just collected. When stopping the training manually I get this output:

wandb: Waiting for W&B process to finish... (success).
wandb:
wandb: Run history:
wandb:                      epoch ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                lr-Adam/pg1 ▁
wandb:       lr-Adam/pg1-momentum ▁
wandb:                lr-Adam/pg2 ▁
wandb:       lr-Adam/pg2-momentum ▁
wandb:        pyramid_loss_step_1 ▇█▄█▅▄▄▄▄▄▆▄▃▆▄▇▃▅▅▃▄▆▄▃▄▃▂▃▃▄▃▅▄▂▁▃▁▄▆▅
wandb:        pyramid_loss_step_2 ██▄█▆▄▄▄▄▄▆▄▃▅▄▇▃▅▄▃▄▆▄▃▄▄▂▂▃▄▄▄▄▂▁▃▁▃▅▄
wandb:        pyramid_loss_step_3 ██▄█▅▅▄▄▃▄▅▄▂▅▄▇▄▅▄▃▄▆▄▃▃▄▃▂▃▃▄▄▃▂▁▃▁▃▄▄
wandb:        pyramid_loss_step_4 ▇█▄█▄▅▄▄▃▄▄▄▂▄▃▆▄▅▃▃▃▆▃▄▃▄▃▂▃▂▄▄▂▂▁▄▂▂▂▃
wandb:                 train_loss ▄▆▇▇█▅▆▅▄▅▆█▆▃▆▅▄▅▇▆▅▅▄▅▄▅▅▅▁▃▃▃▄▄▄▄▄▄▆▅
wandb:          train_recons_loss ▄▆▇▇█▅▆▅▄▅▆█▆▃▆▅▄▅▇▆▅▅▄▅▄▅▅▅▁▃▃▃▄▄▄▄▄▄▆▅
wandb: train_stop_prediction_loss ▅▅▅▄▆▆▇▄▂▄▄▄▄▁█▆▅▄▄▂▆▅▃▆▆▄▅▃▄▅▂▄▄▄▃▆▃▆▂▆
wandb:        trainer/global_step ▁▁▁▁▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▄▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
wandb:
wandb: Run summary:
wandb:                      epoch 0
wandb:                lr-Adam/pg1 0.002
wandb:       lr-Adam/pg1-momentum 0.9
wandb:                lr-Adam/pg2 1e-05
wandb:       lr-Adam/pg2-momentum 0.9
wandb:        pyramid_loss_step_1 0.03063
wandb:        pyramid_loss_step_2 0.02753
wandb:        pyramid_loss_step_3 0.01551
wandb:        pyramid_loss_step_4 0.006
wandb:                 train_loss 0.10062
wandb:          train_recons_loss 0.10063
wandb: train_stop_prediction_loss 0.09942
wandb:        trainer/global_step 309
wandb:
wandb: 🚀 View run VectorGPTv2, line primitive test at: https://wandb.ai/test/test/runs/xeijtu7j
wandb: ️⚡ View job at https://wandb.ai/test/test/jobs/QXJ0aWZhY3RDb2xsZWN0aW9uOjc2OTc3MTA0/version_details/v167
wandb: Synced 6 W&B file(s), 0 media file(s), 2 artifact file(s) and 0 other file(s)
wandb: Find logs at: /tmp/wandb/run-20231026_102254-xeijtu7j/logs

So there I finally found out where the log file is located. You can also see that 0 media files have been synced.
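To check this automatically across many runs, the final "Synced ..." line can be parsed from captured console output. This is only a sketch: the pattern is written against the exact wording of the line shown above and may not match other wandb versions, and parse_sync_summary is a hypothetical helper name.

```python
import re

# Pattern written against the summary line quoted in this thread.
SYNC_RE = re.compile(
    r"Synced (\d+) W&B file\(s\), (\d+) media file\(s\), "
    r"(\d+) artifact file\(s\) and (\d+) other file\(s\)"
)


def parse_sync_summary(console_output):
    """Return the counts from the 'Synced ...' line, or None if it is absent."""
    match = SYNC_RE.search(console_output)
    if match is None:
        return None
    wandb_files, media, artifacts, other = map(int, match.groups())
    return {
        "wandb": wandb_files,
        "media": media,
        "artifacts": artifacts,
        "other": other,
    }
```

A run whose parsed "media" count is 0 even though images were logged is one of the affected runs.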

Here are the first few lines of the debug-internal.log file (I cannot upload the whole file due to length limits):

2023-10-26 10:27:43,765 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:43,988 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/wandb-summary.json
2023-10-26 10:27:44,239 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:44,239 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:27:44,240 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/pyramid loss_187_d1781b841a5da6457297.png with policy now
2023-10-26 10:27:44,240 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:44,240 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:44,240 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:44,243 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:44,246 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:44,246 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:44,246 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:44,267 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:44,267 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:27:44,268 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/input (left) vs. prediction (right)_189_78beb79ea5d1bf313161.png with policy now
2023-10-26 10:27:44,268 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:44,268 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:44,268 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:44,290 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:44,290 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:27:44,290 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/training predictions_190_6e1aaef6e1d9d230d1da.png with policy now
2023-10-26 10:27:44,291 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:44,291 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:44,291 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:44,988 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/wandb-summary.json
2023-10-26 10:27:44,988 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:27:46,250 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: status_report
2023-10-26 10:27:46,989 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:27:48,989 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:27:50,990 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:27:51,207 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:51,207 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:51,207 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:51,208 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:51,760 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:51,761 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:27:51,761 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/pyramid loss_192_5559aac4765096d41b17.png with policy now
2023-10-26 10:27:51,761 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:51,761 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: status_report
2023-10-26 10:27:51,762 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:51,762 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:51,765 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:51,765 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:51,765 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:51,765 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:51,789 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:27:51,789 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/input (left) vs. prediction (right)_194_d5f2fc443666544e531c.png with policy now
2023-10-26 10:27:51,789 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:51,790 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:51,790 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:51,790 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:51,812 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:51,812 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:27:51,813 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/training predictions_195_b9420d2d7481b785b0e6.png with policy now
2023-10-26 10:27:51,813 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:51,813 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:51,813 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:51,990 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/wandb-summary.json
2023-10-26 10:27:52,991 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:27:54,973 DEBUG   SenderThread:34180 [sender.py:send():380] send: stats
2023-10-26 10:27:54,991 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:27:56,992 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:27:57,174 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: status_report
2023-10-26 10:27:57,435 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: stop_status
2023-10-26 10:27:57,436 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: stop_status
2023-10-26 10:27:57,439 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: internal_messages
2023-10-26 10:27:58,696 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:58,697 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:58,697 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:58,697 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:58,993 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/wandb-summary.json
2023-10-26 10:27:58,993 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:27:59,199 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:27:59,199 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:59,199 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/pyramid loss_197_5c3403ac81beaa635da0.png with policy now
2023-10-26 10:27:59,200 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:59,200 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:59,200 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:59,202 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:59,203 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:59,203 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:59,203 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:59,226 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:59,226 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:27:59,227 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/input (left) vs. prediction (right)_199_6e58ac147cc77021ee27.png with policy now
2023-10-26 10:27:59,227 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:59,227 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:59,227 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:59,250 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:27:59,251 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:27:59,251 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/training predictions_200_52b5cf0f06e13b73ff44.png with policy now
2023-10-26 10:27:59,252 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:27:59,252 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:27:59,252 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:27:59,993 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/wandb-summary.json
2023-10-26 10:28:00,993 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:28:02,654 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: status_report
2023-10-26 10:28:02,994 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:28:04,995 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:28:06,051 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:06,052 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:06,052 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:06,052 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:06,610 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:06,610 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:28:06,610 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/pyramid loss_202_9e01e40a2f053f5a77c3.png with policy now
2023-10-26 10:28:06,610 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:06,611 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:06,611 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:06,614 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:06,615 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:06,615 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:06,615 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:06,638 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:28:06,638 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:06,638 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/input (left) vs. prediction (right)_204_e8c93907656f7f2a84cb.png with policy now
2023-10-26 10:28:06,638 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:06,639 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:06,639 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:06,662 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:06,662 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:28:06,663 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/training predictions_205_8959a236b2c37c76c077.png with policy now
2023-10-26 10:28:06,663 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:06,663 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:06,663 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:06,995 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/wandb-summary.json
2023-10-26 10:28:06,996 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:28:08,630 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: status_report
2023-10-26 10:28:08,996 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:28:10,996 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:28:12,435 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: stop_status
2023-10-26 10:28:12,436 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: stop_status
2023-10-26 10:28:12,439 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: internal_messages
2023-10-26 10:28:12,997 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:28:13,480 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:13,481 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:13,481 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:13,481 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:13,980 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:13,981 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:13,981 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:28:13,981 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/pyramid loss_207_c9d18ccb33ecffa8c706.png with policy now
2023-10-26 10:28:13,982 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:13,982 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: status_report
2023-10-26 10:28:13,982 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:13,982 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:13,983 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:13,983 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:13,983 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:13,997 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/wandb-summary.json
2023-10-26 10:28:14,005 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:14,005 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:28:14,006 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/input (left) vs. prediction (right)_209_9e075fcc4d2d3ffae820.png with policy now
2023-10-26 10:28:14,006 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:14,006 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:14,006 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:14,029 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:28:14,029 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:14,029 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/training predictions_210_99cce92ce83e46248cc8.png with policy now
2023-10-26 10:28:14,037 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:14,037 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:14,038 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:14,998 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/wandb-summary.json
2023-10-26 10:28:14,998 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:28:16,998 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:28:18,999 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:28:19,040 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: status_report
2023-10-26 10:28:20,952 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:20,952 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:20,952 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:20,953 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:21,000 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/wandb-summary.json
2023-10-26 10:28:21,000 INFO    Thread-12 :34180 [dir_watcher.py:_on_file_modified():288] file/dir modified: /tmp/wandb/run-20231026_102254-xeijtu7j/files/output.log
2023-10-26 10:28:21,480 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:21,481 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:28:21,481 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/pyramid loss_212_b0aa2429ca1956e41545.png with policy now
2023-10-26 10:28:21,481 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:21,481 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:21,482 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:21,484 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:21,485 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:21,485 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:21,485 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:21,508 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:28:21,508 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:21,508 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/input (left) vs. prediction (right)_214_2d310104c22039d31c13.png with policy now
2023-10-26 10:28:21,509 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:21,509 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:21,509 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end
2023-10-26 10:28:21,532 DEBUG   SenderThread:34180 [sender.py:send():380] send: files
2023-10-26 10:28:21,532 DEBUG   HandlerThread:34180 [handler.py:handle_request():144] handle_request: partial_history
2023-10-26 10:28:21,532 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file media/images/training predictions_215_1efe15ca5a40885240b5.png with policy now
2023-10-26 10:28:21,533 DEBUG   SenderThread:34180 [sender.py:send():380] send: history
2023-10-26 10:28:21,533 DEBUG   SenderThread:34180 [sender.py:send_request():407] send_request: summary_record
2023-10-26 10:28:21,534 INFO    SenderThread:34180 [sender.py:_save_file():1378] saving file wandb-summary.json with policy end

In the debug log you can see that there are .png files saved.
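To put a number on that, the "saving file media/images/..." entries can be counted per log key and compared with the "0 media file(s)" in the console summary. This is a rough sketch that assumes the file names follow the "<key>_<step>_<hash>.png" shape seen in the excerpt above. image_saves and saves_per_key are hypothetical helper names.

```python
import re
from collections import Counter

# Matches sender entries such as:
# "saving file media/images/pyramid loss_187_d1781b841a5da6457297.png with policy now"
SAVE_RE = re.compile(r"saving file (media/images/.+?\.png) with policy (\w+)")


def image_saves(log_text):
    """Return a list of (path, policy) tuples for image files in the log."""
    return SAVE_RE.findall(log_text)


def saves_per_key(log_text):
    """Count image save entries per log key."""
    counts = Counter()
    for path, _policy in image_saves(log_text):
        # File names look like "<key>_<step>_<hash>.png"; drop the last two fields.
        name = path[len("media/images/"):-len(".png")]
        key = name.rsplit("_", 2)[0]
        counts[key] += 1
    return counts
```

If saves_per_key reports many entries while the console summary shows 0 synced media files, that suggests the images reach the sender but the upload step is where they get lost.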
