Adding tfRecords files to artifacts doesn't work?

Hello,

I am trying to log my TFRecord files to an artifact, but it does not seem to be working (I get an error: “wandb: Network error (TransientError), entering retry loop.”).

I am providing the code I use below. I am fairly sure the problem is related to the TFRecord files, since when I changed the contents of my folders to contain only .csv and .parquet files, it worked fine. Do you have any idea what could be happening here?

import os
import wandb

# final_path_train: path to the folder holding the training files (defined elsewhere)
with wandb.init(project="----", entity="----", job_type="saving_processed_files") as run:
    train_data_art = wandb.Artifact(
        name="train_data",
        type="train_data",
    )

    # Skip hidden files (names starting with a dot)
    files_train = [x for x in os.listdir(final_path_train) if not x.startswith(".")]

    # Add each file to the artifact under its own file name
    for file in files_train:
        file_path = os.path.join(final_path_train, file)
        train_data_art.add_file(file_path, name=file)

    run.log_artifact(train_data_art)

Hi @drlje,

Is this error still popping up with TFRecord files? TransientErrors are usually minor network issues that resolve automatically after a while.

Please let me know if this error persists, and I will dig further into what might be happening here.

Thanks,
Ramit

Hi @drlje,

We wanted to follow up with you regarding this issue as we have not heard back from you. Please let us know if we can be of further assistance or if your issue has been resolved.

Thanks,
Ramit

Hi @drlje,

Since we have not heard back from you, I am closing out this request. If you would like to re-open this conversation, please let us know!

Thanks,
Ramit

Hello @ramit_goolry ,

Sorry for not being prompt. I tried again and got the same error. However, I also tried doing this on a small fraction of the data (also saved as TFRecord files) and it went through. So I am guessing this has something to do with the size - the total size of my files is around 10 GB. Do you think that could be the issue?

Thanks a lot!

Marin

Hi @ramit_goolry,

Do you have any feedback regarding the size limit for uploaded files? We are thinking of upgrading our account, and this issue is really important for us.

Thanks!

Marin

Hi,

Could you please take another look at this issue? I left an additional comment a while ago, and in a couple of days we will face this issue again, so I would love to get to the bottom of it.

https://community.wandb.ai/t/adding-tfrecords-files-to-artifacts-doesnt-work/1948/6

Thanks!

Marin

Hi @drlje,

I’m sorry for not responding here sooner, and sorry to hear you are still facing this issue. I will definitely assist you here. You said you are seeing the error with a large amount of data: could you share a little more information about the structure of this data and the behavior you see? More specifically:

  • How many files do you have?
  • Are you seeing a lot of time delay before these errors show up?
  • Could you try uploading this same scale of information but using some other file format? (like a set of .txt files)
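
To help answer the first question, here is a minimal sketch for counting the files and their total size. The helper name summarize_files is hypothetical (it is not part of wandb), and the sketch assumes the files sit directly in one folder, as in the snippet from the original post:

```python
import os


def summarize_files(directory):
    """Return (file_count, total_bytes) for the non-hidden regular files in directory.

    Hypothetical helper for illustration only; it does not call wandb.
    """
    count = 0
    total_bytes = 0
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # Skip hidden files and subdirectories, mirroring the filter in the upload code
        if name.startswith(".") or not os.path.isfile(path):
            continue
        count += 1
        total_bytes += os.path.getsize(path)
    return count, total_bytes


# Example call (final_path_train is the folder from the original post):
# count, total_bytes = summarize_files(final_path_train)
# print(f"{count} files, {total_bytes / 1e9:.2f} GB")
```

Running this on both the full folder and the small subset that uploaded successfully would make the size difference between the two cases concrete.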

Additionally, the debug.log and debug-internal.log files associated with the run where you are facing this issue would be highly appreciated, since they would give us more visibility into what is happening.
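
If it helps to locate those files, the sketch below searches for them on disk. It assumes the default layout, where each run writes its logs to a logs folder inside a run-... directory under ./wandb; if you have changed the wandb directory, pass that path instead. The function name find_debug_logs is hypothetical:

```python
import glob
import os


def find_debug_logs(wandb_dir="wandb"):
    """Return sorted paths of debug*.log files under the run folders in wandb_dir.

    Assumes the layout <wandb_dir>/<run folder>/logs/debug*.log; adjust the
    pattern if your setup stores logs elsewhere.
    """
    pattern = os.path.join(wandb_dir, "*run-*", "logs", "debug*.log")
    return sorted(glob.glob(pattern))


# Example call, run from the directory where the training script was launched:
# for path in find_debug_logs():
#     print(path)
```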

Thanks,
Ramit

Hi @ramit_goolry,

When I tried to reproduce the issue today, the artifact was saved without any problems, so I guess the issue can be closed. If we experience the same problem again, I will follow the steps above and supply you with the log files.

Many thanks!

Best,

Marin