fastai's WandbCallback crashes the Colab session

I am trying to fine-tune the ULMFiT language model using the fastai library.

For logging, I want to use wandb.

To integrate wandb with the learner, I used WandbCallback.

However, it always crashes the Colab session at the end of a training epoch.

Initially, I thought the session was crashing because of a memory issue, but when I trained the same model on the same data without the wandb callback, it ran successfully.

Below is the code that I have used.

from fastai.text.all import *

files = get_text_files('digital_marketing_data')
# digital_marketing_data is a folder that contains text files.

# Here's how we use TextBlock to create a language model, using fastai's defaults:

get_db = partial(get_text_files)

dls_lm = DataBlock(
    blocks=TextBlock.from_folder('digital_marketing_data', is_lm=True),
    get_items=get_db, splitter=RandomSplitter(0.1)
).dataloaders('digital_marketing_data', path='digital_marketing_data', bs=64//2, seq_len=100)

import wandb
from fastai.callback.wandb import *
import os

wandb.login()

# Initializing a wandb run
wandb.init(project='ulmfit_digital_marketing_finetune', name='Default Param with Minimum LR')

# Model
cp_name = 'model_with_minimum_lr'
learn = language_model_learner(
    dls_lm, AWD_LSTM, drop_mult=0.3,
    cbs=[
        GradientAccumulation(n_acc=64),
        WandbCallback(),
        SaveModelCallback(fname=cp_name, every_epoch=True, with_opt=True),
    ],
    metrics=[accuracy, Perplexity()]).to_fp16()

When I run the same code without WandbCallback(), it completes successfully, but with WandbCallback() it always crashes the Colab session.

Hi @sumit-wnb ,

We will take a look at this for you. Please provide a link to the workspace where you are experiencing this crash, along with the debug logs for the run that is crashing. The logs can be found in the wandb folder in Colab, inside the folder that shares the run's name.
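For anyone looking for those logs, here is a minimal sketch that lists the debug log files under the wandb folder. It assumes the layout is wandb/<run folder>/logs/debug*.log relative to the notebook's working directory; the helper name find_wandb_debug_logs is hypothetical and not part of wandb.

```python
from pathlib import Path


def find_wandb_debug_logs(root="wandb"):
    """Return debug log files found under the wandb run folders, sorted by path."""
    root_path = Path(root)
    if not root_path.is_dir():
        # No wandb folder here, so there is nothing to report.
        return []
    # Assumed layout: <root>/<run folder>/logs/debug*.log
    return sorted(root_path.glob("*/logs/debug*.log"))


# Print every debug log found relative to the current working directory.
for log_file in find_wandb_debug_logs():
    print(log_file)
```

If the list is empty, the notebook may be running from a different working directory than the one where wandb.init() was called.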

Thanks,
Mohammad

Hi,

Thanks for the reply.
I have since found a solution that works.

Passing log_preds=False to the callback, i.e. WandbCallback(log_preds=False), solves this issue.
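In case it helps others, here is a sketch of the callback list from my original code with that change applied. The names WANDB_KWARGS and build_callbacks are only illustrative; the callbacks and their arguments are the ones from the code above, and the imports sit inside the function so that fastai is only needed when it is called.

```python
# Keyword arguments for WandbCallback; log_preds=False is the fix reported above.
WANDB_KWARGS = {"log_preds": False}


def build_callbacks(cp_name="model_with_minimum_lr"):
    """Build the callback list from the question, with prediction logging disabled."""
    # Imported lazily: requires fastai and wandb to be installed.
    from fastai.text.all import GradientAccumulation, SaveModelCallback
    from fastai.callback.wandb import WandbCallback

    return [
        GradientAccumulation(n_acc=64),
        WandbCallback(**WANDB_KWARGS),
        SaveModelCallback(fname=cp_name, every_epoch=True, with_opt=True),
    ]


# Usage, after wandb.init(...) and with dls_lm built as in the question:
# learn = language_model_learner(
#     dls_lm, AWD_LSTM, drop_mult=0.3, cbs=build_callbacks(),
#     metrics=[accuracy, Perplexity()]).to_fp16()
```

I have not confirmed why logging predictions crashes the session; I only observed that training completes once it is turned off.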

Hi @sumit-wnb , thank you for letting us know that you were able to resolve this issue. We will mark this as closed.

Regards,
Mohammad
