Hi everyone, I am using wandb with Hugging Face in an AWS SageMaker notebook, and I am following the tutorial here: Hugging Face Transformers | Weights & Biases Documentation.
I tried to set the WANDB_PROJECT environment variable before setting up the huggingface_estimator, which will call train.py. train.py is where I initialize the Trainer. The tutorial says to set the project name before initializing the Trainer, and I think I am doing this correctly here.
Here are some useful snippets of my code.
import wandb
wandb.login()
WANDB_PROJECT=my_project_name
...
huggingface_estimator = HuggingFace(
    image_uri=image_uri,
    entry_point='train.py',
    source_dir='./scripts',
    instance_type='ml.g4dn.xlarge',
    instance_count=1,
    role=role,
    py_version='py39',
    hyperparameters=hyperparameters,
)
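One thing that may be worth checking in the notebook snippet above: a plain Python assignment such as WANDB_PROJECT=my_project_name creates a local variable and does not change the process environment. The sketch below illustrates the difference using only the standard library. The project name and the helper build_training_environment are hypothetical, and passing the resulting dictionary to the estimator through an environment argument is an assumption to verify against the SageMaker SDK version in use.

```python
import os

# Hypothetical project name, used only for illustration.
project_name = "my_project_name"

# Start from a clean state so the demonstration is repeatable.
os.environ.pop("WANDB_PROJECT", None)

# A plain assignment only creates a Python variable; libraries that read
# os.environ will not see it.
WANDB_PROJECT = project_name
visible_after_assignment = "WANDB_PROJECT" in os.environ

# Writing to os.environ sets the variable for this (notebook) process.
os.environ["WANDB_PROJECT"] = project_name
visible_after_environ = os.environ.get("WANDB_PROJECT")


def build_training_environment(names=("WANDB_PROJECT",)):
    """Collect the named variables that are set in this process."""
    return {name: os.environ[name] for name in names if name in os.environ}


# The training job runs in a separate container, so variables set in the
# notebook are not inherited automatically. Assumption: this dictionary
# could be handed to the estimator, e.g. HuggingFace(..., environment=...).
training_environment = build_training_environment()
print(visible_after_assignment, visible_after_environ, training_environment)
```

Even with os.environ set in the notebook, train.py executes in a different process on the training instance, so the value still has to reach that container in some way.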
train.py
training_args = TrainingArguments(
    output_dir=args.output_dir,
    per_device_train_batch_size=args.per_device_train_batch_size,
    num_train_epochs=args.epochs,
    learning_rate=args.learning_rate,
    save_strategy="epoch",
    logging_strategy='epoch',
    report_to="wandb",
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    data_collator=collate_fn,
    tokenizer=image_processor,
)
trainer.train()
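As a point of comparison for train.py, here is a minimal sketch of setting the project name inside the training script itself, before the Trainer is created. It assumes that the Trainer's W&B integration reads WANDB_PROJECT from the environment when the run starts, which matches the tutorial's instruction. The helper ensure_wandb_project and the default name are hypothetical.

```python
import os


def ensure_wandb_project(default_project):
    """Return the W&B project name, exporting the default if none is set."""
    # setdefault keeps any value that was already passed into the container
    # and only falls back to the default when the variable is missing.
    return os.environ.setdefault("WANDB_PROJECT", default_project)


# Call this near the top of train.py, before TrainingArguments and Trainer.
project = ensure_wandb_project("my_project_name")
print(f"W&B project for this run: {project}")
```

Printing the resolved name also leaves a line in the training job logs, which makes it easier to confirm which value the script actually saw.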
I would greatly appreciate any guidance or advice on how to resolve this issue. Thank you very much in advance for your help!