Hi, Luis. I don’t know how I missed this message. Many thanks for being willing to help.
I fixed the invalid-float-value issue by changing the YAML file: instead of listing all the flags in the command section, I now use ${args_no_hyphens}.
command:
  - ${env}
  - python
  - run_ner_sw2.py
  - ${args_no_hyphens}
BUT I do have a similar issue with the argument “--output_dir”.
My sweep.yaml contains:
...
parameters:
  ...
  output_dir:
    values: ["output"]
command:
  - ${env}
  - python
  - run_ner_sw2.py
  - ${args_no_hyphens}
The error message is:
2024-03-12 16:10:19,729 - wandb.wandb_agent - INFO - Agent starting run with config:
  datadir: fullData
  do_eval: True
  do_lower_case: False
  do_predict: True
  do_train: True
  evaluation_strategy: epoch
  label_all_tokens: True
  label_column_name: labels
  learning_rate: 0.004748553941502207
  load_best_model_at_end: True
  logging_strategy: epoch
  max_seq_length: 128
  model_name_or_path: microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext
  num_train_epochs: 400
  output_dir: output
  pad_to_max_length: False
  per_device_eval_batch_size: 32
  per_device_train_batch_size: 32
  save_strategy: steps
  seed: 42
  test_file: temp
  text_column_name: words
  train_file: temp
  validation_file: temp
  weight_decay: 0.013055535954381949
2024-03-12 16:10:19,745 - wandb.wandb_agent - INFO - About to run command: /usr/bin/env python run_ner_sw1.py datadir=fullData do_eval=True do_lower_case=False do_predict=True do_train=True evaluation_strategy=epoch label_all_tokens=True label_column_name=labels learning_rate=0.004748553941502207 load_best_model_at_end=True logging_strategy=epoch max_seq_length=128 model_name_or_path=microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext num_train_epochs=400 output_dir=output pad_to_max_length=False per_device_eval_batch_size=32 per_device_train_batch_size=32 save_strategy=steps seed=42 test_file=temp text_column_name=words train_file=temp validation_file=temp weight_decay=0.013055535954381949
run_ner_sw1.py: error: the following arguments are required: --output_dir
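If I understand the macro correctly, ${args_no_hyphens} passes every sweep parameter as key=value with no leading --, so output_dir arrives as output_dir=output. Since output_dir is the one TrainingArguments field without a default, HfArgumentParser (which builds on argparse) still insists on seeing an actual --output_dir flag. A tiny reproduction outside my script (just for illustration, not the real parser) shows the same failure:

import argparse

# Stand-in for the parser HfArgumentParser builds: output_dir has no default,
# so it is registered as a required "--output_dir" option.
parser = argparse.ArgumentParser(prog="run_ner_sw1.py")
parser.add_argument("--output_dir", required=True)

# The hyphenated form is accepted:
print(parser.parse_known_args(["--output_dir", "output"]))

# The hyphen-less form produced by ${args_no_hyphens} is treated as a stray
# positional token, so this call exits with:
#   run_ner_sw1.py: error: the following arguments are required: --output_dir
parser.parse_known_args(["output_dir=output"])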
I tried two ways to parse the args in the training script:
Way 1: update automatically from wandb.config:
parser = HfArgumentParser((ModelArguments, DataTrainingArguments, TrainingArguments))
if len(sys.argv) == 2 and sys.argv[1].endswith(".json"):
    model_args, data_args, training_args = parser.parse_json_file(json_file=os.path.abspath(sys.argv[1]))
else:
    model_args, data_args, training_args = parser.parse_args_into_dataclasses()
training_args = TrainingArguments(
    output_dir="output/",
    report_to="wandb")
config = wandb.config
for arg in [model_args, data_args, training_args]:
    for key, value in vars(arg).items():
        if hasattr(config, key):
            setattr(arg, key, getattr(config, key))
config_updater = ConfigUpdater(model_args, data_args, training_args)
config_updater.update_from_wandb()
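One workaround I'm considering for Way 1 (not sure it is the intended solution; the placeholder value and the per-run path are just illustrative) is to hand the parser a dummy --output_dir so the required-argument check passes, let it return the hyphen-less key=value tokens instead of raising on them, and then set the real directory from wandb.config afterwards:

# Sketch only, assuming wandb.init() has already run under the agent:
# prepend a placeholder --output_dir so HfArgumentParser does not abort,
# collect the leftover key=value tokens, then set the real per-run directory.
model_args, data_args, training_args, remaining = parser.parse_args_into_dataclasses(
    args=["--output_dir", "placeholder"] + sys.argv[1:],
    return_remaining_strings=True,
)
training_args.output_dir = os.path.join(wandb.config.output_dir, wandb.run.id)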
Way 2: update manually. I didn’t specify the “output_dir” argument in the update below, because doing so caused an error about conflicting output_dirs (see my note after the snippet).
if len(sys.argv) == 2 and sys.argv[1].endswith(".json"):
    model_args, data_args, training_args = parser.parse_json_file(json_file=os.path.abspath(sys.argv[1]))
else:
    model_args, data_args, training_args = parser.parse_args_into_dataclasses()
training_args = TrainingArguments(
    output_dir="output/",
    report_to="wandb")
config = wandb.config
# for arg in [model_args, data_args, training_args]:
#     for key, value in vars(arg).items():
#         if hasattr(config, key):
#             setattr(arg, key, getattr(config, key))
# config_updater = ConfigUpdater(model_args, data_args, training_args)
# config_updater.update_from_wandb()
data_args.train_file = os.path.join(wandb.config.datadir, "train.csv")
data_args.dev_file = os.path.join(wandb.config.datadir, "dev.csv")
data_args.test_file = os.path.join(wandb.config.datadir, "test.csv")
# what is in train args: https://github.com/huggingface/transformers/blob/main/src/transformers/training_args.py
wandb.config.update({
    "model_name_or_path": model_args.model_name_or_path,
    "weight_decay": training_args.weight_decay,
    "learning_rate": training_args.learning_rate,
    "datadir": data_args.datadir,
    "max_seq_length": data_args.max_seq_length,
    "do_lower_case": data_args.do_lower_case,
    "pad_to_max_length": data_args.pad_to_max_length,
    "per_device_train_batch_size": training_args.per_device_train_batch_size,
    "per_device_eval_batch_size": training_args.per_device_eval_batch_size,
    "label_all_tokens": training_args.label_all_tokens,
    "load_best_model_at_end": training_args.load_best_model_at_end,
    "save_strategy": training_args.save_strategy,
    "evaluation_strategy": training_args.evaluation_strategy,
    "logging_strategy": training_args.logging_strategy,
    "train_file": data_args.train_file,
    "validation_file": data_args.validation_file,
    "test_file": data_args.test_file,
    "do_train": training_args.do_train,
    "do_predict": training_args.do_predict,
    "do_eval": training_args.do_eval,
    "num_train_epochs": training_args.num_train_epochs,
    "text_column_name": data_args.text_column_name,
    "label_column_name": data_args.label_column_name,
    "seed": training_args.seed
})
training_args.output_dir = os.path.join(wandb.config.output_dir, wandb.run.id)
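About the conflict I mentioned for Way 2: my understanding (please correct me if I'm wrong) is that wandb refuses to silently overwrite a config key the sweep controller already set and asks for allow_val_change=True instead. If that is the right reading of the error, the overwrite would have to be made explicit, roughly like this (sketch only, not in my script):

# Sketch: explicitly allow overwriting the sweep-provided output_dir with the
# per-run path, instead of leaving output_dir out of the update() call.
wandb.config.update(
    {"output_dir": os.path.join(wandb.config.output_dir, wandb.run.id)},
    allow_val_change=True,
)
training_args.output_dir = wandb.config.output_dir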