So… I worked this out myself while drafting this post, but I’m leaving it here for reference/documentation. It’s a small thing, but it may be of help to someone.
My train.py reads arguments from the command line and I want to use wandb sweeps.
One of the arguments is --config_file overwrite.args. I can pass this to train.py and process it with ArgumentParser() and HfArgumentParser() (from Hugging Face). If I use it as part of a sweep config, like so:
command:
- python3
- ${program}
- --config_file overwrite.args
- ${args}
program: train.py
… train.py is called as train.py "--config_file overwrite.args" (with quotes), i.e. the flag and its value arrive as a single entry in sys.argv instead of two. Calls like the following then treat it differently as well:
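import argparse

config_file_parser = argparse.ArgumentParser()
# Register --config_file as an optional flag; it may be given more than once.
config_file_parser.add_argument(
    "--config_file", type=str, action="append"
)
# Unrecognized arguments are returned in remaining_args instead of raising an error.
known_args, remaining_args = config_file_parser.parse_known_args()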
Without quotes, the value ends up in known_args; with quotes, the whole string ends up in remaining_args. Additionally, at some point, HfArgumentParser() throws ValueError: Unknown configuration arguments for the quoted argument.
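The difference can be reproduced without wandb by handing argparse the two argument shapes directly. This is only a minimal sketch: it assumes the flag is registered as --config_file, and the two argument lists are written by hand to mimic what train.py receives in each case.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--config_file", type=str, action="append")

# Flag and value as two separate argv entries (a normal command-line call).
split_known, split_rest = parser.parse_known_args(
    ["--config_file", "overwrite.args"]
)

# Flag and value fused into one argv entry (the "quoted" form from the sweep).
fused_known, fused_rest = parser.parse_known_args(
    ["--config_file overwrite.args"]
)

print(split_known.config_file, split_rest)
print(fused_known.config_file, fused_rest)
```

In the first call the value lands in config_file and nothing is left over. In the second call config_file stays unset and the fused string comes back in the remaining arguments, because argparse does not recognize a string containing a space as the --config_file flag.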
Therefore, my question/request: (how) can I get rid of the added quotes?
And the simple answer is: change the sweep config so that the flag and its value are on two separate lines, like so:
command:
- python3
- ${program}
- --config_file
- overwrite.args
- ${args}
program: train.py
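Why this works can be illustrated with a small stand-in for the sweep agent. The sketch below assumes that each entry of the command list is passed to the program as exactly one argument, without any shell-style splitting on spaces; that matches the behaviour described above but is not checked against the wandb source. The helper received_args and the child script are made up for illustration.

```python
import json
import subprocess
import sys

# Tiny child script that reports the arguments it received as a JSON list.
CHILD = "import json, sys; print(json.dumps(sys.argv[1:]))"


def received_args(command_entries):
    """Start the child with one argument per entry and return what it saw."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, *command_entries],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


# Flag and value in one list entry, as in the first sweep config.
one_line = received_args(["--config_file overwrite.args"])
# Flag and value in two list entries, as in the fixed sweep config.
two_lines = received_args(["--config_file", "overwrite.args"])

print(one_line)
print(two_lines)
```

With one entry, the child sees a single argument that contains a space, which is what shows up as the "quoted" form. With two entries, it sees the flag and the value separately, the same as when typing the command in a shell.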