Hello,
I was playing around with W&B to tune hyperparameters. Let’s say I am fine-tuning YOLOv8 on a custom dataset (the VisDrone dataset). Assume the dataset is well-preprocessed. Here is what I did:
First, I trained YOLO on the dataset for 100 epochs, without first establishing baseline losses and metrics, and obtained the results. I observed that the box, classification, and DFL losses stayed fairly high, ranging between 0.7 and 2.0. I also observed overfitting.
Then, I ran a W&B Sweep on the previously trained model, tuning a small set of hyperparameters: initial learning rate, batch size, and momentum. After training for 25 epochs, I was unable to get good values for the losses and metrics. So I tried several more parameters, including augmentation. Sometimes this also ended in overfitting. My questions are:
I) Did I follow the correct process?
II) Why am I unable to get a good model through hyperparameter tuning?
III) I also used Ray Tune, and it gives different results from W&B for the same initial values. How is that possible?
I am really confused about how to get a good model. Would this approach be the same if I used Keras, TensorFlow, or PyTorch? For this experiment, I only used Ultralytics.
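
For concreteness, a sweep of the kind described above could be set up roughly as follows. This is only a sketch under stated assumptions: the search ranges, the `yolov8n.pt` checkpoint, the `VisDrone.yaml` dataset file, the project name `yolov8-visdrone-sweep`, the logged metric name `mAP50`, and the fixed `seed=0` are all illustrative choices, not the values from the actual runs.

```python
# Sketch of a W&B sweep over initial learning rate, batch size, and momentum
# for YOLOv8. Every concrete value here is an illustrative assumption.

sweep_config = {
    "method": "bayes",  # assumption: Bayesian search; "random" or "grid" also exist
    "metric": {"name": "mAP50", "goal": "maximize"},
    "parameters": {
        # assumed ranges, chosen only for illustration
        "lr0": {"distribution": "log_uniform_values", "min": 1e-4, "max": 1e-2},
        "batch": {"values": [8, 16, 32]},
        "momentum": {"distribution": "uniform", "min": 0.85, "max": 0.98},
    },
}


def train_one_trial():
    """Run one sweep trial: train for 25 epochs, validate, and log mAP50."""
    # Imported lazily so the config above can be inspected without these packages.
    import wandb
    from ultralytics import YOLO

    with wandb.init() as run:
        cfg = run.config  # values for this trial, chosen by the sweep controller
        model = YOLO("yolov8n.pt")  # assumed checkpoint
        model.train(
            data="VisDrone.yaml",  # assumed dataset config
            epochs=25,
            lr0=cfg.lr0,
            batch=cfg.batch,
            momentum=cfg.momentum,
            seed=0,  # fixed seed so trials are easier to compare
        )
        metrics = model.val()  # evaluate on the validation split
        run.log({"mAP50": metrics.box.map50})


def main():
    """Register the sweep and run a handful of trials in this process."""
    import wandb

    sweep_id = wandb.sweep(sweep_config, project="yolov8-visdrone-sweep")
    wandb.agent(sweep_id, function=train_one_trial, count=10)

# Call main() to launch the sweep (needs wandb, ultralytics, and a W&B login).
```

Logging a single validation metric such as `mAP50` under the name given in `sweep_config["metric"]` is what lets the sweep rank trials; the training losses alone are not used for that.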