Having just finished a paper where I found W&B an integral tool, I thought I’d summarize my thoughts while they’re still fresh.
W&B is awesome, and made what would have been an implausible project manageable.
Double Charles’ salary and get that man to produce 2-3x more tutorials/streams/walkthroughs. His hp-sweep tutorials are what really pushed me over the edge to start using W&B and integrate it into the rest of my project. We could definitely use more tutorials on scaling projects across many machines (which wasn’t super hard once I got it working with YAML configs, but was with the python api), and on walking through a typical real ML project (tuning the proposed model, tuning baseline comparisons, using the web interface to gain insight into the runs, organizing all the runs together coherently, using a stored checkpoint to reload a model and run it when the reviewers bash your paper and want more experiments, etc.).
Criticism: I have found the python api to be very unstable, particularly when moving across platforms. I believe it to be the cause of crashed/stalled GPUs (although I’m not sure how; setup below ***) when running hyperparameter sweeps. YAML configurations and CLI usage are by far more stable, more scalable, and better integrated into the web interface. I think this needs to be better communicated in the documentation and tutorials. Many of the tutorials, or at least the ones I’ve watched, happen on Colab, where it seems the python api gets used more. But for anything more than a toy case (in DL projects), you’ll need more than Colab to make progress, and this isn’t covered as much.
Critical Missing Features:
(a) Copy and pasting numeric values from plots on the web interface.
I ran hundreds of runs in my sweep. W&B helped me visualize them in these great pre-made charts. I hover over an object on the chart, and the numeric value pops up. Excellent. I now need to get this number into a python script to make my own plots. I go to copy and paste that value…but I can’t. I need to crane my neck to look close, flip between my python screen and W&B screen to manually enter everything (and if I zoom in to make the number bigger, the chart auto-resizes and screws everything up), and make sure all 7 significant digits are correctly entered across my 200 runs (at 3 am before a deadline)… brutal. Please allow me to click and copy that value, my neck will thank you.
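Until values can be copied straight from the charts, one possible workaround is to pull the numbers through the public API instead of retyping them. This is only a minimal sketch: `collect_metric` and `fetch_sweep_rows` are hypothetical helpers (not part of W&B), `val_loss` and `lr` stand in for whatever you actually logged, and the sort assumes lower is better.

```python
def collect_metric(runs, metric_key, config_keys=()):
    """Return one row per run: name, the chosen summary metric, selected config values.

    `runs` is any iterable of objects exposing `.name`, `.summary` and `.config`,
    which is the shape of the run objects returned by the W&B public API.
    """
    rows = []
    for run in runs:
        value = run.summary.get(metric_key)
        if value is None:  # run crashed or never logged this metric
            continue
        row = {"name": run.name, metric_key: value}
        for key in config_keys:
            row[key] = run.config.get(key)
        rows.append(row)
    # Sort ascending by the metric (assumption: lower is better).
    return sorted(rows, key=lambda r: r[metric_key])


def fetch_sweep_rows(sweep_path, metric_key, config_keys=()):
    """Fetch rows for one sweep; needs `pip install wandb` and a logged-in account."""
    import wandb  # imported lazily so collect_metric can be used without wandb

    sweep = wandb.Api().sweep(sweep_path)  # path form: "entity/project/sweep_id"
    return collect_metric(sweep.runs, metric_key, config_keys)
```

Usage would look something like `rows = fetch_sweep_rows("my-entity/my-project/abc123", "val_loss", ["lr"])`, where the path is a placeholder for your own sweep. The rows are plain dicts, so they can go straight into a plotting script with full precision.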
(b) Conditional sampling.
It would be awesome to be able to seamlessly integrate conditional distributions in the search.
-Suppose I have a NN that has two configurations, and in the first configuration I have an extra hyperparameter that I need to tune, which does not appear in the second. W&B cannot currently handle such a joint HP-search (other than in a random/grid configuration, where we keep sampling values for this hyperparameter that don’t get used, which can make interpreting the parameter importance plots confusing). I am forced to split the search into two searches, which makes viewing the results that much harder (at the end of the day, I only want the best configuration, not the best of the first and the best of the second).
-Suppose I have 3 datasets and 2 model configurations. It would be awesome if in one search I could specify “do a bayesian search to find the best model config for each dataset” instead of splitting them into different searches. I found myself creating ever more YAML files to achieve this.
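For what it’s worth, the pile of near-identical YAML files can at least be generated rather than hand-written. The sketch below builds one Bayesian sweep configuration per (dataset, model) pair as a plain dict. Everything named here is hypothetical (the dataset and model names, the `lr`, `batch_size` and `dropout` ranges, the `val_loss` metric); only the top-level keys (`method`, `metric`, `parameters`) follow the sweep configuration format. It does not solve conditional sampling, it just makes the split less painful.

```python
import copy

# Hypothetical search space: hyperparameters shared by every model configuration.
SHARED = {
    "lr": {"distribution": "uniform", "min": 1e-4, "max": 1e-1},
    "batch_size": {"values": [32, 64, 128]},
}

# Hypothetical extras that exist only for one model configuration.
MODEL_SPECIFIC = {
    "model_a": {"dropout": {"distribution": "uniform", "min": 0.0, "max": 0.5}},
    "model_b": {},  # no extra hyperparameter to tune
}

DATASETS = ["dataset_1", "dataset_2", "dataset_3"]


def build_sweep_configs(metric="val_loss"):
    """Return {(dataset, model): sweep_config}, one Bayesian sweep per pair."""
    configs = {}
    for dataset in DATASETS:
        for model, extras in MODEL_SPECIFIC.items():
            params = copy.deepcopy(SHARED)
            params.update(copy.deepcopy(extras))
            # Pin the constants with "value" so they appear in every run's config.
            params["dataset"] = {"value": dataset}
            params["model"] = {"value": model}
            configs[(dataset, model)] = {
                "name": f"{dataset}-{model}",
                "method": "bayes",
                "metric": {"name": metric, "goal": "minimize"},
                "parameters": params,
            }
    return configs


configs = build_sweep_configs()
print(len(configs), "sweep configs built")
```

Each dict could then be registered with something like `wandb.sweep(sweep=cfg, project="my-project")` (the project name is a placeholder), or dumped to a YAML file for the CLI. Because every sweep only contains hyperparameters that are actually used, the parameter importance plots should stay readable.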
(c) Sharing Visualizations Between Projects.
This is related to the last missing feature. Because I had to keep splitting the searches up into smaller individual searches, I kept getting more and more projects, where I had to create the same charts over and over again. Now it could be that I’m just not an advanced user yet and could have had them all in one project already (let me know if this is the case), but I feel I should be able to drag and drop chart setups between projects when the values used in the plots are shared between all runs in both projects.
***If any W&B developers are reading this and want to try to reproduce, the configurations I had this problem with were using the PyTorch Lightning callback with an Nvidia T4 GPU on (i) AWS servers and (ii) Google Colab. Try setting up a hyperparameter search using the python api, then try using YAML files and the CLI. A related problem you may notice on some platforms is the printing of ValueError(‘signal only works in main thread’) when using the python api. Sometimes it’s only printed, sometimes it seems to crash the sweep. I posted about this previously here.
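In case it helps with reproducing: that message is what CPython itself raises when `signal.signal` is called from any thread other than the main one. I don’t know where in W&B the call happens, so this is only a guess at the mechanism (something registering a signal handler while the training function runs in a worker thread). The standard-library sketch below shows where the message comes from; `try_register_handler` is a hypothetical name.

```python
import signal
import threading


def try_register_handler():
    """Try to install a SIGINT handler; return the error message, or None on success."""
    try:
        # default_int_handler is Python's usual SIGINT handler, so installing it
        # from the main thread leaves Ctrl+C behaving as normal.
        signal.signal(signal.SIGINT, signal.default_int_handler)
        return None
    except ValueError as err:
        return str(err)


results = {}
results["main"] = try_register_handler()  # succeeds when run as a normal script

# The same call from a worker thread is rejected by the interpreter.
worker = threading.Thread(
    target=lambda: results.update(worker=try_register_handler())
)
worker.start()
worker.join()

print(results)
```

The exact wording of the error varies a little between Python versions, but it starts with the same text as the message in the sweep logs.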