Thoughts and Feedback on W&B

Having just finished a paper where I found W&B an integral tool, I thought I’d summarize my thoughts while they’re still fresh.

  1. W&B is awesome, and made what would have been an implausible project manageable.

  2. Double Charles’ salary and get that man to produce 2-3x more tutorials/streams/walkthroughs. His hp-sweep tutorials are what really pushed me over the edge to start using W&B and integrate it into the rest of my project. We could definitely use more tutorials on scaling projects across many machines (which isn’t super hard once I got it working with YAML configs, but was with the python api), and on walking through a typical real ML project: tuning the proposed model, tuning the baseline comparisons, using the web interface to gain insight into the runs, organizing all the runs together coherently, using stored checkpoints to reload a model and re-run it when the reviewers bash your paper and want more experiments, etc.

  3. Criticism: I have found the python api to be very unstable, particularly when moving across platforms. I believe it to be the cause of crashed/stalled GPUs when running hyperparameter sweeps (although I’m not sure how; setup below ***). YAML configurations and CLI usage are by far more stable, more scalable, and better integrated into the web interface. I think this needs to be communicated better in the documentation and tutorials. Many of the tutorials, or at least the ones I’ve watched, happen on Colab, where the python api tends to get used more. But for anything beyond a toy case in a DL project, you’ll need more than Colab to make progress, and this isn’t covered as much.
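For anyone hitting the same wall, the YAML + CLI route I ended up on looks roughly like this (the program name, metric, and parameter names are placeholders from my own setup, not anything canonical):

```yaml
# sweep.yaml — a minimal sweep config (names are placeholders)
program: train.py
method: bayes
metric:
  name: val_loss
  goal: minimize
parameters:
  lr:
    distribution: log_uniform_values
    min: 0.00001
    max: 0.1
  batch_size:
    values: [32, 64, 128]

# Then, from the shell:
#   wandb sweep sweep.yaml                     # prints a sweep ID
#   wandb agent <entity>/<project>/<sweep-id>  # run on each machine
```

The same `wandb agent` command can be pasted onto as many machines as you have; they all pull jobs from the one sweep.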

  4. Critical Missing Features:
    (a) Copying and pasting numeric values from plots in the web interface.
    I ran hundreds of runs in my sweep. W&B helped me visualize them in these great pre-made charts. I hover over an object on the chart, and the numeric value pops up. Excellent. I now need to get this number into a python script to make my own plots. I go to copy and paste that value… but I can’t. I have to crane my neck to look closely, flip between my python screen and my W&B screen to enter everything manually (and if I zoom in to make the number bigger, the chart auto-resizes and screws everything up), and make sure all 7 significant digits are entered correctly across my 200 runs (at 3 am before a deadline)… brutal. Please let me click and copy that value; my neck will thank you.
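(A partial workaround I found later, in case it helps someone: the numbers behind the charts can be pulled with the public python api instead of retyped. A sketch — the entity/project path and metric names below are placeholders:)

```python
def summary_rows(runs, keys):
    """Collect the requested summary metrics from each run into plain dicts."""
    rows = []
    for run in runs:
        row = {"name": run.name}
        for key in keys:
            row[key] = run.summary.get(key)  # None if the run never logged it
        rows.append(row)
    return rows

# Real usage (needs `pip install wandb` and a logged-in session;
# "my-entity/my-project" is a placeholder):
#
#   import wandb
#   runs = wandb.Api().runs("my-entity/my-project")
#   rows = summary_rows(runs, ["val_loss", "val_acc"])
```

This still doesn’t replace click-to-copy for a quick read off a chart, but it at least keeps 7-significant-digit transcription out of the loop.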
    (b) Conditional sampling.
    It would be awesome to be able to seamlessly integrate conditional distributions in the search.
    -Suppose I have a NN that has two configurations, and in the first configuration I have an extra hyperparameter to tune which does not appear in the second. W&B currently cannot handle such a joint HP search (other than with a random/grid configuration, where it keeps sampling values for this hyperparameter that never get used, which can make the parameter importance plots confusing to interpret). I am forced to split the search into two separate searches, which makes viewing the results that much harder (at the end of the day, I only want the best configuration overall, not the best of the first and the best of the second).
    -Suppose I have 3 datasets and 2 model configurations. It would be awesome if in one search I could specify “do a bayesian search to find the best model config for each dataset,” instead of splitting them into different searches. I found myself creating ever more YAML files to achieve this.
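In the meantime, the workaround I landed on for the first case, within a single random/grid sweep, is to let the unused hyperparameter be sampled anyway but strip it at the top of the training script, so at least the stray value never reaches the model (a sketch; the key names here are made up for illustration):

```python
def effective_config(config):
    """Drop hyperparameters the sampled architecture doesn't use.
    The "arch"/"extra_depth" keys are illustrative, not real W&B names."""
    config = dict(config)  # don't mutate the sweep's config object
    if config.get("arch") == "plain":     # second configuration:
        config.pop("extra_depth", None)   # this hp only exists for "gated"
    return config

# In the training script, e.g.:
#   cfg = effective_config(dict(wandb.config))
#   model = build_model(cfg)   # build_model is your own code
```

It doesn’t help the bayesian case, and it only partly cleans up the parameter importance plots, but it stops the unused values from doing anything.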
    (c) Sharing Visualizations Between Projects.
    This is related to the last missing feature. Because I had to keep splitting the searches into smaller individual searches, I ended up with more and more projects, and had to create the same charts over and over again. Now, it could be that I’m just not an advanced user yet and could have had them all in one project (let me know if this is the case), but I feel I should be able to drag and drop figure setups between projects whenever the values used in the plots are logged by all the runs in both projects.

***If any W&B developers are reading this and want to try to reproduce, the configuration I had this problem with was a PyTorch Lightning callback with an Nvidia T4 GPU on (i) AWS servers and (ii) Google Colab. Try setting up a hyperparameter search using the python api, then try using YAML files and the CLI. A related problem you may notice on some platforms is the printing of ValueError(‘signal only works in main thread’) when using the python api. Sometimes it’s only printed; sometimes it seems to crash the sweep. I posted about this previously here.


Hey Max, thanks for your feedback! It’s great to hear all these thoughts summarized.

  1. Praise: Yay! Glad you like it

  2. Charles: Awesome, so glad you love his videos!

  3. Python API: Could you please tell me more about what issues you saw? If you can help me reproduce these issues I’d like to get them fixed asap.

4a. Click to copy chart numbers: Is there a visualization tool you use that does this interaction well? I’m not sure how we could let you click to copy numbers from the tooltip since it follows your mouse.

4b. Split Sweeps: Would you be able to look at the main project page to compare all runs across two sweeps, rather than looking at each one individually? How are you evaluating the best run across sweeps? Is it by sorting the table?

4c. Many projects: It sounds like you were putting each search into its own project — I recommend that you put all your runs in one project to compare all of them, and then use tags to indicate subsets of runs that go together.
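Concretely, something like this (the project, entity, and tag names below are placeholders, and the filter helper is just a sketch around run objects returned by the public API):

```python
def runs_with_tags(runs, required):
    """Keep only the runs carrying every one of the given tags."""
    required = set(required)
    return [run for run in runs if required <= set(run.tags)]

# Tag runs as you create them in each search (placeholder names):
#   wandb.init(project="my-paper", tags=["baseline", "dataset-a"])
#
# Later, pull a tagged subset back out with the public API:
#   runs = wandb.Api().runs("my-entity/my-paper")
#   baselines = runs_with_tags(runs, ["baseline"])
```

You can also filter and group by tags directly in the runs table in the web UI, so usually no script is needed at all.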

It sounds like “signal only works in main thread” might have happened if you called wandb.log() from a process where you didn’t call wandb.init(). Do you think this possibly happened in your code?