Barchart Grouping by Time/Step/Count

Dear W&B Community,

I log system metrics such as “time per step” or “time per backward pass” for a model.
When running on different hardware, I would like to compare how the hardware affects these metrics.
In the following examples, I profile the basic Torch CIFAR10 model on VMs with 1, 2, 4, 8, 16, and 32 CPUs.

A line chart shows the full history of these metrics, but the overlapping, oscillating curves make them very hard to compare:

A bar chart, on the other hand, visualizes only the last value:

What I would like is to group values by their count or occurrence; grouping by run already works perfectly. Here’s the same data rendered with seaborn.barplot:

Would this be possible to implement? Or does anybody know a way to get that functionality?

My current workaround is to download the data manually and run it through seaborn. Unfortunately, I could not make sense of the errors I got from the Custom Chart functionality when trying to port Vega examples to use wandb as the data source.

I’d be very glad if anybody can point me to a tutorial on how to migrate existing Vega examples to be used with wandb (and the common problems, like differences between v3/v4/v5, as these seemed to be an issue for me).

After fiddling a bit more with the Custom Chart functionality, I got a half-baked solution.

The biggest pitfall when implementing a Custom Chart is the difference between vega and vega-lite.

Examples from the vega-lite documentation are mostly easily adaptable while the ones from vega DO NOT WORK.

I didn’t dig quite deep enough to understand why, but it seems to have something to do with the way wandb provides its data.

In any case, here’s the code for the following boxplot, for anybody interested. It has basic tooltips and a configurable range. If you want the 1.5 IQR boxplot instead of the min-max one, simply remove the line "extent": "min-max".

Configuration:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "A simple box plot from https://vega.github.io/vega-lite/examples/boxplot_2D_vertical.html, refer to the full documentation here https://vega.github.io/vega-lite/docs/boxplot.html",
  "title": "${field:y}",
  "data": {
    "name": "wandb"
  },
  "layer":[
  { 
    "mark": {
      "type": "boxplot",
      "extent": "min-max",
      "clip": true,
      "median": { "color": "black" },
      "ticks": true
    },
    "encoding": {
      "size":{ "value": 25},
      "x": {
        "field": "${field:group_by}",
        "type": "nominal",
        "axis": {"labelAngle": -25}
      },
      "y": {
        "field": "${field:y}",
        "type": "quantitative",
        "sort": "-y",
        "scale": {"domain": ["${string:min_domain}", "${string:max_domain}"]},
        "title": null
      },
      "color": {
        "field": "${field:group_by}",
        "type": "nominal"

      },
      "tooltip": [
        { "field": "${field:y}", "type":"quantitative" }
      ]
    }
  }
  ],
  "config": {
    "axis": { "grid": true },
    "view": {
      "stroke": "transparent"
    }    
  }
}
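
For the 1.5 IQR variant mentioned above, a stripped-down sketch of the same chart could look like the following. It assumes the same ${field:group_by} and ${field:y} template fields as the configuration above. The value "extent": 1.5 is spelled out only for readability, since it is also the vega-lite default when the property is omitted. Tooltips, the configurable domain, and the styling are left out to keep it short.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch: Tukey-style box plot, whiskers at 1.5 IQR, outliers drawn as points",
  "data": {
    "name": "wandb"
  },
  "mark": {
    "type": "boxplot",
    "extent": 1.5,
    "median": { "color": "black" },
    "ticks": true
  },
  "encoding": {
    "x": {
      "field": "${field:group_by}",
      "type": "nominal"
    },
    "y": {
      "field": "${field:y}",
      "type": "quantitative"
    },
    "color": {
      "field": "${field:group_by}",
      "type": "nominal"
    }
  }
}
```

With a numeric extent, values beyond the whiskers show up as individual outlier points, which the min-max version never draws.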

Some things to watch out for:

  • vega-lite seems to be a wrapper around vega, and the boxplot primitive creates a lot of layers that are not modifiable by default (as do other primitives)
  • I presume this is why I could not get any of the transforms from the default Line plot to work
  • The GraphQL input in the second image is sometimes buggy and does not let you select the correct field. You can work around that by typing something in the first field, pressing <Tab>, selecting the second field, and then deleting the first one (Brave Build: Version 1.42.97 Chromium: 104.0.5112.102)
  • When in the “Chart Definition” view (Custom Chart -> Edit), data recomputation is delayed/buggy. If you change the chart code, you might need to re-select the fields to see the changes (e.g., group_by for me)
  • When in the “Chart Definition” view, just below the chart is a dropdown list of the generated/imported data. Use that extensively to make sure that your transforms do what you want them to do!

A nice improvement over this boxplot would be violin plots, but these still seem to be in active development, see this issue.

Final thoughts: I would still love to have a group-by time/count in the default Bar chart, as it would integrate much more tightly with the rest of the dashboard and would provide violin plots by default.

Hi Alexander,

Thank you very much for sharing this, it is really useful! It seems that you have solved your issue, but let me know if I can help you in any way! Also, could you send me a Vega example that is not working, so I can check what is happening there? Thanks!

Best,
Luis

Hi Luis,

Thanks for taking the time! I think the best solution would be to allow grouping over time in the W&B GUI for your integrated bar charts, but for now, a big help for the community would be a working example of a vega spec running with wandb.

Here’s the basic bar chart example from the official page, adapted for use with wandb by including ${field:x} and ${field:y}.

{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "description": "A simple line plot",
  "data": {
    "name": "wandb"
  },
  "signals": [
    {
      "name": "tooltip",
      "value": {},
      "on": [
        {"events": "rect:mouseover", "update": "datum"},
        {"events": "rect:mouseout",  "update": "{}"}
      ]
    }
  ],

  "scales": [
    {
      "name": "xscale",
      "type": "band",
      "domain": {"data": "table", "field": "${field:x}"},
      "range": "width",
      "padding": 0.05,
      "round": true
    },
    {
      "name": "yscale",
      "domain": {"data": "table", "field": "${field:y}"},
      "nice": true,
      "range": "height"
    }
  ],

  "axes": [
    { "orient": "bottom", "scale": "xscale" },
    { "orient": "left", "scale": "yscale" }
  ],

  "marks": [
    {
      "type": "rect",
      "from": {"data":"table"},
      "encode": {
        "enter": {
          "x": {"scale": "xscale", "field": "${field:x}"},
          "width": {"scale": "xscale", "band": 1},
          "y": {"scale": "yscale", "field": "${field:y}"},
          "y2": {"scale": "yscale", "value": 0}
        },
        "update": {
          "fill": {"value": "steelblue"}
        },
        "hover": {
          "fill": {"value": "red"}
        }
      }
    },
    {
      "type": "text",
      "encode": {
        "enter": {
          "align": {"value": "center"},
          "baseline": {"value": "bottom"},
          "fill": {"value": "#333"}
        },
        "update": {
          "x": {"scale": "xscale", "signal": "tooltip.${field:x}", "band": 0.5},
          "y": {"scale": "yscale", "signal": "tooltip.${field:y}", "offset": -2},
          "text": {"signal": "tooltip.${field:y}"},
          "fillOpacity": [
            {"test": "datum === tooltip", "value": 0},
            {"value": 1}
          ]
        }
      }
    }
  ]
}

The error is reported between lines 4 and 6 of the spec, i.e. in the "data" block:
[image: screenshot of the error]

I haven’t been able to modify the code in any way to make this example work.
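
One possible cause, offered as a guess rather than a confirmed fix: in full vega (unlike vega-lite) the top-level "data" property is an array of named datasets, not a single object, which would match an error pointing at the "data" block. In addition, the scales and marks above still read from a dataset called "table", while the only dataset declared is "wandb". The sketch below changes both points and drops the tooltip signal and text mark for brevity. It assumes that W&B injects the query result into the dataset named "wandb" for vega specs in the same way it does for vega-lite ones, which has not been verified here.

```json
{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "description": "Sketch: basic bar chart reading from the wandb dataset",
  "data": [
    { "name": "wandb" }
  ],
  "scales": [
    {
      "name": "xscale",
      "type": "band",
      "domain": {"data": "wandb", "field": "${field:x}"},
      "range": "width",
      "padding": 0.05,
      "round": true
    },
    {
      "name": "yscale",
      "domain": {"data": "wandb", "field": "${field:y}"},
      "nice": true,
      "range": "height"
    }
  ],
  "axes": [
    { "orient": "bottom", "scale": "xscale" },
    { "orient": "left", "scale": "yscale" }
  ],
  "marks": [
    {
      "type": "rect",
      "from": {"data": "wandb"},
      "encode": {
        "enter": {
          "x": {"scale": "xscale", "field": "${field:x}"},
          "width": {"scale": "xscale", "band": 1},
          "y": {"scale": "yscale", "field": "${field:y}"},
          "y2": {"scale": "yscale", "value": 0}
        },
        "update": {
          "fill": {"value": "steelblue"}
        },
        "hover": {
          "fill": {"value": "red"}
        }
      }
    }
  ]
}
```

If this renders, the tooltip signal and the text mark from the original example can be added back unchanged, since neither refers to a dataset by name.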