I have a project where I’m trying to train multiple models in an interspersed fashion with custom axes for each model, as well as an overall global step axis (different from the builtin step).
I am having much difficulty in getting the correct thing to show up (or sometimes, to get anything to show up at all) on the horizontal axis of my plots.
In the simplest case, I’d just like to be able to do:
, which is what I understand is being said in the instructions here.
However, when I do this, every plot ends up with either the Wandb Step or global_datapoint_count on the x-axis, and when I try to select, say, model_A_datapoint_count, it says “There’s no data for the selected runs.
Try a different X axis setting.”
I suspect it might be because I’m not logging the step count together with the datapoint in the same wandb logging call. But I’m not sure that’s it, nor what to do about it if it is.
When you define a metric with a custom step metric, you need to ensure that both the metric and its custom step are logged together in the same wandb.log call. This ensures that the data points are correctly aligned with their respective steps on the x-axis.
Here’s how you can modify your logging to ensure that the custom step metrics are logged correctly:
import wandb
# Initialize your W&B run
wandb.init(project="your_project_name")
# Define the global step metric and the custom step metrics for each model
wandb.define_metric("global_datapoint_count")
wandb.define_metric("model_a_datapoint_count")
wandb.define_metric("model_b_datapoint_count")
# Define the metrics with their corresponding step metrics
wandb.define_metric("*", step_metric="global_datapoint_count", step_sync=True)
wandb.define_metric("model_a/*", step_metric="model_a_datapoint_count", step_sync=True)
wandb.define_metric("model_b/*", step_metric="model_b_datapoint_count", step_sync=True)
# Example training loop
num_iterations = 100  # placeholder value; use your own iteration count
for i in range(num_iterations):
    # Simulate training and logging for model A
    model_a_loss = ...  # Compute loss for model A
    model_a_step = ...  # Compute step for model A
    # Log the metric together with both of its step values in one call
    wandb.log({
        "global_datapoint_count": i,
        "model_a_datapoint_count": model_a_step,
        "model_a/loss": model_a_loss,
    })
    # Simulate training and logging for model B
    model_b_loss = ...  # Compute loss for model B
    model_b_step = ...  # Compute step for model B
    wandb.log({
        "global_datapoint_count": i,
        "model_b_datapoint_count": model_b_step,
        "model_b/loss": model_b_loss,
    })
# Finish the run
wandb.finish()
Does this look approximately like how you’re logging your metrics?
Hi Nikolaus, since we have not heard back from you we are going to close this request. If you would like to re-open the conversation, please let us know!
Based on your code snippet, it looks like you’re on the right track with defining custom metrics and using wandb.define_metric().
To ensure that your custom metrics show up correctly on the horizontal axis, you’ll indeed need to make sure that you’re logging the appropriate data at each logging call in your code. This includes logging both the step count and the corresponding data point counts for each model.
If you suspect that logging might be the issue, double-check your logging calls to ensure that you’re logging the necessary data correctly. Additionally, make sure that you’re logging data consistently across all models and for the global data point count.
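One way to double-check the logging calls is to validate each dictionary before it is passed to wandb.log. The following is only a minimal sketch: the helper missing_step_keys and the STEP_KEYS mapping are hypothetical names, not part of the wandb API, and the metric names are assumed to follow the model_a/* and model_b/* pattern used earlier in this thread.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping from a metric glob to the step key it should be
# plotted against. The names mirror this thread; adjust them to your project.
STEP_KEYS = {
    "model_a/*": "model_a_datapoint_count",
    "model_b/*": "model_b_datapoint_count",
}
GLOBAL_STEP_KEY = "global_datapoint_count"


def missing_step_keys(payload):
    """Return the step keys that this payload needs but does not contain."""
    # Assumption: every log call should carry the global counter.
    required = {GLOBAL_STEP_KEY}
    for key in payload:
        for pattern, step_key in STEP_KEYS.items():
            if fnmatchcase(key, pattern):
                required.add(step_key)
    return sorted(required - set(payload))


good = {
    "global_datapoint_count": 10,
    "model_a_datapoint_count": 4,
    "model_a/loss": 0.3,
}
bad = {"model_b/loss": 0.5}  # metric logged without any of its step values

print(missing_step_keys(good))
print(missing_step_keys(bad))
```

Calling the helper on each payload just before wandb.log (and raising or warning when the result is non-empty) should make it easier to spot a call that logs a metric without its x-axis value.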
If you’re still encountering issues after verifying your logging calls, I’d recommend reaching out to the Weights & Biases support team directly via email at support@wandb.com. They’ll be able to provide more targeted assistance and help troubleshoot any issues you’re facing with setting up your custom axes.
It sounds like the issue might be related to how the metrics are being logged. One possible reason you’re seeing the error is that the step count for model_A_datapoint_count and model_B_datapoint_count might not be logged correctly when you are calling wandb.log().
Double-check that global_datapoint_count and the respective model_*_datapoint_count are being logged consistently. If they’re not updated in sync with each log, it can cause discrepancies in the axes.
If possible, try setting the step metric directly within the logging call as well:
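A minimal sketch of one way to keep the step values inside every logging call, assuming interleaved training of several models. CounterLogger is a hypothetical helper, not a wandb class; it takes any callable that accepts a dict (for example wandb.log), so the counters and the metric always travel in the same payload.

```python
class CounterLogger:
    """Tracks global and per-model datapoint counts and logs them with each metric."""

    def __init__(self, log_fn):
        # log_fn: any callable accepting a dict, e.g. wandb.log
        self.log_fn = log_fn
        self.global_count = 0
        self.model_counts = {}

    def log(self, model, metrics, n_datapoints=1):
        # Advance both counters by the number of datapoints just consumed.
        self.global_count += n_datapoints
        self.model_counts[model] = self.model_counts.get(model, 0) + n_datapoints
        # Step values and metrics go into one dict, hence one log call.
        payload = {
            "global_datapoint_count": self.global_count,
            f"{model}_datapoint_count": self.model_counts[model],
        }
        payload.update({f"{model}/{name}": value for name, value in metrics.items()})
        self.log_fn(payload)
        return payload


# Stand-in for wandb.log so the sketch runs without a W&B account.
records = []
logger = CounterLogger(records.append)
logger.log("model_a", {"loss": 0.5}, n_datapoints=32)
logger.log("model_b", {"loss": 0.7}, n_datapoints=16)
last = logger.log("model_a", {"loss": 0.4}, n_datapoints=32)
print(last)
```

In a real run, the idea would be to construct it as CounterLogger(wandb.log) after wandb.init and the define_metric calls, with model names chosen to match the step metric names defined there.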