Multivariate Time Series Data

I am trying to use W&B to run experiments with satellite time series data. Because we work at a relatively large scale, we do not use entire images but rather one representative pixel for each ROI. We therefore end up with a dataset of num_samples data points, each covering t timesteps with num_bands channels.
One sample might look like this:

The 13 individual lines are the reflectance values for each spectral band over the course of a year for a single sample.

My question is the following:
Is there a wandb type that produces such a representation as an entry in a table? I personally think that handling it as a plot is a bit of an overkill, but on the other hand it seems that time series data can only be stored when it is univariate.

Thank you in advance!

Hi @maja601 thanks for writing in! It seems you’re looking for the wandb.Table data type, which lets you log a Pandas dataframe directly, as in this example. You will then be able to use Weave to plot your multivariate time series. Please also check this Report from Stacey to explore the available options for time series data. Would this work for you? Feel free to ask us any further questions.
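One way to fit such data into a 2D table is a long-format layout with one row per (sample, timestep, band) value. The following is only a minimal sketch under assumptions: the helper `to_long_rows`, the column names, and the nested-list layout `data[sample][timestep][band]` are hypothetical and not part of W&B. The logging call is left commented out because it needs wandb and an active run.

```python
from typing import List, Sequence, Tuple


def to_long_rows(
    data: Sequence[Sequence[Sequence[float]]],
) -> Tuple[List[str], List[list]]:
    """Flatten data[sample][timestep][band] into one row per value.

    Hypothetical helper: column names are illustrative only.
    """
    columns = ["sample_id", "timestep", "band", "reflectance"]
    rows = []
    for sample_id, series in enumerate(data):
        for timestep, bands in enumerate(series):
            for band, value in enumerate(bands, start=1):
                rows.append([sample_id, timestep, f"Band {band}", value])
    return columns, rows


# Tiny dummy dataset: 2 samples, 3 timesteps, 2 bands.
dummy = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[0.7, 0.8], [0.9, 1.0], [1.1, 1.2]],
]
columns, rows = to_long_rows(dummy)
print(len(rows))  # one row per value: 2 * 3 * 2 = 12

# Logging (requires wandb and an active run):
# import wandb
# run = wandb.init(project="log-table-3d")
# run.log({"long_table": wandb.Table(columns=columns, data=rows)})
```

The row count grows as num_samples * t * num_bands, so this layout is probably only practical for a subset of a large dataset.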

Hi @thanos-wandb thank you so much for your quick reply!

Just to clarify, each data point has a total of 13 ‘sub’-time series, which makes my data 3D. As far as I understand, wandb.Table allows you to store a 2D array (e.g. rows are sample_id and cols are timesteps), but then the third dimension becomes an awkward array of numbers. The same happens when treating it as a dataframe: each cell would have to hold an entire array with 13 values, e.g. like this:

What I would like to have instead is a 2D data type that can go inside a table cell (a bit like a nested table), so that the entire dataset looks like this,
but instead of the image, each cell shows either the multivariate time series plot from my original question or a 2D table with timesteps as cols and bands as rows.

If this is possible with the examples you’ve shown me, please let me know. I am relatively new to W&B and I am not sure I understood it all correctly.

Hi @maja601 thanks a lot for the clarification, I see what you’re trying to achieve. You could use a wandb.Table where each row holds a Plotly chart. Would this work for you? Alternatively, the sample index could be exposed as a slider; let me know if you would prefer that. Please see below for an example:

import random

import pandas as pd
import plotly.express as px
import wandb

num_samples = 5  # number of samples (one dataframe and one chart each)
t = 1000  # number of timesteps (rows in each dataframe)
num_bands = 12  # number of bands (columns in each dataframe)

# Initialize a new run
run = wandb.init(project="log-table-3d")

# Create a table that holds one Plotly chart per row
table = wandb.Table(columns=["plotly_figure"])

for i in range(num_samples):
    # Create a dummy dataframe for this sample
    df = pd.DataFrame(
        data=[[random.random() for _ in range(num_bands)] for _ in range(t)],
        columns=[f"Band {j}" for j in range(1, num_bands + 1)],
    )
    # print(f"Sample {i+1}:\n{df}\n")  # print the dataframe for this sample

    # Create path for Plotly figure
    path_to_plotly_html = "./plotly_figure.html"

    # Example Plotly figure: one line per band
    fig = px.line(data_frame=df)

    # Write Plotly figure to HTML
    # Setting auto_play to False prevents animated Plotly charts from playing in the table automatically
    fig.write_html(path_to_plotly_html, auto_play=False)

    # Add Plotly figure as HTML file into Table
    table.add_data(wandb.Html(path_to_plotly_html))

# Log Table once, after all rows have been added
run.log({"table": table})
run.finish()


Is this something you would be interested in, or would you like us to proceed with a feature request to support nested tables?

Hi @thanos-wandb thank you for the proposed workaround! I am just wondering whether that solution creates a bit too much overhead. In the domain of Earth Observation (EO), the type of data I showed you is fairly common and the number of samples (rows) goes into the millions. Of course, we would not have to visualise all of them, but creating a Plotly chart for each sample seems a bit much.
In case W&B aims to support EO datasets, I would suggest the feature request route, as a lot of scientists might benefit from it.

Hi @maja601 thanks so much for the detailed explanation of your use case. I agree that at that scale, rendering all these plots would make the Workspace very slow. I have therefore proceeded with a feature request and linked it to this thread, so that we can share any updates with you. Thank you for this very useful suggestion.
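In the meantime, if only a handful of samples needs to be visualised, a fixed-size random subset could be chosen before any charts are built. This is just a sketch using the Python standard library: the helper `pick_sample_ids` and its parameters are hypothetical names, and the dataset size of one million is an assumed figure for illustration.

```python
import random
from typing import List


def pick_sample_ids(num_samples: int, max_rows: int, seed: int = 0) -> List[int]:
    """Return up to max_rows distinct sample indices, sorted and reproducible.

    Hypothetical helper: the fixed seed keeps the same subset across runs.
    """
    rng = random.Random(seed)
    k = min(max_rows, num_samples)
    return sorted(rng.sample(range(num_samples), k))


# Assumed sizes for illustration: one million samples, 50 table rows.
ids = pick_sample_ids(num_samples=1_000_000, max_rows=50)
print(len(ids))
```

Only the chosen indices would then go through the loop that builds one chart per table row.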

