Filtering runs by job_type using the API is not working

Hi everyone!

I am trying to retrieve filtered runs from a project using the WandB API and a filter dictionary.

This is what I am running:

import wandb

api = wandb.Api()
filter_dict = {"job_type": "my_job_type"}
runs = api.runs("my_entity/my_project", filters=filter_dict)
for run in runs:
    print(run)

When I do this, I get the following error message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\KoljaBauer\anaconda3\lib\site-packages\wandb\apis\public.py", line 980, in __next__
    if not self._load_page():
  File "C:\Users\KoljaBauer\anaconda3\lib\site-packages\wandb\apis\public.py", line 965, in _load_page
    self.last_response = self.client.execute(
  File "C:\Users\KoljaBauer\anaconda3\lib\site-packages\wandb\sdk\lib\retry.py", line 168, in wrapped_fn
    return retrier(*args, **kargs)
  File "C:\Users\KoljaBauer\anaconda3\lib\site-packages\wandb\sdk\lib\retry.py", line 108, in __call__
    result = self._call_fn(*args, **kwargs)
  File "C:\Users\KoljaBauer\anaconda3\lib\site-packages\wandb\apis\public.py", line 207, in execute
    return self._client.execute(*args, **kwargs)
  File "C:\Users\KoljaBauer\anaconda3\lib\site-packages\wandb\vendor\gql-0.2.0\wandb_gql\client.py", line 52, in execute
    result = self._get_result(document, *args, **kwargs)
  File "C:\Users\KoljaBauer\anaconda3\lib\site-packages\wandb\vendor\gql-0.2.0\wandb_gql\client.py", line 60, in _get_result
    return self.transport.execute(document, *args, **kwargs)
  File "C:\Users\KoljaBauer\anaconda3\lib\site-packages\wandb\vendor\gql-0.2.0\wandb_gql\transport\requests.py", line 39, in execute
    request.raise_for_status()
  File "C:\Users\KoljaBauer\anaconda3\lib\site-packages\requests\models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://api.wandb.ai/graphql

However, other filters do work. For example, the same procedure works with

filter_dict = {"group": "my_group"}

and it returns the correctly filtered runs.

What I am currently doing as a workaround is this:

import wandb

api = wandb.Api()
runs = api.runs("my_entity/my_project")
for run in runs:
    if run.job_type == "my_job_type":
        print(run)

However, this fetches every run in the project first, so I would prefer to filter the runs directly in the API call. Any idea what I am doing wrong?

Thanks in advance!

Hi @kolja, thank you for writing in! Could you please check whether the following works for you?

import wandb

api = wandb.Api()
filter_dict = {"jobType": "my_job_type"}
runs = api.runs("my_entity/my_project", filters=filter_dict)
for run in runs:
    print(run)

The filters argument uses the MongoDB query language, and the key for the job type in these queries is jobType rather than job_type. Hope this helps!
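
Since the filters follow MongoDB query syntax, conditions can in principle be combined. The snippet below is a minimal sketch, not an official recipe: build_run_filter and fetch_runs are hypothetical helper names, the keys jobType and display_name are the ones mentioned in this thread, $and and $regex are standard MongoDB operators, and "my_entity/my_project" is a placeholder. Which operators the backend accepts for a given key is an assumption worth verifying against your own project.

```python
def build_run_filter(job_type, name_pattern=None):
    """Build a MongoDB-style filter dict for api.runs() (hypothetical helper)."""
    clauses = [{"jobType": job_type}]
    if name_pattern is not None:
        # display_name is the query key for the run name, per this thread.
        clauses.append({"display_name": {"$regex": name_pattern}})
    # A single clause needs no $and wrapper.
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}


def fetch_runs(path, filters):
    """Query runs with the given filters; needs the wandb package and network access."""
    import wandb  # imported lazily so build_run_filter works without wandb installed

    api = wandb.Api()
    return api.runs(path, filters=filters)


if __name__ == "__main__":
    filter_dict = build_run_filter("my_job_type", name_pattern="^baseline")
    print(filter_dict)
    # Placeholder path; replace with a real entity/project before uncommenting.
    # for run in fetch_runs("my_entity/my_project", filter_dict):
    #     print(run)
```

Keeping the filter construction separate from the API call makes it easy to inspect the dictionary before sending it, which helps when a query comes back with a 400 error.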


Hi @thanos-wandb, thanks a lot for your reply!

This solved my problem.

Are those keys listed in the documentation somewhere? I find it confusing that in Python you access them as my_job.job_type and my_job.name, but in the query the same fields must be called jobType and display_name. I could not find any mention of jobType in the documentation, and I only came across display_name by chance in a code example.
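
One way to live with the naming mismatch is a small translation layer. The following is only a sketch: ATTRIBUTE_TO_FILTER_KEY and to_filter_keys are hypothetical names, and the table contains just the two mappings confirmed in this thread (job_type to jobType, name to display_name). Keys that are not in the table, such as group, are passed through unchanged.

```python
# Only these two mappings are confirmed in this thread; extend with care.
ATTRIBUTE_TO_FILTER_KEY = {
    "job_type": "jobType",
    "name": "display_name",
}


def to_filter_keys(attr_filters):
    """Rename Python-style attribute names to the keys the filter query expects."""
    return {
        ATTRIBUTE_TO_FILTER_KEY.get(key, key): value
        for key, value in attr_filters.items()
    }


if __name__ == "__main__":
    # The result can be passed as filters= to api.runs().
    print(to_filter_keys({"job_type": "my_job_type", "group": "my_group"}))
```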

Hi @kolja, glad to hear this is now fixed! Indeed, this part is not well documented, but our docs are continuously updated. Another source for this information is our GitHub repository, such as here. I hope this helps!

I am closing the ticket for now as it is resolved, but please feel free to reopen it if you have more questions related to this issue, and I will be glad to assist further.