Feature Request: Head-to-head comparisons for Weave evaluations

joaomendonca_lw · July 3, 2024, 10:40am

I'd really like to switch from LangSmith to Weave, but my team wants a feature like this one in LangSmith, where we can see direct comparisons between different evaluation runs. We'd also like to filter on cases where performance gets worse or better.
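To make the request concrete, here is a rough sketch in plain Python of the kind of comparison meant here. It does not use any Weave or LangSmith API. The names `CaseDelta`, `compare_runs`, `regressions`, and `improvements` are hypothetical, and the scores are made-up illustration data keyed by example id.

```python
from dataclasses import dataclass


@dataclass
class CaseDelta:
    """Scores for one example in two evaluation runs (hypothetical structure)."""
    case_id: str
    baseline: float
    candidate: float

    @property
    def delta(self) -> float:
        # Positive means the candidate run scored higher on this example.
        return self.candidate - self.baseline


def compare_runs(baseline: dict, candidate: dict) -> list:
    """Pair per-example scores from two runs, keeping only shared example ids."""
    shared = sorted(baseline.keys() & candidate.keys())
    return [CaseDelta(cid, baseline[cid], candidate[cid]) for cid in shared]


def regressions(deltas: list, tol: float = 0.0) -> list:
    """Examples where the candidate run is worse by more than `tol`."""
    return [d for d in deltas if d.delta < -tol]


def improvements(deltas: list, tol: float = 0.0) -> list:
    """Examples where the candidate run is better by more than `tol`."""
    return [d for d in deltas if d.delta > tol]


# Made-up per-example scores for two runs, for illustration only.
run_a = {"q1": 0.9, "q2": 0.5, "q3": 0.7}
run_b = {"q1": 0.6, "q2": 0.8, "q3": 0.7}

deltas = compare_runs(run_a, run_b)
worse = [d.case_id for d in regressions(deltas)]
better = [d.case_id for d in improvements(deltas)]
```

The point is the per-example pairing: an aggregate metric can stay flat while individual cases move in both directions, which is why filtering on worse/better cases matters.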

Does this happen to be in the works?

paulo-sabile · July 4, 2024, 6:19am

Hi @joaomendonca_lw, good day, and thank you for reaching out to us! Happy to help you with this.

We have a feature called Run Comparer that allows you to see what metrics are different across your runs. You can check this link for some guidance and there’s also a live example available. Let me know if this tool can help. Otherwise, we’ll be more than happy to raise a feature request for you.

Thanks,
Paulo

paulo-sabile · July 10, 2024, 3:04am

Hi @joaomendonca_lw, since we have not heard back from you, we are going to close this request. If you would like to re-open the conversation, please let us know!
